WO2021147282A1

WO2021147282A1 - Method, apparatus and device for detecting malicious file, and storage medium

Info

Publication number: WO2021147282A1
Application number: PCT/CN2020/104449
Authority: WO
Inventors: 郑宝龙; 陈甲
Original assignee: 华为技术有限公司
Priority date: 2020-01-20
Filing date: 2020-07-24
Publication date: 2021-07-29
Also published as: CN113139176A

Abstract

The present application relates to the technical field of computers, and provided therein are a method, apparatus and device for detecting a malicious file, and a storage medium. The present application provides a solution that may achieve cross-platform dynamic detection of a malicious file. The method comprises: generating a virtual operating environment on the basis of container technology; simulating an operating environment provided by an operating system compatible with a test file; after the test file has called an API provided by the virtual operating environment, converting the API called by the test file in the virtual operating environment to an API provided by an operating system of a detection device; and executing the converted API in the operating system of the detection device. By executing an API of an operating system of a detection device, the effect of simulating the execution of an API of a first operating system is thus achieved. Therefore, a virtual operating environment provided by the detection device may be compatible with the normal operation of a test file, thereby eliminating the dependence of the test file on a specific operating system, and thus achieving cross-platform detection of a malicious file.

Description

Malicious file detection method, device, equipment and storage medium

This application claims priority to Chinese patent application No. 202010065766.3, filed on January 20, 2020 and entitled "Malicious file detection method, device, equipment and storage medium", the entire content of which is incorporated herein by reference.

Technical field

This application relates to the field of computer technology, and in particular to a method, device, equipment and storage medium for detecting malicious files.

Background

A malicious file is a file containing a program written by a programmer with the intent to attack. Malicious files can spread over a network, and computers can receive them from the network. When a computer runs a malicious file, it performs malicious operations such as information theft, infection, or extortion according to the program contained in the file, which seriously affects the security of the network system. For this reason, malicious file detection has become a research hotspot in this field.

Dynamic behavior detection based on a virtualized operating environment is a newer type of malicious file detection technology. Its advantage is that it can find previously unknown malicious files. The basic principle is to use virtualization technology to generate a virtualized operating environment similar to the user's host, such as a virtual machine. A hook function is set in the virtualized operating environment to intercept calls to predetermined application programming interfaces (APIs). The test file (that is, the file to be tested) is run in the virtualized operating environment, and the hook function captures the series of API calls made while the test file runs, which yields the dynamic behavior of the test file. Whether the test file is a malicious file is then judged according to this dynamic behavior.
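The hook mechanism described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular sandbox: the decorator `hook`, the stand-in APIs `create_file` and `write_file`, and the stand-in sample `run_sample` are all hypothetical names introduced only for illustration.

```python
import functools

call_log = []  # API names recorded in the order they were called


def hook(api_name):
    """Return a decorator that records every call to the wrapped function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            call_log.append(api_name)      # intercept: note the call
            return func(*args, **kwargs)   # then let the real call proceed
        return wrapper
    return decorator


# Hypothetical stand-ins for APIs exposed by the virtualized environment.
@hook("CreateFile")
def create_file(path):
    return f"handle:{path}"


@hook("WriteFile")
def write_file(handle, data):
    return len(data)


def run_sample():
    """Hypothetical stand-in for the file under test."""
    handle = create_file("C:/tmp/a.txt")
    write_file(handle, b"hello")


run_sample()
print(call_log)
```

The recorded list plays the role of the API call series from which the dynamic behavior of the test file is derived.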

The current detection technology has limitations in its application. For example, in terms of usage scenarios, the current virtualization technology can only run on a detection device that supports the Microsoft Windows operating system (Windows); otherwise the test file cannot run and detection cannot be performed.

Summary of the invention

The embodiments of the present application provide a method, apparatus, device, and storage medium for detecting malicious files, which can resolve the limitations of existing malicious file detection technologies to a certain extent.

In a first aspect, a method for detecting malicious files is provided. In this method, a detection device obtains a test file, where the test file is an executable file that runs based on a first operating system; the detection device runs the test file in a virtual operating environment, where the virtual operating environment is generated based on container technology; the detection device obtains a first API sequence called by the test file during running, where the first API sequence includes at least one API, the APIs included in the first API sequence are APIs in a first API set, the first API set includes multiple APIs that are provided by the virtual operating environment and required for software to run, the identifiers of the APIs in the first API set are the same as the identifiers of the APIs in a second API set, and the second API set includes multiple APIs that are provided by the first operating system and required for software to run; the detection device executes a second API sequence in a second operating system, where the second API sequence includes at least one API, the APIs included in the second API sequence are APIs in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device; and the detection device determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process of calling the first API sequence.

The embodiments of the present application use a virtual operating environment generated based on container technology to simulate the operating environment provided by an operating system compatible with the test file. After the test file calls an API provided by the virtual operating environment, the detection device converts the API called by the test file in the virtual operating environment into an API provided by the operating system of the detection device, and executes the converted API in the operating system of the detection device. Executing the API of the operating system of the detection device thus achieves the effect of simulating the execution of the API of the first operating system. Therefore, the virtual operating environment provided by the detection device supports normal running of the test file, which removes the dependence of the test file on a specific architecture or platform (that is, the requirement that the detection device be based on a specific architecture or platform). This resolves, to a certain extent, the limitations of existing malicious file detection technology and enables cross-platform malicious program detection. In addition, because the virtual operating environment is generated based on container technology, it avoids the resource overhead caused by a hypervisor and a guest OS and runs directly on the kernel of the host. Since a container image is much smaller than a virtual machine image, the detection method of the embodiments of the present application is more lightweight, consumes fewer CPU processing resources, and occupies less memory. The detection method runs the malicious program at the process level, so detection is faster. It also avoids the time and performance overhead caused by repeatedly resetting virtual machines, as well as the overhead of operations such as creating and scheduling traditional virtual machines.

Optionally, the executing of the second API sequence by the detection device in the second operating system includes: the detection device obtains, according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment, thereby obtaining a first function sequence, where the functions included in the first function sequence are used to implement the APIs included in the first API sequence; the detection device obtains, according to each function in the first function sequence and the mapping relationship between functions, the mapped function from the dynamic link library of the second operating system, thereby generating a second function sequence, where the functions included in the second function sequence are used to implement the APIs included in the second API sequence, and a first function in the second function sequence has a mapping relationship with a first function in the first function sequence; and the detection device performs operations in the kernel of the second operating system according to the second function sequence.

The above process provides an optional implementation of operating system simulation. The functions that implement the APIs in the virtual operating environment are encapsulated in one dynamic link library, and the functions that implement the APIs in the operating system of the detection device are encapsulated in another dynamic link library. When the API sequence of the virtual operating environment is called, the two dynamic link libraries are accessed in turn to find a function sequence with similar functionality provided by the operating system of the detection device. Executing that function sequence simulates the effect of executing the API sequence of the first operating system and so achieves system simulation. The test file can therefore run normally in the virtual operating environment, which removes its dependence on the first operating system.
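The two-stage lookup described above can be sketched in Python as follows. This is an illustrative model under stated assumptions, not the patented implementation: the two dictionaries stand in for the two dynamic link libraries, and every name in them (`venv_create_directory`, `host_mkdir`, and so on) is hypothetical. The host functions are looked up but never executed here.

```python
import os

# Hypothetical library of the virtual operating environment:
# API identifier -> name of the function that implements it.
VIRTUAL_ENV_LIBRARY = {
    "CreateDirectory": "venv_create_directory",
    "DeleteFile": "venv_delete_file",
}

# Hypothetical mapping relationship between functions of the two libraries.
FUNCTION_MAPPING = {
    "venv_create_directory": "host_mkdir",
    "venv_delete_file": "host_unlink",
}

# Hypothetical library of the detection device's operating system:
# function name -> callable with similar functionality.
HOST_LIBRARY = {
    "host_mkdir": os.mkdir,
    "host_unlink": os.remove,
}


def translate(first_api_sequence):
    """Map a first API sequence to the first and second function sequences."""
    first_function_sequence = [
        VIRTUAL_ENV_LIBRARY[api] for api in first_api_sequence
    ]
    second_function_sequence = [
        HOST_LIBRARY[FUNCTION_MAPPING[name]] for name in first_function_sequence
    ]
    return first_function_sequence, second_function_sequence


first, second = translate(["CreateDirectory", "DeleteFile"])
print(first)
print([func.__name__ for func in second])
```

Keeping the function-to-function mapping in its own table means a new host operating system only needs a new mapping and host library, while the virtual environment's library stays unchanged.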

Optionally, the first operating system is a Windows operating system and the second operating system is a Linux operating system. The obtaining, by the detection device according to each API in the first API sequence, of the corresponding function from the dynamic link library of the virtual operating environment includes: the detection device obtains the corresponding function from a dynamic link library (DLL) file according to each API in the first API sequence. The obtaining, by the detection device according to each function in the first function sequence, of the mapped function from the dynamic link library of the second operating system includes: the detection device obtains the mapped function from a shared object (SO) file according to each function in the first function sequence and the mapping relationship between functions.

The above process provides an optional implementation for simulating the Windows operating system under the Linux operating system. The functions that implement the APIs in the virtual operating environment are encapsulated in a DLL file, and the functions that implement the APIs in the Linux operating system are encapsulated in an SO file. While the test file runs, when the API sequence of the virtual operating environment is called, the DLL file and the SO file are accessed in turn to find a function sequence, provided by the Linux operating system, whose functionality is similar to that of the Windows operating system. Executing that function sequence simulates the effect of executing the API sequence of the Windows operating system and so achieves system simulation. The test file can therefore run normally in the virtual operating environment, which removes its dependence on the Windows operating system. In this way, even if the test file is a PE file and the operating system of the detection device is Linux, the PE file can be dynamically detected under Linux without depending on the Windows operating system. A detection device based on the X86 platform can thus dynamically detect test files that run on the Windows system, which expands the usage scenarios of malicious file detection technology.

Optionally, the first operating system is a Linux operating system and the second operating system is a Windows operating system. The obtaining, by the detection device according to each API in the first API sequence, of the corresponding function from the dynamic link library of the virtual operating environment includes: the detection device obtains the corresponding function from an SO file according to each API in the first API sequence. The obtaining, by the detection device according to each function in the first function sequence, of the mapped function from the dynamic link library of the second operating system includes: the detection device obtains the mapped function from a DLL file according to each function in the first function sequence and the mapping relationship between functions.

The above process provides an optional implementation for simulating the Linux operating system under the Windows operating system. The functions that implement the APIs in the virtual operating environment are encapsulated in an SO file, and the functions that implement the APIs in the Windows operating system are encapsulated in a DLL file. While the test file runs, when the API sequence of the virtual operating environment is called, the SO file and the DLL file are accessed in turn to find a function sequence, provided by the Windows operating system, whose functionality is similar to that of the Linux operating system. Executing that function sequence simulates the effect of executing the API sequence of the Linux operating system and so achieves system simulation. The test file can therefore run normally in the virtual operating environment, which removes its dependence on the Linux operating system. In this way, even if the test file is an ELF file and the operating system of the detection device is Windows, the ELF file can be dynamically detected under Windows without depending on the Linux operating system. A detection device based on the X86 platform can thus dynamically detect test files that run on the Linux system, which expands the usage scenarios of malicious file detection technology.

Optionally, the executing of the second API sequence by the detection device in the second operating system includes: the detection device obtains first-type parameters called in the first API sequence, where the parameters included in the first-type parameters are input parameters of the APIs in the first API sequence; and in the second operating system, the detection device executes the second API sequence according to second-type parameters, where the parameters included in the second-type parameters are input parameters of the APIs in the second API sequence, and a first parameter in the second-type parameters has a mapping relationship with a first parameter in the first-type parameters.

The above process provides an optional implementation of system simulation. Because the input parameters of different APIs may differ, when input parameters are passed in a call to the API sequence of the virtual operating environment, they are mapped to the input parameters of the APIs of the operating system of the detection device. The API sequence is then executed with appropriate parameters, which avoids passing incorrect parameters when the API sequence is executed.
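As a hedged illustration of parameter mapping, the Python sketch below maps two kinds of input parameters from a Windows-style call to POSIX-style equivalents. `GENERIC_READ` and `GENERIC_WRITE` are the standard Windows access-right values; the helper names `map_access_flags` and `map_path`, the `/sandbox` root, and the assumption that every path starts with a drive letter are illustrative choices, not details taken from the application.

```python
import os

# Standard Windows generic access rights.
GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000


def map_access_flags(desired_access):
    """Map a Windows desired-access mask to a POSIX open() access mode."""
    can_read = bool(desired_access & GENERIC_READ)
    can_write = bool(desired_access & GENERIC_WRITE)
    if can_read and can_write:
        return os.O_RDWR
    if can_write:
        return os.O_WRONLY
    return os.O_RDONLY


def map_path(windows_path, root="/sandbox"):
    """Map a drive-letter path to a path under a hypothetical sandbox root.

    Assumes the input has the form '<drive>:\\...'.
    """
    drive, _, rest = windows_path.partition(":")
    return root + "/" + drive.lower() + rest.replace("\\", "/")


print(map_path("C:\\Users\\a.txt"))
print(map_access_flags(GENERIC_READ | GENERIC_WRITE) == os.O_RDWR)
```

Mapping each parameter separately keeps the translation of a call's arguments independent of the translation of the call itself.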

Optionally, the executing of the second API sequence by the detection device in the second operating system includes: the detection device obtains a first instruction sequence triggered while the test file runs, where the first instruction sequence includes at least one instruction, and each instruction in the first instruction sequence is used to instruct calling an API in the first API sequence; the detection device performs a first instruction conversion on the instructions in the first instruction sequence and obtains a second instruction sequence according to the result of the first instruction conversion, where the second instruction sequence includes at least one instruction, each instruction in the second instruction sequence is used to instruct calling an API in the second API sequence, and the first instruction conversion is used to convert instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device; and the detection device executes the second instruction sequence to implement the operations corresponding to the second API sequence.

In this optional manner, if the test file is an executable file written in instruction set A and the CPU of the detection device is based on the instruction set B architecture, the detection device performs instruction conversion to convert the instructions triggered by the test file from instructions in instruction set A into instructions in instruction set B. The CPU of the detection device can then execute the instructions triggered by the test file, so the test file runs normally. This technical means removes the dependence of running the test file on a specific instruction set architecture, which allows the malicious file detection solution to be applied widely across hardware environments.
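The idea of instruction conversion can be shown with a deliberately simplified Python sketch. Real binary translation between instruction sets is generally not a one-to-one opcode substitution; this toy assumes it is. The opcode names (`MOV_A`, `MOV_B`, and so on) and the tuple representation of an instruction are hypothetical.

```python
# Hypothetical one-to-one opcode table from instruction set A to set B.
A_TO_B_OPCODES = {
    "MOV_A": "MOV_B",
    "ADD_A": "ADD_B",
    "CALL_A": "CALL_B",
}


def convert(first_instruction_sequence):
    """Convert (opcode, *operands) tuples from set A into set B."""
    second_instruction_sequence = []
    for opcode, *operands in first_instruction_sequence:
        if opcode not in A_TO_B_OPCODES:
            raise ValueError(f"no translation for opcode {opcode!r}")
        # Operands are carried over unchanged in this toy model.
        second_instruction_sequence.append((A_TO_B_OPCODES[opcode], *operands))
    return second_instruction_sequence


program_a = [("MOV_A", "r1", 5), ("CALL_A", "CreateFile")]
print(convert(program_a))
```

Raising an error on an unknown opcode, rather than silently skipping it, makes gaps in the translation table visible while the test file runs.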

Optionally, the first operating system is a Windows operating system, and the computer instruction set architecture of the detection device is an advanced RISC machine (ARM) architecture. The performing, by the detection device, of the first instruction conversion on the instructions in the first instruction sequence and the obtaining of the second instruction sequence according to the result of the first instruction conversion include: the detection device converts each X86 instruction in the first instruction sequence into an ARM instruction, and obtains the second instruction sequence according to the converted ARM instructions.

In this optional manner, if the test file is an executable file written in the X86 instruction set and the CPU of the detection device is based on the ARM instruction set architecture, the detection device converts the X86 instructions triggered by the test file into ARM instructions. The CPU of the detection device can then execute the ARM instructions, so the test file runs normally. This technical means removes the dependence of running the test file on the X86 instruction set architecture, which allows the malicious file detection solution to be applied widely in ARM hardware environments.

Optionally, the first operating system is a Linux operating system, and the computer instruction set architecture of the detection device is an X86 architecture. The performing, by the detection device, of the first instruction conversion on the instructions in the first instruction sequence and the obtaining of the second instruction sequence according to the result of the first instruction conversion include: the detection device converts each ARM instruction in the first instruction sequence into an X86 instruction, and obtains the second instruction sequence according to the converted X86 instructions.

In this optional manner, if the test file is an executable file written in the ARM instruction set and the CPU of the detection device is based on the X86 instruction set architecture, the detection device converts the ARM instructions triggered by the test file into X86 instructions. The CPU of the detection device can then execute the X86 instructions, so the test file runs normally. This technical means removes the dependence of running the test file on the ARM instruction set architecture, which allows the malicious file detection solution to be applied widely in X86 hardware environments.

Optionally, after the detection device executes the second API sequence in the second operating system, the method further includes: the detection device obtains a third instruction sequence, where the third instruction sequence represents the result obtained after executing the second API sequence, and the instructions in the third instruction sequence belong to the computer instruction set of the detection device; the detection device performs a second instruction conversion on each instruction in the third instruction sequence and obtains a fourth instruction sequence according to the result of the second instruction conversion, where the instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment, and the second instruction conversion is used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based; and the detection device inputs the fourth instruction sequence into the virtual operating environment.

In this optional manner, the execution result of the API sequence is converted into a form compatible with the test file and returned to the test file running in the virtual operating environment, so that the test file can continue to run based on the result of its previous call to the API sequence and keep exhibiting its dynamic behavior.

Optionally, the container technology includes a Docker container technology, the virtual operating environment is started by a Docker daemon, and the Docker daemon is a process run by the detection device based on the second operating system.

In this optional manner, based on Docker container technology, the virtual operating environment is started by a Docker daemon; the virtual operating environment is, for example, an instance of a Docker container. Using Docker containers makes the solution more lightweight and enables process-level malicious file detection.
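For orientation only, the Python sketch below builds a `docker run` command line that asks the Docker daemon to start an isolated container for a test file. The options shown (`--rm`, `--network none`, `--read-only`, and a read-only bind mount with `-v`) are standard Docker CLI options; the image name `detector/virtual-env:latest`, the in-container path `/sample`, and the function name are hypothetical, and the application does not state that the container is started this way. The command is only constructed and printed, not executed.

```python
import shlex


def build_docker_command(sample_path, image="detector/virtual-env:latest"):
    """Build (but do not run) a docker command for one test file."""
    return [
        "docker", "run",
        "--rm",                             # remove the container on exit
        "--network", "none",                # no network access for the sample
        "--read-only",                      # read-only container filesystem
        "-v", f"{sample_path}:/sample:ro",  # mount the test file read-only
        image,                              # hypothetical image name
        "/sample",                          # argument passed to the image
    ]


cmd = build_docker_command("/tmp/test_file.exe")
print(shlex.join(cmd))
```

Building the argument list separately from executing it keeps the isolation settings easy to review before any test file is actually run.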

In a second aspect, an apparatus for detecting malicious files is provided. The apparatus includes at least one module, and the at least one module is used to implement the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect. For specific details of the apparatus provided in the second aspect, reference may be made to the first aspect or any one of the optional manners of the first aspect; details are not repeated here.

In a third aspect, a detection device is provided. The detection device includes a processor configured to execute instructions so that the detection device executes the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect. For specific details of the detection device provided in the third aspect, reference may be made to the first aspect or any one of the optional manners of the first aspect; details are not repeated here.

In a fourth aspect, a detection device is provided, the detection device including a network interface, a memory, and a processor connected to the memory,

The network interface is used to obtain a test file, and the test file is an executable file running based on a first operating system;

The memory is used to store program instructions;

The processor is configured to execute the program instructions, so that the detection device performs the following operations:

Running the test file in a virtual operating environment, the virtual operating environment being generated based on container technology;

Obtaining a first API sequence called while the test file runs, where the first API sequence includes at least one API, the APIs included in the first API sequence are APIs in a first API set, the first API set includes multiple APIs that are provided by the virtual operating environment and required for software to run, the identifiers of the APIs in the first API set are the same as the identifiers of the APIs in a second API set, and the second API set includes multiple APIs that are provided by the first operating system and required for software to run; executing a second API sequence in a second operating system, where the second API sequence includes at least one API, the APIs included in the second API sequence are APIs in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device; and determining whether the test file is a malicious file based on the behavior characteristics of the test file in the process of calling the first API sequence.

Optionally, the detection device provided in the fourth aspect is further configured to execute the malicious file detection method provided in any of the above-mentioned optional methods in the first aspect. For specific details of the detection device provided in the fourth aspect, reference may be made to the foregoing first aspect or any of the optional methods of the first aspect, and details are not described herein again.

In a fifth aspect, a computer-readable storage medium is provided. The storage medium stores at least one instruction, and the instruction is read by a processor to cause a detection device to execute the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect.

In a sixth aspect, a computer program product is provided. When the computer program product runs on a detection device, the detection device executes the malicious file detection method provided in the first aspect or any one of the optional manners of the first aspect.

In a seventh aspect, a chip is provided, when the chip runs on a detection device, the detection device executes the malicious file detection method provided in the first aspect or any one of the optional methods of the first aspect.

Description of the drawings

FIG. 1 is a schematic diagram of a network system provided by an embodiment of the present application;

FIG. 2 is a schematic structural diagram of a detection device provided by an embodiment of the present application;

FIG. 3 is a logical functional architecture diagram for detecting malicious files provided by an embodiment of the present application;

FIG. 4 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application;

FIG. 5 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application;

FIG. 6 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application;

FIG. 7 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application;

FIG. 8 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application;

FIG. 9 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application;

FIG. 10 is a flowchart of a method for protecting network security according to an embodiment of the present application;

FIG. 11 is a schematic diagram of an enterprise network provided by an embodiment of the present application;

FIG. 12 is a flowchart of a method for protecting network security according to an embodiment of the present application;

FIG. 13 is a flowchart of a method for protecting network security according to an embodiment of the present application;

FIG. 14 is a flowchart of a method for protecting network security according to an embodiment of the present application;

FIG. 15 is a flowchart of a method for protecting network security according to an embodiment of the present application;

FIG. 16 is a schematic diagram of a virtualization architecture provided by an embodiment of the present application;

FIG. 17 is a flowchart of a method for detecting malicious files provided by an embodiment of the present application;

FIG. 18 is a schematic structural diagram of a malicious file detection apparatus provided by an embodiment of the present application.

Detailed description

In order to make the purpose, technical solutions, and advantages of the present application clearer, the following further describes the embodiments of the present application in detail with reference to the accompanying drawings.

In this application, the terms "first", "second", and the like are used to distinguish between identical or similar items that have basically the same effect and function. It should be understood that there is no logical or temporal dependency among "first", "second", and "nth", and that these terms do not limit the number or the execution order.

The term "at least one" in this application means one or more, and the term "multiple" means two or more. For example, multiple second messages means two or more second messages. The terms "system" and "network" are often used interchangeably in this document.

It should also be understood that the term "if" can be interpreted to mean "when" or "upon", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "if it is determined..." or "if [the stated condition or event] is detected" can be interpreted to mean "when determining...", "in response to determining...", "when [the stated condition or event] is detected", or "in response to detecting [the stated condition or event]".

In the following, the technical terms involved in this application are introduced.

The instruction set refers to the set of instructions that a processor can recognize. Instruction sets include complex instruction set computing (CISC) and reduced instruction set computing (RISC).

X86 is a computer instruction set executed by microprocessors. X86 is the abbreviated designation of Intel's general-purpose computer series and also identifies a set of general-purpose computer instructions. X86 is a CISC instruction set. The name X86 comes from the Intel 8086 central processing unit introduced in 1978.

The X86 architecture usually refers to processors that are compatible with the X86 instruction set. The X86 architecture is widely used in personal computers (PCs).

The X86 platform generally refers to hardware devices based on the X86 architecture. Such a hardware device uses an Intel processor or another processor compatible with the X86 instruction set, and usually has the Microsoft Windows operating system (Windows) installed. For example, the hardware device is an X86 server.

ARM is a 32-bit RISC. The ARM architecture is usually used to refer to processors that are compatible with the ARM instruction set. The ARM architecture is widely used in mobile terminals.

ARM platform generally refers to hardware devices based on the ARM architecture. The hardware device uses a processor compatible with the ARM instruction set. In addition, the hardware device is usually installed with a Linux operating system (a set of free-to-use and freely distributed Unix-like operating systems).

An executable file refers to a file that can be loaded and executed by the operating system. Optionally, under the Windows operating system, executable files include portable executable (PE) files, and PE files include .exe files, .sys files, .com files, and other types of files. Here, "." is the separator between the file name and the extension. For example, an .exe file is a file with the extension exe. Under the Linux operating system, executable files include Executable and Linkable Format (ELF) files.
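As an illustration of the PE/ELF distinction above, the following minimal sketch classifies a byte buffer by its well-known magic bytes. It is not part of the described method; the function name `detect_executable_format` is hypothetical, and the check looks only at the file signature, not at whether the file is well-formed or runnable.

```python
import struct


def detect_executable_format(data: bytes) -> str:
    """Return 'ELF', 'PE', or 'unknown' based on well-known magic bytes."""
    # ELF files start with the 4-byte magic 0x7F 'E' 'L' 'F'.
    if data[:4] == b"\x7fELF":
        return "ELF"
    # PE files start with a DOS header ('MZ'). The 32-bit little-endian value
    # at offset 0x3C gives the offset of the 'PE\0\0' signature.
    if data[:2] == b"MZ" and len(data) >= 0x40:
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return "unknown"
```

A detection device could use a check of this kind to decide which simulated operating environment a test file needs, although the source does not say how that choice is made.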

A malicious file refers to a file containing a program written with malicious intent. Malicious files exploit vulnerabilities in computer systems to perform malicious tasks, such as stealing confidential information or destroying stored data. Malicious files are often executable files, such as viruses, worms, and Trojan horse programs that perform malicious tasks on computer systems. Malicious files can therefore cause serious damage to computer system security.

Static detection refers to a method of analyzing a computer program without running it. For example, whether a sample file is malicious is checked only by analyzing the source code, assembly, syntax, structure, processes, interfaces, and so on of the sample file.

Dynamic behavior detection refers to simulating the execution of the test file, obtaining the behavior or behavior sequence generated during that execution, matching it against the dynamic behavior characteristics of known malicious files, and judging whether the test file is a malicious file according to the matching result. At present, many dynamic behavior detection technologies are restricted by the operating environment. Specifically, after the test file is obtained, the test file often cannot be run because it is incompatible with the operating environment, so its dynamic behavior cannot be monitored and malicious files cannot be detected. Dynamic behavior detection is usually implemented using sandbox technology.
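The matching step described above can be sketched as follows. This is only an illustrative sketch: the source does not specify the matching rule, so the ordered-subsequence rule used here, the event names, and the function names `contains_in_order` and `matches_known_malware` are all assumptions.

```python
from typing import Iterable, Sequence


def contains_in_order(observed: Sequence[str], signature: Sequence[str]) -> bool:
    """Return True if every event in `signature` occurs in `observed`, in order.

    Other events may appear in between (subsequence match).
    """
    events = iter(observed)
    # `step in events` advances the iterator until it finds `step`,
    # so later steps can only match later events.
    return all(step in events for step in signature)


def matches_known_malware(observed: Sequence[str],
                          signatures: Iterable[Sequence[str]]) -> bool:
    """Return True if the observed behavior sequence matches any signature."""
    return any(contains_in_order(observed, sig) for sig in signatures)
```

For example, an observed sequence `["create_process", "write_file", "modify_registry"]` matches the hypothetical signature `["write_file", "modify_registry"]` but not the same events in reverse order.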

A sandbox is a security mechanism that isolates a test file during execution by providing a virtual operating environment. Programs running in the sandbox do not have a permanent impact on the hardware. Optionally, the sandbox can be implemented through the real operating system of the host, or through a virtual machine. To collect the behavior or behavior sequence generated while the test file runs, a monitoring program needs to be added to the sandbox. In a virtual machine running the Windows operating system, the driver framework provided by Microsoft is usually used to add monitoring programs. The monitoring programs monitor behaviors such as process creation, file creation, and registry modification.

Operating System (Operating System, abbreviation: OS) refers to a computer program that manages and controls computer hardware resources and software resources. The operating system is the basic system software, which is the interface between computer hardware and other software, and other software must run under the support of the operating system. The operating system can provide a running environment for executable files, such as providing APIs required for software operation.

The kernel of an operating system is software that forms the core part of the operating system. The kernel can be used to manage the various resources of the operating system. The kernel can be understood as a bridge, or an interface, between applications and hardware. The kernel is a software entity that runs directly on the hardware and provides application programs with access to computer hardware. In addition, the kernel can determine when, and for how long, a program operates on a given part of the hardware. The kernel can provide functions such as a hardware abstraction layer, disk and file system control, and multitasking.

Container technology is a kind of virtualization technology. Container technology can be used to generate containers, which provide a virtual operating environment for software execution. A container is a kind of software that packages executable files together with the resources on which they depend. The container contains the resources necessary for the executable file to run, such as code, the operating environment, system tools, system libraries, and settings. A container can be created from an image.

Compared with virtual machines, containers have the advantages of being lighter, occupying fewer resources, and running faster. Specifically, a virtual machine packages virtual hardware, a kernel (that is, an operating system), and user space into a new virtual machine. When an application runs in a virtual machine, the virtual machine first needs to virtualize a physical environment, then build a complete operating system, then build a runtime layer, and only then does the application run. A container, in contrast, usually installs a container layer directly on the operating system of the host. The container layer can be, for example, a Linux container (Linux Container, LXC) or libcontainer (a container management package in Docker, implemented in the Go language). There is no operating system inside the container; the container runs on the kernel of the physical machine, and multiple containers can share the operating system of the physical machine. Because the container directly uses the kernel of the host, the process of building an operating system and of assigning an independent operating system to the applications in the container is eliminated, and fewer objects are virtualized. For example, in some cases, only binary files and libraries need to be built independently for the container. The libraries contain the content on which the binary files depend, and there is no need to package a complete operating system as a virtual machine does, so the container is lighter and faster to start.

In addition, containers are more convenient to manage. Specifically, the running state of a container corresponds to a set of standard management operations, for example, starting the container, stopping the container, suspending the container, and deleting the container. The container can be conveniently managed through these standard management operations.
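The standard management operations mentioned above can be pictured as a small state machine. The sketch below is illustrative only: the state names, the `resume` operation, and the exact set of allowed transitions are assumptions rather than the behavior of any particular container runtime.

```python
# Hypothetical lifecycle table: (current state, operation) -> next state.
TRANSITIONS = {
    ("created", "start"): "running",
    ("stopped", "start"): "running",
    ("running", "suspend"): "suspended",
    ("suspended", "resume"): "running",
    ("running", "stop"): "stopped",
    ("suspended", "stop"): "stopped",
    ("created", "delete"): "deleted",
    ("stopped", "delete"): "deleted",
}


def apply_operation(state: str, operation: str) -> str:
    """Return the next container state, or raise if the operation is not allowed."""
    try:
        return TRANSITIONS[(state, operation)]
    except KeyError:
        raise ValueError(f"cannot {operation!r} a container in state {state!r}") from None
```

Because every container exposes the same small set of operations, a detection device can manage many analysis environments uniformly.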

An image is used to encapsulate the content required to run a container, such as programs, libraries, resources, configuration files, and configuration parameters. An image is usually stored in a layered structure and includes at least one image layer. For example, an image layer is a read-only template that is used to build the container. Image layers are used to store and migrate applications.

Cross-platform is a term in software technology. It refers to applications developed under one operating system that can still run normally under another operating system. For example, if application A is developed under the Windows operating system, and the application A can still run normally under the Linux operating system, application A can be called a cross-platform application. Under normal circumstances, cross-platform applications must meet the conditions of not relying on the operating system.

An application programming interface (API) is a communication interface between an operating system and an application program. The operating system provides an API for the application program, and the application program calls the API to instruct the operating system to perform operations. From a technical point of view, an API is a set of predefined functions. In layman's terms, the operating system can be regarded as a service center that provides various services for applications and encapsulates the instructions implementing each service in functions. If an application wants to use a certain service of the operating system, it calls the function corresponding to that service, and the operating system performs the operation corresponding to the function to provide the service to the application. Because the service object of this kind of function is an application, this kind of function is called an application programming interface. An API gives applications and developers access to a set of routines based on software or hardware, without the need to access source code or understand the internal working mechanism, which reduces learning cost and complexity.

Docker is an open-source software container platform, originally released by dotCloud (now Docker, Inc.), which supports the development, deployment, and running of containers. With Docker, users can easily create and use containers and put their own applications into containers. Containers also support version management, copying, sharing, and modification, and custom application images can be used to achieve continuous integration, continuous delivery, and deployment. Docker generally includes a Docker client (Docker Client) and a Docker daemon (Docker Daemon). The Docker daemon, also known as the Docker Engine, is a daemon process used to manage images and containers. It is a background process running on the operating system. The Docker client can trigger various instructions according to the user's input operations and interact with the Docker daemon through these instructions. The Docker daemon receives instructions from the Docker client, creates corresponding jobs according to those instructions, and executes the jobs.

An instance of a Docker container is the running state of a Docker image. Docker containers can be created, started, stopped, deleted, suspended, and so on. When the user inputs a viewing instruction (for example, a docker ps instruction), the computer responds to the viewing instruction and displays a list of the Docker containers running on the host. Docker containers have the advantage of being lighter. Specifically, a virtual machine usually includes a virtual machine management system (Hypervisor) and a guest operating system (Guest Operating System, Guest OS). The hypervisor is used to run a virtual guest operating system on the host operating system and to virtualize hardware resources. The guest operating system occupies a large amount of disk space, CPU, and memory. With Docker containers, the Docker daemon replaces the hypervisor and the guest OS. The Docker daemon is a background process running on the operating system and is responsible for managing Docker containers. The Docker daemon can communicate directly with the operating system of the host and allocate resources to each Docker container, eliminating the overhead incurred when virtual machines communicate indirectly through the hypervisor. In addition, whereas the hypervisor virtualizes hardware resources, Docker can use hardware resources directly, thereby improving the utilization of hardware resources.

A Docker image is used to create a Docker container. A Docker image is an executable package that contains the content required to run a Docker application, such as the code, libraries, environment variables, and configuration files of the Docker application. The Docker image can run in an environment with Docker Engine. When the Docker image runs, a Docker container is created from it. The Docker container can shield the software and hardware outside the container. Optionally, the Docker image includes a metadata file, a configuration file, and at least one image layer file. For example, the metadata file is a manifest.json file, and the metadata file records the metadata of all image layer files, for example, the sha256 value (hash value) of each image layer file. The configuration file records the memory size occupied by the Docker image, the type of instructions contained in the Docker image, and so on. The image layer file is a layer file.
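To illustrate how the recorded sha256 values can be used, the following sketch checks image layer contents against the digests listed in a metadata file. The JSON layout used here (a `layers` list with `name` and `sha256` fields) is a simplified, hypothetical structure chosen for the example and is not Docker's actual manifest.json schema; the function names are likewise assumptions.

```python
import hashlib
import json
from typing import Dict, List


def layer_digest(layer_bytes: bytes) -> str:
    """Return the hex-encoded sha256 digest of a layer's content."""
    return hashlib.sha256(layer_bytes).hexdigest()


def find_mismatched_layers(manifest_json: str, layers: Dict[str, bytes]) -> List[str]:
    """Return the names of layers whose content does not match the recorded digest.

    `manifest_json` uses a hypothetical layout:
    {"layers": [{"name": "...", "sha256": "..."}, ...]}
    """
    manifest = json.loads(manifest_json)
    mismatched = []
    for entry in manifest["layers"]:
        if layer_digest(layers[entry["name"]]) != entry["sha256"]:
            mismatched.append(entry["name"])
    return mismatched
```

An empty result means that every layer file matches its recorded hash value.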

A dynamic link library is used to encapsulate the functions and resources on which a running application depends. A dynamic link library is also called a shared function library or shared library. The functions and resources in a dynamic link library can be shared by multiple applications. Dynamic link libraries help to avoid code duplication and promote the effective use of memory, and allow an application to modularize each of its functions. A dynamic link library is usually stored in the computer in the form of a file. Optionally, in different operating systems, the files that encapsulate dynamic link libraries have different formats and different names. For example, under the Windows operating system, a dynamic link library is encapsulated in a DLL file; under the Linux operating system, a dynamic link library is encapsulated in a shared object (so) file.

The dynamic link library is an implementation of the API. Specifically, the DLL can encapsulate the code of the API and serve as the carrier of the API. In the process of executing the application program, if the application program triggers a call instruction to the API, the operating system will access the dynamic link library, obtain the code of the API from the dynamic link library, and run the code to perform the corresponding operation. For example, the operating system can provide a registry API, and when an application calls the registry API, it can access the registry. The code that needs to run when using the registry API can be stored in the dynamic link library.

A DLL file is a file that contains a dynamic link library in the Windows operating system. The DLL file contains many functions and resources on which a Windows program depends when it runs in the Windows environment. DLL files are also called "application extensions". The suffix of a DLL file is .dll. For example, DLL files include the kernel32.dll file, the user32.dll file, and the gdi32.dll file. These three files encapsulate API functions of the Windows operating system. Optionally, DLL files are stored in the system directory, for example, in the C:\Windows\System32\ directory. The kernel32.dll file is an important 32-bit dynamic link library file in Windows 9x/Me and is a kernel-level file. The user32.dll file provides the application programming interface related to the Windows user interface, including window handling and basic user interface features such as creating windows and sending messages. The gdi32.dll file is a dynamic link library stored in the Windows system folder. It is an application extension for the graphical user interface under Windows and is usually created automatically during the installation of the operating system. Many application programs are not a single complete executable file. Such an application program is divided into relatively independent dynamic link libraries, namely DLL files, which are placed in the system. When the application is executed, the DLL files corresponding to the application are called. One application can use multiple DLL files, and one DLL file may also be used by different applications.

The ntdll.dll file is a kind of DLL file. The ntdll.dll file contains functions for calling the kernel and can be understood as the core DLL file. In the Windows operating system, when an application calls the Windows API, it accesses a series of DLL files, and the calls to the functions in those DLL files are eventually directed to ntdll.dll. When a function in the ntdll.dll file is called, the kernel is called to perform the operation corresponding to the function. The ntdll.dll file is the entry point of the Windows system from ring3 to ring0. All Win32 APIs located in kernel32.dll and user32.dll are ultimately implemented by calling functions in ntdll.dll. The functions in ntdll.dll use the SYSENTER instruction to enter ring0, and the implementation entities of the functions are in ring0.

A shared object (so) file is a file containing a dynamic link library in the Linux operating system. The SO file includes the functions that the application of the Linux operating system depends on when running based on the Linux operating system. The suffix of SO files is .so. SO file is a binary ELF file. SO files are also called shared libraries or shared object libraries.

Hereinafter, the application scenario of the malicious file detection method provided by the embodiment of the present application is exemplarily introduced.

Please refer to FIG. 1. FIG. 1 is a schematic diagram of an application scenario of a malicious file detection method provided by an embodiment of the present application. The network system in FIG. 1 includes a detection device. Optionally, the scenario shown in FIG. 1 also includes an analysis device in the cloud.

The network system in FIG. 1 includes a data center 1102, a core office area, an office area A, and an office area B, and their respective local area networks 1103. The local area networks 1103 of the data center 1102, the core office area, the office area A, and the office area B are connected to the firewall 1105 through a switch. The firewall 1105 is further connected to the wide area network or the Internet through a router 1101, a network address translation (NAT) device (not shown in the figure), a gateway device (not shown in the figure), and so on. The firewall 1105 is used to isolate the network system from the wide area network or the Internet, and to perform security protection for data interacting between the network system and the wide area network or the Internet.

As shown in Figure 1, a possible deployment location of the detection device is the network exit of the network system, that is, between the firewall 1105 and the router 1101. For example, the detection device is integrated in an egress firewall, an egress router, or a bypass firewall. The detection device is used to defend against malicious files and malicious network traffic from the Internet. Optionally, the detection device is any one of a gateway device, a firewall device, an intrusion detection system (Intrusion Detection System, IDS) type device, and an intrusion prevention system (Intrusion Prevention System, IPS) type device. An IDS type device monitors the operating conditions of the network and systems through software and hardware in accordance with certain security policies, and discovers attack attempts, attack behaviors, or attack results, in order to ensure the confidentiality, integrity, and availability of network system resources. An IPS type device monitors the message transmission behavior of the network or of network equipment, and can immediately interrupt, adjust, or isolate abnormal or harmful message transmission behavior. Optionally, the detection device may also be an independent sandbox device or another device that integrates sandbox functions. For example, the detection device can be a security gateway, a firewall, and so on. Independent sandbox devices are usually deployed at the Internet egress of the enterprise in a bypass manner. For example, the enterprise's local area network is connected to the Internet through a gateway device or router, and the sandbox device is connected to the gateway device or router in a bypass manner.

In a possible implementation, the detection device is integrated in the analysis device in the cloud. The detection device provides detection services to other network devices through network applications and open service ports. In this case, the detection device receives a test file from another network device in the network (such as a security gateway or firewall), performs the detection method shown in the embodiments of the present application on the test file, and returns the detection result, which indicates whether the test file is a malicious file, to the network device that provided the test file.

The above describes the application scenarios of the malicious file detection method, and the following exemplarily introduces the detection device provided in the embodiment of the present application. Optionally, the detection device is the detection device in the network system shown in FIG. 1.

Please refer to FIG. 2, which is a schematic structural diagram of a detection device provided by an embodiment of the present application.

The detection device shown in FIG. 2 includes at least one processor 21, at least one memory 22, and a network interface 23. The processor 21, the memory 22, and the network interface 23 are connected to each other through a bus 24.

Optionally, the processor 21 is a general-purpose central processing unit (CPU), a network processor (NP), a microprocessor, or one or more integrated circuits for implementing the solution of the present application, for example, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. Optionally, the above-mentioned PLD is a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. Optionally, the processor 21 is a single-core processor (single-CPU). Optionally, the processor 21 is a multi-core processor (multi-CPU).

Optionally, the memory 22 includes a register and a volatile memory, such as a random-access memory (RAM). Optionally, the memory includes a non-volatile memory, for example, flash memory, a hard disk drive (HDD), a solid-state drive (SSD), cloud storage, network attached storage (NAS), or a network disk. Optionally, the memory may also include a combination of the above-mentioned types of memory, or any other medium or product with a storage function. For example, the memory 22 includes electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. Optionally, the memory 22 exists independently and is connected to the processor 21 through the bus 24. Optionally, the memory 22 and the processor 21 are integrated together.

The network interface 23 uses any device such as a transceiver for communicating with other devices or a communication network. The network interface 23 includes a wired communication interface. Optionally, the network interface 23 also includes a wireless communication interface. The wired communication interface is, for example, an Ethernet interface, such as a Gigabit Ethernet (GE for short) interface. Optionally, the Ethernet interface is an optical interface, an electrical interface or a combination thereof. Optionally, the wired communication interface is a fiber distributed data interface (Fiber Distributed Data Interface, FDDI for short) interface. Optionally, the wireless communication interface is a wireless local area network (WLAN) interface, a cellular network communication interface, or a combination thereof.

The bus 24 is used to transfer information between the above-mentioned components. Optionally, the bus 24 is divided into an address bus, a data bus, a control bus, and the like. For ease of representation, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus. For example, the bus 24 is a peripheral component interconnect express (PCIe, a high-speed serial computer expansion bus standard) bus, an Advanced Microcontroller Bus Architecture (AMBA) bus, a Huawei cache-coherent system (HCCS, a protocol standard for maintaining the consistency of service data between multiple ports) bus, or a peripheral component interconnect (PCI) bus.

Optionally, the detection device in FIG. 2 further includes an input and output interface 25, and the input and output interface 25 is used to connect the output device and the input device. The input device communicates with the processor 21. Optionally, the input device receives the user's input in multiple ways. For example, the input device is a mouse, a keyboard, a touch screen device, or a sensor device. The output device communicates with the processor 21. Optionally, the output device displays information in multiple ways. For example, the output device is a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector.

The hardware structure of the detection device is exemplarily introduced above, and the software architecture of the detection device is exemplarily described below.

Optionally, the software of the detection device adopts a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture, etc. The software of the detection device includes at least one functional module, and each functional module is implemented by software. In other words, the functional module is generated after the processor 21 of the detection device reads the program code stored in the memory 22.

For example, referring to FIG. 2, the software of the detection device includes a container 221, an instruction conversion module 222, and an operating system 223.

The container 221 is used to provide a virtual operating environment.

The instruction conversion module 222 is used to convert the instructions transmitted between the container 221 and the operating system 223. For example, when the container 221 sends an instruction to the operating system 223, the instruction conversion module 222 intercepts the instruction, converts the instruction, and sends the converted instruction to the operating system 223. For another example, when the operating system 223 sends an instruction to the container 221, the instruction conversion module 222 intercepts the instruction, converts the instruction, and sends the converted instruction to the container 221.
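The intercept, convert, and forward behavior described above can be sketched in both directions as follows. This is a minimal illustration under stated assumptions, not the implementation of the instruction conversion module 222: the `Instruction` and `InstructionConverter` names are hypothetical, and the name mapping passed in (for example, a Windows-style call name to a Linux-style call name) is purely illustrative and does not claim that the paired calls are semantically equivalent.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Instruction:
    side: str             # "container" or "host" (hypothetical labels)
    name: str             # name of the call being requested
    args: Tuple = ()      # arguments, passed through unchanged in this sketch


class InstructionConverter:
    """Converts instructions exchanged between a container and the host OS."""

    def __init__(self, container_to_host_names: Dict[str, str]):
        self._to_host = dict(container_to_host_names)
        # Reverse table for instructions travelling from the host to the container.
        self._to_container = {v: k for k, v in self._to_host.items()}

    def container_to_host(self, ins: Instruction) -> Instruction:
        if ins.name not in self._to_host:
            raise ValueError(f"no host equivalent for {ins.name!r}")
        return Instruction("host", self._to_host[ins.name], ins.args)

    def host_to_container(self, ins: Instruction) -> Instruction:
        if ins.name not in self._to_container:
            raise ValueError(f"no container equivalent for {ins.name!r}")
        return Instruction("container", self._to_container[ins.name], ins.args)
```

In this sketch the converter is a plain lookup table; a real module would also have to translate arguments, return values, and error codes.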

Optionally, the instruction conversion module 222 is provided outside the container 221.

Optionally, the software of the detection device further includes a container 224 and an instruction conversion module 225. The instruction conversion module 225 is provided in the container 224. The instruction conversion module 225 is used to convert the instructions transmitted between the container 224 and the operating system 223.

Optionally, the virtual environment provided by the container 224 is different from the virtual environment provided by the container 221. For example, the container 221 provides a virtual operating environment for simulating the Windows operating system, and the container 224 provides a virtual operating environment for simulating the Linux operating system. In this way, the same detection device can use the container 221 and the container 224 to provide multiple virtual operating environments.

Taking the ARM instruction set architecture as an example of the computer instruction set architecture of the detection device, the software architecture shown in FIG. 2 will be described as an example.

Optionally, the example of the Docker container in FIG. 3 is the container in FIG. 2. Please refer to Fig. 3. For example, the container 221 in Fig. 2 is Docker_1 or Docker_2 in Fig. 3. The container 224 in FIG. 2 is Docker_n in FIG. 3. Among them, Docker_1, Docker_2, and Docker_n are three instances of Docker containers. For example, the instruction conversion module 222 in FIG. 2 is the instruction conversion process 1 or the instruction conversion driver 1 in FIG. 3. The instruction conversion module 225 in FIG. 2 is the instruction conversion process 2 or the instruction conversion driver 2 in FIG. 3. For example, the operating system 223 in FIG. 2 is the Linux operating system in FIG. 3, and the Linux operating system can run based on the ARM instruction set.

The containers in Figure 3 can implement the function of system simulation. Referring to Figure 3, Docker_1 includes ntdll.dll, kernel32.dll, and a behavior monitoring module, where ntdll.dll and kernel32.dll are used for system simulation. ntdll.dll and kernel32.dll are dynamic link libraries of the virtual operating environment. The behavior monitoring module is used to monitor the dynamic behavior of the test file in the container.

Optionally, the Docker instance communicates with the operating system through the instruction conversion module. In other words, the instruction conversion module serves as a communication medium between the Docker instance and the operating system. Method 1 and Method 2 are described below as examples.

Method 1: An instruction conversion module is set outside the Docker instance. Specifically, the instruction conversion module receives the instruction generated by the Docker instance, converts the instruction, and sends the converted instruction to the operating system. For example, referring to Figure 3, instruction conversion process 1 or instruction conversion driver 1 is provided outside Docker_1. After Docker_1 generates X86 instructions, Docker_1 sends the X86 instructions to instruction conversion process 1 or instruction conversion driver 1. Instruction conversion process 1 or instruction conversion driver 1 receives the X86 instructions, converts them into ARM instructions, and sends the ARM instructions to the Linux operating system. After the Linux operating system executes the ARM instructions, it returns the execution result, carried in ARM instructions, to instruction conversion process 1 or instruction conversion driver 1. Instruction conversion process 1 or instruction conversion driver 1 converts the ARM instructions into X86 instructions and returns the X86 instructions to Docker_1. In this way, Docker_1 communicates with the Linux operating system through instruction conversion process 1 or instruction conversion driver 1. With this method, the Docker instance and the instruction conversion module have a clear division of labor. The function of system simulation is implemented by the Docker instance, and the function of instruction conversion is implemented by the instruction conversion module, thereby decoupling system simulation from instruction conversion. This makes it convenient to expand, upgrade, and update the system simulation function separately in the future.

Method 2: The Docker instance contains an instruction conversion module. The Docker instance communicates with the operating system through its internal instruction conversion module. For example, referring to Figure 3, Docker_n contains instruction conversion process 2 or instruction conversion driver 2. After the system simulation part in Docker_n generates X86 instructions, it sends the X86 instructions to instruction conversion process 2 or instruction conversion driver 2. Instruction conversion process 2 or instruction conversion driver 2 receives the X86 instructions and converts them into ARM instructions, and Docker_n sends the ARM instructions to the Linux operating system. After the Linux operating system executes the ARM instructions, it returns the execution result, carried in ARM instructions, to Docker_n. Docker_n receives the ARM instructions and converts them into X86 instructions through its own instruction conversion process 2 or instruction conversion driver 2, which then returns the X86 instructions to the system simulation part in Docker_n. In this way, Docker_n communicates with the Linux operating system through the instruction conversion process 2 or instruction conversion driver 2 that it contains.
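The difference between Method 1 and Method 2 lies only in where the conversion step sits, which the following sketch tries to make concrete. It is a toy model under loud assumptions: instructions are represented as strings with an `x86:` or `arm:` prefix, the "conversion" merely swaps the prefix, and `fake_linux_host` stands in for the Linux operating system. None of these names come from the source, and real instruction conversion is far more involved.

```python
from typing import Callable

HostOS = Callable[[str], str]


def x86_to_arm(instruction: str) -> str:
    # Toy conversion: swap the prefix only (assumption for illustration).
    if not instruction.startswith("x86:"):
        raise ValueError("expected an x86-format instruction")
    return "arm:" + instruction[len("x86:"):]


def arm_to_x86(instruction: str) -> str:
    if not instruction.startswith("arm:"):
        raise ValueError("expected an ARM-format instruction")
    return "x86:" + instruction[len("arm:"):]


def fake_linux_host(instruction: str) -> str:
    """Stand-in for the host OS: accepts ARM-format input, returns an ARM-format result."""
    if not instruction.startswith("arm:"):
        raise ValueError("host only executes ARM-format instructions")
    return instruction + ":ok"


class ExternalConverter:
    """Method 1: the converter lives outside the container."""

    def __init__(self, host: HostOS):
        self._host = host

    def forward(self, x86_instruction: str) -> str:
        return arm_to_x86(self._host(x86_to_arm(x86_instruction)))


class PlainContainer:
    """Method 1 container: only simulates the system, delegates conversion."""

    def __init__(self, converter: ExternalConverter):
        self._converter = converter

    def execute(self, x86_instruction: str) -> str:
        return self._converter.forward(x86_instruction)


class SelfConvertingContainer:
    """Method 2 container: converts internally and talks to the host itself."""

    def __init__(self, host: HostOS):
        self._host = host

    def execute(self, x86_instruction: str) -> str:
        arm_result = self._host(x86_to_arm(x86_instruction))
        return arm_to_x86(arm_result)
```

Both arrangements return the same result to the simulated system; Method 1 keeps conversion replaceable without touching the container, while Method 2 keeps the container self-contained.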

It should be understood that, when the software architecture described in FIG. 2 and FIG. 3 detects malicious files, the above division into functional modules is used only as an example. In practical applications, the above functions can optionally be allocated to different functional modules as needed; that is, the internal structure of the detection device is divided into different functional modules to complete all or part of the functions described above. In other words, the functional modules in FIG. 2 and FIG. 3 can be combined or split without affecting the overall function of the detection device. In addition, the software architecture provided in FIG. 2 and FIG. 3 belongs to the same concept as the method embodiment provided in FIG. 4 below; the specific implementation process is detailed in the method embodiment and is not repeated here.

The hardware structure and software architecture of the detection device are introduced above, and the following exemplarily introduces the method flow of the detection device for detecting malicious files.

Please refer to FIG. 4. FIG. 4 is a flowchart of a method for detecting malicious files according to an embodiment of the present application. FIG. 4 uses the detection device as the execution subject as an example for description. The method includes the following steps 401 to 406.

Step 401: The detection device generates a virtual operating environment based on the container technology.

Optionally, the virtual operating environment is isolated from the real operating environment of the host, so that the test files running in the virtual operating environment will not have a permanent impact on the hardware. The detection device runs the test file in the virtual operating environment, and then deletes the changes produced by the running test file. Optionally, the virtual operating environment is implemented through container technology, and the virtual operating environment is an instance of the container. For example, please refer to Figure 3. If a virtual operating environment is generated based on Docker container technology, the virtual operating environment is provided by Docker_1, Docker_2 or Docker_n in Figure 3.

Optionally, in some embodiments, a virtual operating environment is generated based on the image of the container. For example, based on the Docker container technology, the virtual operating environment is started through the Docker daemon, and the virtual operating environment is an instance of the Docker container. Using a Docker container instance has the advantage of being more lightweight and enables process-level malicious file detection.
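As a rough illustration of starting such an instance, the following sketch only assembles a command line for the Docker CLI and does not contact the Docker daemon. The image name sandbox/windows-sim:latest and the container name docker_1 are hypothetical. The use of --rm rests on the assumption that discarding the container on exit is how the changes produced by a test file are deleted; the embodiment does not prescribe this.

```python
from __future__ import annotations

import shlex


def build_run_command(image: str, name: str) -> list[str]:
    """Assemble the argv that starts one detached container instance."""
    return [
        "docker", "run",
        "--detach",      # return immediately; the instance keeps running
        "--rm",          # remove the container, and its changes, on exit
        "--name", name,  # name used later to address this instance
        image,
    ]


# Hypothetical image that packages a simulated Windows runtime.
cmd = build_run_command("sandbox/windows-sim:latest", "docker_1")
print(shlex.join(cmd))
```

The list returned here could be handed to subprocess.run() on a host where Docker is installed.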

It should be understood that step 401 is described only by taking the generation of one virtual operating environment as an example. In some optional embodiments, the detection device generates multiple virtual operating environments, and the generation process of each virtual operating environment may be similar to step 401. By generating multiple virtual operating environments, multiple sample files can be detected in parallel, thereby improving the detection efficiency. In addition, optionally, different virtual operating environments are used to simulate different operating systems, for example, virtual operating environment A simulates a Windows operating system and virtual operating environment B simulates a Linux operating system, so as to meet the operating requirements of different sample files.
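The parallel detection mentioned above can be pictured with a thread pool, as in the minimal sketch below. The function run_in_environment and the sample names are placeholders introduced for illustration; in a real device the function would submit the sample to the named container instance.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_environment(job):
    """Placeholder for running one sample in one virtual operating environment."""
    environment, sample = job
    return f"{sample} analysed in {environment}"


# Hypothetical (environment, sample) pairs.
jobs = [("Docker_1", "a.exe"), ("Docker_2", "b.elf"), ("Docker_1", "c.exe")]

# Each job runs on its own worker; map() returns results in input order.
with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    reports = list(pool.map(run_in_environment, jobs))

for line in reports:
    print(line)
```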

Step 402: The detection device obtains a test file.

Optionally, there are multiple ways to obtain the test file. For example, the detection device receives a test file input by the user. In another example, the detection device collects the data stream transmitted in the network, and obtains the file carried by the data stream by reassembling the payloads of the packets contained in the data stream. For another example, the detection device parses a parent file in which the test file is embedded to obtain the test file carried in the parent file. Here, the parent file is the file that contains the test file. The parent file is also called the original file, or has different names depending on the manufacturer, standard, or scenario. For example, the parent file is an email, the test file is an attachment carried in the email, and the attachment is an executable file. The detection device parses the email and obtains the attachment carried in it. For another example, the parent file is a Word document, the test file is an executable file linked in the Word document, and the detection device parses the Word document to obtain the test file.
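For the email case, one possible way to parse the parent file is Python's standard email package, as sketched below. The message built here (subject, body, attachment name and bytes) is made up so that the example is self-contained; the embodiment does not specify a particular parser.

```python
from __future__ import annotations

from email import message_from_bytes, policy
from email.message import EmailMessage


def extract_attachments(raw_mail: bytes) -> dict[str, bytes]:
    """Return {file name: content} for every attachment in a raw email."""
    msg = message_from_bytes(raw_mail, policy=policy.default)
    found = {}
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (for example base64).
        found[part.get_filename()] = part.get_payload(decode=True)
    return found


# Build a small parent file that carries one made-up attachment.
parent = EmailMessage()
parent["Subject"] = "invoice"
parent.set_content("See attachment.")
parent.add_attachment(
    b"MZ\x90\x00",
    maintype="application",
    subtype="octet-stream",
    filename="invoice.exe",
)

attachments = extract_attachments(parent.as_bytes())
print(sorted(attachments))
```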

The test file is an executable file that runs on the first operating system. The first operating system is an operating system compatible with the test file. For example, if the test file is a PE file, the first operating system is a Windows operating system; if the test file is an ELF file, the first operating system is a Linux operating system. As a further example, if the test file is an .exe file, a .sys file, or a .com file, the first operating system is the Windows operating system.

It should be understood that there are many types of executable files and operating systems. The Linux operating system and the Windows operating system are examples of the first operating system. Optionally, the first operating system is the Android operating system (an operating system developed by Google), the iOS operating system (Apple's mobile operating system), the Mac OS operating system (Apple's operating system), the BlackBerry operating system (BlackBerry OS), the UNIX operating system (a multi-user, multitasking operating system) or the NetWare system (a network operating system launched by Novell). Correspondingly, the test file is an executable file based on one of these operating systems. Of course, the operating systems listed above are only examples of the first operating system, and this embodiment does not limit the specific type of the first operating system.

It should be pointed out that the step of generating a virtual operating environment shown in step 401 is executed once, and the step of obtaining a test file shown in step 402 is executed multiple times. In other words, it is not necessary to regenerate the virtual operating environment before each test file is obtained. After the virtual operating environment is generated in step 401, the detection device executes the processes shown in step 402 to step 406 on multiple test files based on the first operating system simulated by the virtual operating environment.

Step 403: The detection device runs the test file in the virtual operating environment.

For example, the detection device transfers the test file to an instance of the Docker container, and sends an instruction to start and run the test file to the instance of the Docker container. The instance of the Docker container will run the test file in response to the instruction to start and run the test file, so that the test file can be run in the virtual operating environment.
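The transfer and start-up described above could, for instance, be expressed as two Docker CLI invocations. The sketch below only builds the two argument lists. The container name and both paths are hypothetical, and launching the copied file directly assumes that the simulated environment in the instance can start it.

```python
from __future__ import annotations


def build_submit_commands(container: str, host_path: str, guest_path: str) -> list[list[str]]:
    """Build the commands that copy a test file into an instance and start it."""
    copy_cmd = ["docker", "cp", host_path, f"{container}:{guest_path}"]
    # --detach lets the test file run in the background inside the instance.
    run_cmd = ["docker", "exec", "--detach", container, guest_path]
    return [copy_cmd, run_cmd]


# Hypothetical container name and file locations.
commands = build_submit_commands("docker_1", "/tmp/sample.exe", "/samples/sample.exe")
for argv in commands:
    print(" ".join(argv))
```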

In some optional embodiments, the detection device generates only one virtual operating environment, and sends the test file to that virtual operating environment to run. For example, if test files are to be run by simulating a Windows operating system, a virtual operating environment that simulates the Windows operating system is generated, and test files are always sent to this virtual operating environment. In other optional embodiments, the detection device generates multiple virtual operating environments, executes the optional processes described in the following steps 4031 to 4033, and sends the test file to the virtual operating environment that it requires.

Step 4031. The detection device determines the file type of the test file.

Optionally, the detection device uses one of multiple methods to determine the file type of the test file. For example, the detection device recognizes the file type of the test file directly according to the file suffix name. For another example, the detection device recognizes the file type of the test file according to the file header information. Specifically, the detection device pre-stores the data structures of the file headers (or files) of various file types. After receiving the test file, the detection device sequentially compares the file header of the test file with the data structures of the file headers of the various file types to obtain the data structure to which the file header of the test file conforms, and uses the file type corresponding to that data structure as the file type of the test file. Other ways for the detection device to determine the file type are not listed here.
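A simplified version of the header comparison might look like the following. It relies on two well-known facts: an ELF file starts with the bytes 0x7F 'E' 'L' 'F', and a PE file starts with 'MZ' and stores, at offset 0x3C, the offset of the 'PE\0\0' signature. The function name and the synthetic sample headers are illustrative only, and a real device would compare more fields than these.

```python
from __future__ import annotations

import struct


def detect_file_type(data: bytes) -> str | None:
    """Classify a file as 'ELF' or 'PE' from its header, else return None."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if len(data) >= 0x40 and data[:2] == b"MZ":
        # The 4-byte little-endian field at 0x3C points to the PE signature.
        (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
        if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
            return "PE"
    return None


# Synthetic headers: just enough bytes to satisfy the checks above.
sample_pe = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40) + b"PE\x00\x00"
sample_elf = b"\x7fELF" + bytes(12)

print(detect_file_type(sample_pe), detect_file_type(sample_elf))
```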

Step 4032. The detection device determines the virtual operating environment required by the test file according to the file type of the test file.

Optionally, the detection device pre-stores the mapping relationship between the file type and the virtual operating environment, and by querying the mapping relationship, the operating environment required for the operation of the test file is determined. For example, the mapping relationship is shown in Table 1.

Table 1

File type	Virtual operating environment required by the test file
PE file	Docker_1
ELF file	Docker_2

Step 4033: The detection device runs the test file in the virtual operating environment required by the test file.

For example, if the detection device determines through suffix comparison that the suffix of a test file is ".elf", it determines that the file type of the test file is an ELF file. By querying Table 1, the detection device then learns that the operating environment required by the ELF file is Docker_2, and sends the test file to Docker_2 to run. Here, Docker_2 is used to simulate the operating environment of the Linux operating system.

For another example, the detection device determines that the test file conforms to the data structure of the file header of the PE file according to the content of the designated field in the file header of the test file, and therefore determines that the file type of the test file is a PE file. Then, the detection device queries Table 1 to know that the operating environment required by the PE file is Docker_1, and sends the test file to Docker_1 for operation. Among them, Docker_1 is used to simulate the operating environment of the Windows operating system.
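The lookup in Table 1 amounts to a dictionary query, sketched below. The names ENVIRONMENT_BY_TYPE and select_environment are introduced here for illustration, and raising an error for an unmapped file type is an assumption, since the embodiment does not say what happens in that case.

```python
# Table 1 expressed as a mapping from file type to virtual operating environment.
ENVIRONMENT_BY_TYPE = {
    "PE": "Docker_1",   # simulates the Windows operating system
    "ELF": "Docker_2",  # simulates the Linux operating system
}


def select_environment(file_type: str) -> str:
    """Return the virtual operating environment required by a file type."""
    try:
        return ENVIRONMENT_BY_TYPE[file_type]
    except KeyError:
        raise ValueError(f"no virtual operating environment for {file_type!r}") from None


print(select_environment("ELF"))
```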

Step 404: The detection device obtains the first API sequence called during the running of the test file.

For example, if the test file is a file in the PE format, the two APIs CreateFile() and WriteFile() are called in sequence during the running of the test file, and the first API sequence includes CreateFile() and WriteFile().

The virtual operating environment provides multiple APIs required for running the software provided by the test file. To distinguish these from the APIs that the operating system simulated by the virtual operating environment (that is, the operating system compatible with the test file) provides for running the same software, the embodiment of the present application refers to the multiple APIs that the virtual operating environment provides for running the software provided by the test file as the first API set. The first API set includes multiple APIs. While the test file runs in the virtual operating environment, the test file may call all APIs in the first API set; alternatively, the test file may call only some APIs in the first API set, and the remaining APIs in the first API set are not called by the test file. To distinguish between the APIs called by the test file and the APIs actually executed, the embodiment of the present application refers to the series of APIs in the first API set that are called by the test file as the first API sequence, and to the series of APIs actually executed by the detection device as the second API sequence. In addition, optionally, while running, the test file calls only one API in the first API set, and the first API sequence includes only this one API. Optionally, over time, the test file successively calls multiple APIs in the first API set, and the called APIs form the first API sequence, which then includes multiple APIs. That is, the first API sequence includes at least one API, and this embodiment does not limit whether the first API sequence includes one API or multiple APIs.

In addition, optionally, the first API sequence includes multiple APIs, and different APIs in the first API sequence are sorted in order of the time when they are called. For example, if the test file first calls API_1 provided by the virtual operating environment, then calls API_2 provided by the virtual operating environment, and finally calls API_3 provided by the virtual operating environment, the first API sequence is expressed as (API_1, API_2, API_3).
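One way to picture how a call-ordered sequence arises is a recorder that wraps each API provided by the virtual operating environment. The sketch below is an assumption about mechanism, since the embodiment does not state how calls are captured; the class ApiRecorder and the stub bodies of API_1 to API_3 are hypothetical.

```python
import functools


class ApiRecorder:
    """Collects the names of hooked APIs in the order they are called."""

    def __init__(self):
        self.sequence = []

    def hook(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            self.sequence.append(func.__name__)  # record, then run the API
            return func(*args, **kwargs)
        return wrapper


recorder = ApiRecorder()


@recorder.hook
def API_1():
    return "first"


@recorder.hook
def API_2():
    return "second"


@recorder.hook
def API_3():
    return "third"


# The "test file" calls the APIs one after another.
API_1()
API_2()
API_3()
print(tuple(recorder.sequence))
```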

In the embodiment of the present application, the second API set denotes the multiple APIs that the operating system simulated by the virtual operating environment, that is, the operating system compatible with the test file, provides for running the software provided by the test file. In this embodiment, the second API set consists of the APIs provided by the first operating system for running the software provided by the test file. The second API set includes multiple APIs.

The identifier of an API in the first API set is the same as the identifier of the corresponding API in the second API set. Optionally, the identifier of an API is the name of the API. The API identifier is used to identify the corresponding API. The test file calls an API in the currently running operating environment through the identifier of the API. In other words, whether the test file runs in the virtual operating environment or in the first operating system, it uses the same API identifier to call the API in the virtual operating environment or the API in the first operating system.

For example, the identifier of an API for writing files is WriteFile() or fwrite(). Take as an example the case where the test file is a file in the PE format and the first operating system is the Windows operating system. The Windows operating system provides the second API set for the PE file. The second API set includes an API for writing files, an API for creating files, an API for reading files, and so on. In the second API set, the identifier of the API for writing files is WriteFile(), the identifier of the API for creating files is CreateFile(), and the identifier of the API for reading files is ReadFile(). The virtual operating environment provides the first API set for the PE file. The first API set likewise includes an API for writing files, an API for creating files, an API for reading files, and so on. In the first API set, the identifier of the API for writing files is WriteFile(), the identifier of the API for creating files is CreateFile(), and the identifier of the API for reading files is ReadFile().

In some possible embodiments, the image of the aforementioned container may be provided to the user by the software provider. In the process of packaging the image of the container, the software provider encapsulates the first API set in the image. After the detection device generates an instance of the container based on the image, the instance of the container provides the first API set for the test file running in it. For example, to simulate the operating environment provided by the Windows operating system through container technology, the software provider encapsulates in the image some APIs whose identifiers are the same as those of APIs in the Windows operating system. To simulate the operating environment provided by the Linux operating system through container technology, the software provider encapsulates in the image some APIs whose identifiers are the same as those of APIs in the Linux operating system.

Step 405: The detection device executes the second API sequence in the second operating system.

The second operating system is an operating system based on the computer instruction set architecture of the detection device. In other words, the second operating system is the real operating environment on the detection device. For example, the processor of the detection device adopts the X86 architecture, and the second operating system is the Windows operating system (of course, an operating system based on the X86 architecture may also be the Linux operating system; the Windows operating system is used here only for illustration). For another example, the processor of the detection device adopts the ARM architecture, and the second operating system is the Linux operating system (of course, an operating system based on the ARM architecture may also be an operating system developed by the manufacturer of the detection device, or a specific Windows-series operating system such as Windows 10). To distinguish it from the aforementioned API sequence composed of the APIs called in the virtual operating environment generated based on container technology, the embodiment of the present application refers to the sequence composed of the APIs actually executed in the real operating environment (that is, the second operating system) as the second API sequence, and to the sequence composed of the APIs called in the virtual operating environment generated based on container technology as the first API sequence.

Optionally, the second operating system may be the same as the first operating system, or may be different from it. For ease of understanding, the embodiment of the present application is illustrated with a second operating system that is different from the first operating system. When the second operating system is different from the first operating system, the APIs in the first API sequence are different from the APIs in the second API sequence. When the second operating system is the same as the first operating system, the identifiers of the APIs in the first API sequence may be the same as the identifiers of the APIs in the second API sequence.

In some embodiments, if the first API sequence is called, the detection device determines the second API sequence according to the first API sequence and executes the second API sequence in the second operating system, thereby instructing the hardware of the detection device to perform the corresponding operations. In this way, the execution of the API sequence of the first operating system is simulated and the effect of operating system simulation is achieved.

The second API sequence includes at least one API, and the API in the second API sequence is an API in the second operating system. The first API in the second API sequence has a mapping relationship with the first API in the first API sequence. For example, the first operating system is a Windows operating system. The second operating system is the Linux operating system. The first API sequence is (CreateFile(), ReadFile()). The second API sequence is (fcreate(), fread()). The fcreate() and CreateFile() in the second API sequence have a mapping relationship. The fread() and ReadFile() in the second API sequence have a mapping relationship.

In addition, if the second API sequence includes multiple APIs, the APIs in the second API sequence are sorted according to the order of the corresponding APIs in the first API sequence. For the sake of brevity, and without making the description harder to understand, the following embodiments of this application use the form "API_number" as shorthand for specific APIs. For example, specific APIs such as CreateFile(), ReadFile(), fcreate() and fread() are each written in a form such as API_1.

For example, suppose the first API sequence is (API_1, API_2, API_3). If API_4 in the second operating system has a mapping relationship with API_1, API_5 in the second operating system has a mapping relationship with API_2, and API_6 in the second operating system has a mapping relationship with API_3, then the second API sequence is (API_4, API_5, API_6).

Optionally, in some embodiments, the mapping relationship between the API in the first API sequence and the API in the second API sequence is constructed in this way. Determine the function of the API of the first operating system, determine the function of the API of the second operating system, obtain APIs with similar functions of the first operating system and the second operating system, and establish a mapping relationship for the APIs with similar functions. It should be understood that, optionally, multiple APIs of the second operating system are used to implement one API of the first operating system, and then in the mapping relationship, the multiple APIs of the second operating system correspond to one API of the first operating system. In addition, optionally, an API of the second operating system is used to implement an API of the first operating system, and in the mapping relationship, an API of the second operating system corresponds to an API of the first operating system. This embodiment does not specifically limit whether the mapping relationship between APIs is a one-to-one relationship or a one-to-many relationship.
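The conversion from the first API sequence to the second API sequence, including the one-to-many case, can be sketched as a table-driven expansion. The entries for API_1 to API_3 follow the example given earlier; the entry for API_7, which maps to two APIs of the second operating system, is a hypothetical addition used only to show the one-to-many case.

```python
# Each API of the first operating system maps to one or more APIs of the second.
API_MAP = {
    "API_1": ["API_4"],
    "API_2": ["API_5"],
    "API_3": ["API_6"],
    "API_7": ["API_8", "API_9"],  # hypothetical one-to-many entry
}


def to_second_sequence(first_sequence):
    """Translate a first API sequence, preserving the original call order."""
    second = []
    for api in first_sequence:
        second.extend(API_MAP[api])
    return tuple(second)


print(to_second_sequence(("API_1", "API_2", "API_3")))
print(to_second_sequence(("API_7", "API_1")))
```

Storing a list for every entry keeps one-to-one and one-to-many mappings in the same structure, so the translation loop does not need to distinguish between them.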

For example, in the case where the first operating system and the second operating system are each an operating system that a computer device actually runs, the mapping relationship between the first API in the first API sequence and the first API in the second API sequence is constructed as follows: if calling, in the first operating system, the API with the same identifier as the first API in the first API sequence instructs the device to perform operation A, calling the first API in the second API sequence in the second operating system instructs the device to perform operation B, and operation A is the same as operation B, then a mapping relationship between the first API in the first API sequence and the first API in the second API sequence is established. For example, under the Windows operating system, the computer device instructs the file writing operation by calling the WriteFile() API; under the Linux operating system, the computer device instructs the file writing operation by calling the fwrite() API. Since both WriteFile() and fwrite() are used to instruct the execution of a file write operation, a mapping relationship between WriteFile() and fwrite() can be established.

In some embodiments, in the process of packaging the image of the container, the software provider encapsulates in the image the mapping relationship between the APIs in the first API set and the APIs in the second operating system. After the detection device generates an instance of the container based on the image, the instance of the container can access the mapping relationship between the APIs in the first API set and the APIs in the second operating system.

Exemplarily, the process of determining the first API in the second API sequence includes: the detection device determines the first API from the first API sequence. The detection device uses the identifier of the API in the first API sequence as an index to query the mapping relationship between the APIs to obtain the identifier of the first API in the second API sequence. The detection device determines the first API in the second API sequence according to the identifier of the first API in the second API sequence. For example, referring to Table 2 below, if the first API sequence is (CreateFile(), ReadFile()), the detection device determines ReadFile() from the first API sequence. The detection device uses ReadFile() as an index, queries Table 2, and determines that the first API in the second API sequence is fread(). By analogy, optionally, in the same way, other APIs other than the first API in the second API sequence are determined, and the determined APIs are further sorted and combined to obtain the second API sequence.

Table 2

Windows operating system API	Linux operating system API
WriteFile()	fwrite()
ReadFile()	fread()
CreateFile()	fcreate()

Through the above method, the embodiment of the application actually executes the API in the second operating system rather than the API in the first operating system, while achieving the effect of simulating the running process of the test file in the first operating system. For example, referring to Table 2 above, the execution process of WriteFile() of the Windows operating system is simulated by executing fwrite() of the Linux operating system, and the execution process of CreateFile() of the Windows operating system is simulated by executing fcreate() of the Linux operating system. It can be seen that the virtual operating environment can simulate the APIs that the first operating system provides for running the software provided by the test file. Since the API calls made by the test file can be answered, the purpose of operating system simulation is achieved, so that the test file can run normally in the virtual operating environment, which removes the dependence of the test file on the first operating system.

Optionally, in some possible embodiments, calling the API is implemented based on a dynamic link library. Accordingly, the process of using the API of the second operating system to simulate the API of the first operating system includes the following steps 1 to 3.

Step 1: The detection device obtains the corresponding function from the dynamic link library of the virtual operating environment according to each API in the first API sequence, thereby obtaining the first function sequence.

The code of the test file references functions and resources in the dynamic link library. While the detection device runs the test file, after the test file calls the first API sequence, the detection device accesses the dynamic link library of the virtual operating environment and obtains the first function sequence from the dynamic link library. The first function sequence includes at least one function, and the functions in the first function sequence are used to implement the APIs in the first API sequence. For example, if the test file is a file in PE format and the API sequence called by the test file is (CreateFile(), ReadFile()), a function in the first function sequence is the function used to implement CreateFile() or the function used to implement ReadFile().

Step 2: The detection device obtains the corresponding function from the dynamic link library of the second operating system according to each function in the first function sequence, thereby generating the second function sequence.

The second function sequence includes at least one function, and the functions in the second function sequence are used to implement the APIs in the second API sequence. For example, if the operating system of the detection device is a Linux operating system and the API in the second API sequence is fcreate(), the function in the second function sequence is a function for implementing fcreate(). The first function in the second function sequence has a mapping relationship with the first function in the first function sequence. For example, the function used to implement fcreate() has a mapping relationship with the function used to implement CreateFile().

Exemplarily, the process of determining the first function in the second function sequence is as follows. The detection device selects the first function from the first function sequence. The detection device uses the identifier of the first function in the first function sequence as an index to query the mapping relationship between the identifiers of the functions in the first function sequence and the identifiers of the functions in the second function sequence, thereby obtaining the identifier of the first function in the second function sequence. The detection device determines the first function in the second function sequence according to this identifier. In the same way, the functions in the second function sequence that correspond to the other functions in the first function sequence are determined, for example, the second function in the second function sequence corresponding to the second function in the first function sequence, and so on.

After the detection device obtains the function in the second function sequence corresponding to each function in the first function sequence, it combines the obtained functions to form the second function sequence.

It should be understood that, optionally, multiple functions of the second operating system are used to implement one function of the first operating system, in which case the multiple functions of the second operating system correspond to one function of the first operating system in the mapping relationship between functions. In addition, optionally, one function of the second operating system is used to implement one function of the first operating system, in which case one function of the second operating system corresponds to one function of the first operating system in the mapping relationship. This embodiment does not specifically limit whether the mapping relationship between functions is a one-to-one relationship or a one-to-many relationship.

In some possible embodiments, in the process of packaging the image of the container, the software provider encapsulates in the image the mapping relationship between the functions in the first function sequence and the functions in the second function sequence. After an instance of the container is generated based on the image, the instance of the container can access the mapping relationship between the functions in the first function sequence and the functions in the second function sequence.

Step 3: The detection device executes operations in the kernel of the second operating system according to the second function sequence.

Through the above implementation, the detection device converts the process of calling the dynamic link library of the virtual runtime environment into the process of calling the function of the second operating system, thereby using the function provided by the second operating system to simulate the function of the first operating system.
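Steps 1 to 3 can be sketched end to end with plain dictionaries standing in for the two libraries. Everything named here is hypothetical: VIRTUAL_LIBRARY plays the role of the dynamic link library of the virtual operating environment, the identifiers vfn_create and vfn_write are made up, and the host functions wrap Python's os.open() and os.write() rather than the fcreate() and fwrite() used in the description.

```python
import os
import tempfile

# Step 1 data: API identifier -> function identifier in the virtual library.
VIRTUAL_LIBRARY = {"CreateFile": "vfn_create", "WriteFile": "vfn_write"}


def host_create(path):
    """Host-side function that creates a file and returns a descriptor."""
    return os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)


def host_write(fd, data):
    """Host-side function that writes bytes to an open descriptor."""
    return os.write(fd, data)


# Step 2 data: function of the virtual library -> function of the host system.
FUNCTION_MAP = {"vfn_create": host_create, "vfn_write": host_write}


def resolve(first_api_sequence):
    """Return the first and second function sequences for the called APIs."""
    first_functions = [VIRTUAL_LIBRARY[api] for api in first_api_sequence]
    second_functions = [FUNCTION_MAP[name] for name in first_functions]
    return first_functions, second_functions


first_fns, second_fns = resolve(["CreateFile", "WriteFile"])

# Step 3: execute the second function sequence on the host operating system.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.bin")
    create, write = second_fns
    fd = create(path)
    try:
        written = write(fd, b"hello")
    finally:
        os.close(fd)
    with open(path, "rb") as fh:
        content = fh.read()

print(first_fns, written, content)
```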

Optionally, in some embodiments, because the parameters of different functions may differ, for example in their encoding methods and value ranges, the detection device further converts between the parameters of different functions. For example, the detection device obtains the first type of parameters passed when the first API sequence is called, and the detection device executes the second API sequence in the second operating system according to the second type of parameters.

Here, the first type of parameters includes at least one parameter, and the parameters included in the first type of parameters are input parameters of the APIs in the first API sequence. The second type of parameters includes at least one parameter, and the parameters included in the second type of parameters are input parameters of the APIs in the second API sequence. The first parameter in the second type of parameters has a mapping relationship with the first parameter in the first type of parameters. For example, the API in the first API sequence is CreateFile(), and the first parameter in the first type of parameters is the parameter representing the file name in CreateFile(). The API in the second API sequence is fcreate(), and the first parameter in the second type of parameters is the parameter representing the file name in fcreate(). The detection device uses the second type of parameters as the input parameters of the second API sequence, and executes the second API sequence.

Exemplarily, the detection device determines the first parameter from the first type of parameters. The detection device uses the identifier of the first parameter as an index to query the mapping relationship between the parameters, and obtains the identifier of the first parameter in the second type of parameters. The detection device determines the first parameter in the second type of parameters according to that identifier. By analogy, the detection device determines the parameters other than the first parameter in the second type of parameters in the same way, and combines the determined parameters to obtain the second type of parameters. The identifier of the first parameter is used to identify the first parameter. For example, the identifier of the first parameter is the name of the first parameter.
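The lookup by parameter identifier can be sketched as follows. This is an illustration under stated assumptions: lpFileName and dwDesiredAccess are parameter names of the Windows CreateFile() API, while the target names "filename" and "mode", the table PARAM_NAME_MAP and the helper convert_parameters() are hypothetical.

```python
# Hypothetical mapping between parameter identifiers (here, parameter names).
PARAM_NAME_MAP = {
    "lpFileName": "filename",
    "dwDesiredAccess": "mode",
}


def convert_parameters(first_params, name_map=PARAM_NAME_MAP):
    """Build the second type of parameters from the first type of parameters."""
    second_params = {}
    for first_name, value in first_params.items():
        # Use the identifier of the parameter as an index into the mapping.
        second_name = name_map[first_name]
        second_params[second_name] = value
    return second_params


second_params = convert_parameters({"lpFileName": "a.txt"})
print(second_params)
```

The sketch only renames parameters. Converting encodings or value ranges, which the text mentions as possible differences, would need an additional per-parameter conversion step.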

In some possible embodiments, when packaging the image of the container, the software provider encapsulates in the image the mapping relationship between the first parameter in the first type of parameters and the first parameter in the second type of parameters. After the detection device generates an instance of the container based on the image, the instance of the container can access the mapping relationship between the first parameter in the first type of parameters and the first parameter in the second type of parameters.

It should be understood that obtaining the function sequences through the dynamic link library, as described above, is an example. In some optional embodiments, the detection device stores the functions that implement the APIs in library files other than the dynamic link library, and obtains the first function sequence and the second function sequence from those library files in the same way. Optionally, the other library file is a static library. Of course, a library file is only one example of how the functions that implement the APIs are stored, and the functions that implement the APIs are not required to be obtained from a library file. For example, the software provider configures the storage address of the functions implementing the APIs in the container instance in advance, and the container instance accesses the preset storage address to obtain the first function sequence and the second function sequence.

Optionally, in some embodiments, the test file requests to perform operations on the first resource set by calling the first API sequence during the running process, and accordingly, the detection device executes the second API sequence to perform operations on the second resource set. The first resource set includes at least one resource. The resources in the first resource set are objects of API operations in the first API sequence. The second resource set includes at least one resource. The resources in the second resource set are objects of API operations in the second API sequence, and the first resource in the second resource set has a mapping relationship with the first resource in the first resource set.

For example, when the test file is running, it calls ReadFile() to request access to the file system of Windows. In response to the call request of the test file, the detection device maps the file system of Windows to directory A of Linux, and calls fread() to access directory A of Linux. In this example, the API in the first API sequence is ReadFile(), and the resource in the first resource set is the Windows file system. The API in the second API sequence is fread(), and the resource in the second resource set is Linux directory A.

In another example, during the running process the test file calls API_1, an API for network communication in Windows, to request network communication through the Windows network system. To respond to the call request of the test file, the detection device maps the Windows network system to the Linux protocol stack, and calls API_2, an API for network communication in Linux, to request network communication through the Linux protocol stack. In this example, the resource in the first resource set is the Windows network system, and the resource in the second resource set is the Linux protocol stack.

By analogy, optionally, the test file requests an input/output (IO) device managed by the Windows system to perform an operation by calling the first API sequence, and the detection device executes the second API sequence to control the IO device managed by the Linux system to perform an operation.
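The file system example above can be sketched as a path mapping. The location of "directory A" (/sandbox/directory_a), the layout with one subdirectory per drive letter, and the helper map_windows_path() are assumptions for illustration; the application only states that the Windows file system is mapped to a Linux directory.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical location of "directory A" on the Linux host.
DIRECTORY_A = PurePosixPath("/sandbox/directory_a")


def map_windows_path(windows_path, root=DIRECTORY_A):
    """Map a Windows path requested by the test file into directory A."""
    win = PureWindowsPath(windows_path)
    drive = win.drive.rstrip(":").lower()  # "C:" becomes "c"
    # Drop the anchor ("C:\\") so that only the directory components remain.
    parts = win.parts[1:] if win.anchor else win.parts
    if drive:
        return root.joinpath(drive, *parts)
    return root.joinpath(*parts)


mapped = map_windows_path(r"C:\Users\test\a.txt")
print(mapped)
```

A real mapping would also need to reject components such as ".." so that the test file cannot escape directory A; the sketch leaves this out.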

Optionally, in some embodiments, by performing instruction conversion, the instructions triggered by the test file are converted into instructions executable by the second operating system. Optionally, the instruction conversion process includes the following steps a to b.

Step a: The detection device obtains the first instruction sequence triggered during the running of the test file.

Optionally, before running the test file, the detection device stores the test file on disk. At this point, the program in the test file exists as binary code. When the detection device receives a detection instruction for the test file, it responds by reading the program in the test file, loading the program into system memory, interpreting the program in system memory as instructions one by one, and executing the instructions to realize the corresponding functions. Here, an instruction is a command that directs the work of the computer, a program is a series of instructions arranged in a certain order, and the working process of the computer is the process of executing instructions.

The test file calls the first API sequence by triggering the first instruction sequence. The first instruction sequence includes at least one instruction, and each instruction in the first instruction sequence is used to instruct to call an API in the first API sequence. Each instruction in the first instruction sequence belongs to a computer instruction set compatible with the operating system simulated by the virtual operating environment. For example, if the virtual operating environment is used to simulate the operating environment of the Windows operating system, each instruction in the first instruction sequence is an X86 instruction in the X86 instruction set. For another example, if the virtual operating environment is used to simulate the operating environment of the Linux operating system, each instruction in the first instruction sequence is an ARM instruction in the ARM instruction set.

The detection device receives each instruction triggered by running the test file in the virtual operating environment, and obtains the first instruction sequence. For example, if the virtual operating environment is generated based on the Docker container technology, the detection device receives each instruction triggered by the test file in the instance of the Docker container to obtain the first instruction sequence.

Step b: The detection device performs a first instruction conversion on the instructions in the first instruction sequence, and obtains the second instruction sequence according to the result of the first instruction conversion.

The first instruction conversion is used to convert the instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device. For example, if the computer instruction set of the detection device is the ARM instruction set, and the instruction set on which the first operating system is based is the X86 instruction set, the first instruction conversion refers to the process of converting X86 instructions into ARM instructions. For another example, if the computer instruction set of the detection device is the X86 instruction set, and the instruction set on which the first operating system is based is the ARM instruction set, the first instruction conversion refers to the process of converting the ARM instructions into X86 instructions.

The second instruction sequence includes at least one instruction. Each instruction in the second instruction sequence is used to instruct to call an API in the second API sequence. Each instruction in the second instruction sequence belongs to the computer instruction set of the detection device. Therefore, each instruction in the second instruction sequence is an instruction that can be recognized and executable by the architecture of the detection device. For example, if the computer instruction set of the detection device is an ARM instruction set, each instruction in the second instruction sequence is an ARM instruction in the ARM instruction set. If the computer instruction set of the detection device is an X86 instruction set, each instruction in the second instruction sequence is an X86 instruction in the X86 instruction set.

After obtaining the second instruction sequence, the detection device executes the second instruction sequence to implement the operation corresponding to the second API sequence. For example, each instruction in the second instruction sequence is an ARM instruction, and the detection device executes each ARM instruction through the ARM CPU. Since a Linux API is implemented by executing ARM instructions, executing each ARM instruction also implements the series of operations corresponding to the Linux APIs.

Optionally, in some embodiments, the detection device runs the instruction conversion process, and performs instruction conversion on the first instruction sequence through the instruction conversion process to obtain the second instruction sequence. Optionally, from the perspective of software architecture, the instruction conversion process is deployed outside the virtual operating environment. For example, an instruction conversion process may be inserted between the instance of the Docker container and the second operating system, so that the instruction conversion process is deployed outside the instance of the Docker container. Or, optionally, the instruction conversion process is deployed within the virtual operating environment. For example, set the instruction conversion process in the instance of the Docker container.

Instruction conversion is also called instruction translation. Optionally, in some embodiments, the detection device splits the first instruction sequence into several micro-instructions, each of which is implemented by a simple piece of code. The detection device then compiles this code into a target file and executes the target file on the detection device. The target file contains the second instruction sequence.

By means of instruction conversion, if the test file is an executable file written for instruction set A and the CPU of the detection device is a CPU of the instruction set B architecture, the detection device converts the instructions triggered by the test file from instructions in instruction set A into instructions in instruction set B. In this way, the CPU of the detection device can execute the instructions triggered by the test file, so that the test file runs normally. This technical means removes the dependence of running the test file on a specific hardware environment, and allows the scheme for detecting malicious files to be applied in a wide range of hardware environments.
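The idea of converting instructions of instruction set A into instructions of instruction set B can be illustrated with a toy, table-driven translator. Both instruction sets below are invented for the illustration and do not correspond to X86 or ARM encodings; the opcode names and the helpers translate_instruction() and translate_sequence() are hypothetical. The sketch mainly shows that one source instruction may expand into several target instructions.

```python
# Toy example only: instructions are tuples (opcode, operand, ...), and both
# "instruction set A" and "instruction set B" are made up for illustration.
def translate_instruction(instr):
    """Translate one instruction of set A into a list of instructions of set B."""
    op, *args = instr
    if op == "A_LOAD_IMM":
        reg, value = args
        return [("B_MOVI", reg, value)]
    if op == "A_INC":
        # Set B has no increment, so use add-immediate with the value 1.
        (reg,) = args
        return [("B_ADDI", reg, reg, 1)]
    if op == "A_PUSH":
        # Set B has no push, so adjust the stack pointer and store explicitly.
        (reg,) = args
        return [("B_SUBI", "sp", "sp", 4), ("B_STORE", reg, "sp")]
    raise ValueError(f"unsupported instruction: {instr!r}")


def translate_sequence(first_sequence):
    """Translate a first instruction sequence into a second instruction sequence."""
    second_sequence = []
    for instr in first_sequence:
        second_sequence.extend(translate_instruction(instr))
    return second_sequence


second_sequence = translate_sequence([("A_LOAD_IMM", "r0", 1), ("A_PUSH", "r0")])
print(second_sequence)
```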

Optionally, in some embodiments, after the detection device executes the second API sequence in the second operating system, the detection device obtains the third instruction sequence, performs a second instruction conversion on each instruction in the third instruction sequence, and obtains the fourth instruction sequence according to the result of the second instruction conversion. The detection device inputs the fourth instruction sequence into the virtual operating environment, so that the test file continues to run based on the fourth instruction sequence. For example, if the virtual operating environment is generated based on the Docker container technology, the detection device inputs the fourth instruction sequence into the instance of the Docker container, and continues to run the test file according to the fourth instruction sequence in the instance of the Docker container.

The second instruction conversion is used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based. For example, if the computer instruction set of the detection device is an ARM instruction set, and the instruction set on which the first operating system is based is an X86 instruction set, the second instruction conversion refers to the process of converting ARM instructions to X86 instructions. For another example, if the computer instruction set of the detection device is an X86 instruction set, and the instruction set on which the first operating system is based is an ARM instruction set, the second instruction conversion refers to the process of converting X86 instructions into ARM instructions.

The third instruction sequence represents the result obtained after executing the second API sequence, and the instructions in the third instruction sequence belong to the computer instruction set of the detection device. Optionally, the instruction set corresponding to the third instruction sequence is the same as the instruction set corresponding to the second instruction sequence. The instruction set corresponding to the fourth instruction sequence is the same as the instruction set corresponding to the first instruction sequence. The instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment.

For example, if the processor of the detection device is an ARM architecture processor, the result obtained after the ARM architecture processor executes an ARM instruction is usually in the form of ARM instructions, whereas the test file in the virtual operating environment needs X86 instructions to determine the result. The instruction conversion process therefore converts the ARM instructions returned by the ARM architecture processor into X86 instructions and inputs the X86 instructions into the virtual operating environment, so that the test file running in the virtual operating environment receives feedback from the processor. Based on the returned X86 instructions, the test file can determine the result of the earlier API call and continue to run according to that result.

Step 406: The detection device judges whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.

Optionally, in some embodiments, the detection device monitors the API provided by the virtual operating environment. After monitoring that the first API sequence is called, the detection device extracts the behavior characteristics of the test file in the process of calling the first API sequence. The detection device determines whether the behavior characteristics meet the preset conditions. If the behavior characteristic meets the preset condition, the detection device determines that the test file is a malicious file. If the behavior characteristics do not meet the preset conditions, the detection device determines that the test file is not a malicious file, that is, a normal file.

Optionally, in some embodiments, after the detection device starts to run the test file in the virtual operating environment, the test file exhibits its subsequent call actions only after the API currently called by the test file in the virtual operating environment has been converted into an API of the second operating system and actually executed in the second operating system. In this way, the detection device obtains the complete first API sequence. Then, the detection device determines whether the test file is a malicious file according to the first API sequence.

Optionally, there are multiple implementation ways for the detection device to monitor the API provided by the virtual operating environment. For example, when a software provider designs a program code for implementing a virtual operating environment based on container technology, it adopts a variety of methods to enable the virtual operating environment to output the behavior characteristics of the test file in the process of calling the first API sequence.

Take the process of monitoring the first API sequence as an example. When a software provider designs the first API set that supports the virtual operating environment, a section of program code for outputting information is embedded in some or all of the APIs in the first API set. When executed, this program code outputs the related information about the call of the API in which it is embedded. The related information includes, but is not limited to, the identifier of the API, the parameters passed in, the time of the call, and so on. Optionally, when designing the first API set, the software provider embeds the above program code for outputting information only in those APIs that are of interest and that are more effective for distinguishing normal behavior from abnormal behavior.

Optionally, the above program code for outputting information writes the related information about the call of the API in which it is embedded to a designated storage space, for example, to a designated log file, so that the detection device can read the data in the designated storage space and obtain the behavior characteristics of the PE file calling the first API sequence.

In other words, an API in the first API set has two features: on the one hand, its identifier is the same as that of an API in the API set of the first operating system; on the other hand, when it is called, it can output the related information about the call.

Take the Windows operating system as the first operating system as an example. The API for writing files in the first API set is WriteFile(), which has the same name as the file writing API of the Windows operating system. When WriteFile() in the first API set is called, it also outputs the related information about the call to the log file. The related information includes the parameters passed in when WriteFile() is called, such as the file name, the file storage location, the content of the string to be written to the file, the offset of the written data relative to the file header, and so on.
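One way to picture an API with embedded output code is a wrapper that records each call before doing the actual work. The sketch below is an assumption-laden illustration: the decorator logged_api(), the in-memory list CALL_LOG (standing in for the designated log file) and the simplified WriteFile() signature are all hypothetical and are not taken from the application or from the real Windows API.

```python
import functools
import time

# Stands in for the designated log file; a real implementation would append
# each record to persistent storage that the detection device can read.
CALL_LOG = []


def logged_api(func):
    """Embed information output into an API: record identifier, parameters, time."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALL_LOG.append({
            "api": func.__name__,
            "args": args,
            "kwargs": kwargs,
            "time": time.time(),
        })
        return func(*args, **kwargs)
    return wrapper


@logged_api
def WriteFile(file_name, data, offset=0):
    """Hypothetical stand-in with the same name as the Windows API."""
    return len(data)  # pretend that all bytes were written


WriteFile("report.txt", b"hello", offset=0)
print(CALL_LOG[-1]["api"], CALL_LOG[-1]["args"])
```

Because the wrapper keeps the original name of the API, the test file sees the identifier it expects, while every call leaves a record for later feature extraction.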

In other embodiments, a log recording function is preset. When a function in the first function sequence is called, the log recording function records the event that the function was called. If the detection device reads the log and finds that each function in the first function sequence is recorded in the log, the detection device determines that the first API sequence has been called. For example, the detection device monitors the API for setting the registry provided by the virtual operating environment, that is, monitors the function that implements RegSetValue(). When receiving the event that the RegSetValue() function is called, the detection device determines that the API for setting the registry is called. By analogy, the detection device uses this monitoring mechanism to monitor each API provided by the virtual operating environment. Or, optionally, the detection device uses this monitoring mechanism to monitor only key APIs among all APIs provided by the virtual operating environment. For example, the detection device monitors the API used to set the registry, the API used to set the file system, the API used to control file permissions, the API used to manipulate processes, the API used to manipulate threads, and the API used to control memory.
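The log-based check described above amounts to testing whether every function of the first function sequence appears in the log. A minimal sketch, assuming that each log entry is a dictionary with a "function" field (a hypothetical format) and using the hypothetical helper first_sequence_called():

```python
def first_sequence_called(log_entries, first_function_sequence):
    """Return True if every function in the sequence is recorded in the log."""
    recorded = {entry["function"] for entry in log_entries}
    return all(name in recorded for name in first_function_sequence)


# Hypothetical log content read by the detection device.
log = [{"function": "RegSetValue"}, {"function": "NtWriteFile"}]
print(first_sequence_called(log, ["RegSetValue"]))
```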

Optionally, there are multiple implementation manners for extracting behavior characteristics according to the process of calling the first API sequence. For example, the detection device obtains the identifier of each API in the first API sequence, obtains the parameter value passed by the test file to each API, and extracts the behavior characteristics according to the identifier of each API and the parameter value passed by each API.

Behavior characteristics are used to represent one or more dynamic behaviors of the test file. Optionally, the form of the behavior feature is a sequence, and the sequence is formed by sorting the features of each dynamic behavior in the order of occurrence time. The dynamic behavior of the file in the virtual operating environment includes, for example, one or more of process operations, file operations, registry operations, port access, release or loading of DLLs, and so on. Process operations include one or more of creating and ending processes; file operations include one or more of creating files, modifying files, reading files, deleting files, etc.; registry operations include creating registry entries and modifying registry One or more of key, query registry key, delete registry key, etc.
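A behavior feature in sequence form can be sketched as the recorded calls sorted by occurrence time. The record layout (fields "api", "params" and "time") and the helper extract_behavior_features() are assumptions for illustration.

```python
def extract_behavior_features(call_records):
    """Sort the recorded calls by time and keep the API identifier and parameters."""
    ordered = sorted(call_records, key=lambda record: record["time"])
    return [(record["api"], record["params"]) for record in ordered]


# Hypothetical call records, deliberately out of order.
records = [
    {"api": "WriteFile", "params": ("a.txt",), "time": 2.0},
    {"api": "CreateFile", "params": ("a.txt",), "time": 1.0},
]
features = extract_behavior_features(records)
print(features)
```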

It should be understood that the called APIs and behavior characteristics described above are only examples. If the test file calls other APIs provided by the virtual runtime environment during running, whether the test file is a malicious file is judged based on the behavior characteristics in the process of calling those other APIs. For example, optionally, during running the test file also calls an API for network communication, an API for sending short messages, an API for operating the address book, an API for displaying pop-up windows, and so on, provided by the virtual runtime environment. The corresponding behavior characteristics are behavior characteristics of network communication, of sending short messages, of operating the address book, of displaying pop-up windows, and so on. The process by which the detection device extracts the behavior characteristics of calling other APIs and makes a judgment is the same as described above, and is not repeated here.

After the detection device obtains the behavior characteristics of the test file, it confirms whether the test file is a malicious file according to the obtained behavior characteristics. For example, the detection device matches the behavior characteristics of the test file against predetermined rules and confirms whether the test file is a malicious file according to the matching result; or the detection device inputs the behavior characteristics of the test file into a classification model trained in advance by a machine learning algorithm and confirms whether the test file is a malicious file according to the output of the classification model; or the behavior characteristics of the test file are analyzed manually to confirm whether the test file is a malicious file.

For example, the detection device obtains the behavior characteristics of the test file in the above manner as {RegDeleteValue (parameter A), RegSetValue (parameter B), SetWindowsHook (parameter C)}. These behavior characteristics indicate that the test file first deletes the startup item of the antivirus software through RegDeleteValue, then adds itself to the startup items through the registry setting function RegSetValue to remain resident in the system, and then sets a global hook through the SetWindowsHook function to intercept user input data and steal sensitive information. Based on this series of behavior characteristics, the detection device determines that the test file is a malicious file.
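Matching against a predetermined rule can be sketched as checking whether the rule's APIs occur in the observed behavior sequence in the same order, possibly with other calls in between. The rule format and the helper contains_in_order() are assumptions; the application does not define how rules are expressed.

```python
def contains_in_order(behavior_apis, rule):
    """Return True if the APIs of the rule occur in the behavior in that order."""
    remaining = iter(behavior_apis)
    # "step in remaining" advances the iterator, so each rule step must be
    # found after the position where the previous step was matched.
    return all(step in remaining for step in rule)


# Rule taken from the example above.
RULE = ["RegDeleteValue", "RegSetValue", "SetWindowsHook"]

observed = ["CreateFile", "RegDeleteValue", "RegSetValue", "ReadFile", "SetWindowsHook"]
print(contains_in_order(observed, RULE))
```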

Optionally, in some embodiments, different weights are set for different dynamic behaviors, and the weight indicates the degree of threat of the corresponding dynamic behavior. For example, the greater the weight, the greater the degree of threat. Optionally, after determining the behavior characteristics, the detection device obtains the weights of the dynamic behaviors corresponding to the behavior characteristics. The detection device determines whether the test file is a malicious file according to the behavior characteristics and the weights.
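One simple way to combine behavior characteristics with weights is to sum the weights of the observed behaviors and compare the sum with a threshold. The weight values, the threshold and the helpers threat_score() and is_malicious() are invented for this sketch; the application does not specify how the weights are combined.

```python
# Hypothetical weights: the greater the weight, the greater the threat.
WEIGHTS = {
    "RegDeleteValue": 3,
    "RegSetValue": 2,
    "SetWindowsHook": 5,
    "ReadFile": 0,
}
THRESHOLD = 8  # hypothetical decision threshold


def threat_score(behavior_apis, weights=WEIGHTS):
    """Sum the weights of the observed behaviors; unknown behaviors count as 0."""
    return sum(weights.get(api, 0) for api in behavior_apis)


def is_malicious(behavior_apis, weights=WEIGHTS, threshold=THRESHOLD):
    """Judge the file as malicious if the total threat reaches the threshold."""
    return threat_score(behavior_apis, weights) >= threshold


print(is_malicious(["RegDeleteValue", "RegSetValue", "SetWindowsHook"]))
```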

As APT attacks continue to evolve, attack techniques such as packing, obfuscation, and encryption are also constantly updated. As a result, traditional detection techniques based on malicious signature matching find it increasingly difficult to cope. From the perspective of the purpose of a malicious file, even if the malicious file is obfuscated or packed, it still triggers the infected machine to perform actions that carry out malicious behaviors such as information theft, infection, or blackmail. Through the above implementations, the APIs called by the test file in the virtual operating environment can be used to abstract the dynamic behavior of the test file. If the dynamic behavior of the test file is found to match the dynamic behavior of a malicious file, the test file is determined to be a malicious file, thus realizing dynamic detection of malicious files.

The embodiment of the present application provides a solution that can realize cross-platform dynamic detection of malicious files, in which a virtual operating environment generated based on container technology simulates the operating environment provided by an operating system compatible with the test file. After the test file calls an API provided by the virtual operating environment, the detection device converts the API called by the test file in the virtual operating environment into an API provided by the operating system of the detection device, and executes the converted API in the operating system of the detection device. Executing the API of the operating system of the detection device achieves the effect of simulating the execution of the API of the first operating system. Therefore, the virtual operating environment provided by the detection device is compatible with the normal operation of the test file, which removes the dependence of the test file on a specific architecture or platform (that is, the requirement that the detection device be based on a specific architecture or platform), thus achieving cross-platform malicious file detection. In addition, since the virtual operating environment is generated based on container technology, the container technology avoids the resource overhead caused by the Hypervisor and the Guest OS, and runs directly on the kernel of the host. Since the size of the image of the container is much smaller than the size of the image of a virtual machine, the detection method of the embodiment of the present application is lighter, consumes less CPU processing resources, and occupies less memory space. The detection method of the embodiment of the present application realizes the operation of malicious files at the process level, and the detection speed is faster.
In addition, the time and performance overhead caused by repeatedly resetting a virtual machine is avoided, as is the overhead caused by operations such as creating and scheduling a traditional virtual machine.

The following uses the embodiment of FIG. 5 to illustrate the malicious file detection method described in FIG. 4 of the embodiment of the present application. In the embodiment shown in FIG. 5, the detection device is a computer device based on the ARM platform, and the test file is a PE file. In other words, the method flow described in FIG. 5 relates to how a detection device based on the ARM platform detects malicious files that use the Windows operating system to perform malicious operations. It should be understood that, for steps in the embodiment of FIG. 5 that are the same as those in the embodiment of FIG. 4, reference may be made to the embodiment of FIG. 4, and the details are not repeated here.

Referring to FIG. 5, FIG. 5 is a flowchart of a malicious file detection method provided by an embodiment of the present application. The method includes the following steps 501 to 505.

Step 501: The detection device obtains a PE file, which is a test file in the embodiment of the present application.

The detection device is a specific case of the detection device in the foregoing embodiment, and the operating system of the detection device is the Linux operating system. The PE file is a specific case of the test file in the foregoing embodiment, the PE file is an executable file running based on the Windows operating system, and the format of the PE file is the PE format.

Step 502: The detection device runs the PE file in the virtual operating environment.

The virtual operating environment is generated based on container technology. The first API set includes multiple APIs, which are the APIs provided by the virtual operating environment that software needs in order to run. The identifier of each API in the first API set is the same as the identifier of a Windows API in the Windows API set. The Windows API set includes multiple Windows APIs, which are the APIs provided by the Windows operating system that software needs in order to run.

Step 503: The detection device obtains a first API sequence called during the running of the PE file, and the first API sequence includes at least one API.

Step 504: The detection device executes a second API sequence in the Linux operating system, the second API sequence includes at least one Linux API, and the first Linux API in the second API sequence has a mapping relationship with the first API in the first API sequence.

In this step, the Linux API is executed to achieve the effect of simulating the execution of the Windows API, thereby simulating the operating environment provided by the Windows system for the test file, and achieving the purpose of operating system simulation. Under the Windows operating system, applications notify the operating system to perform corresponding functions in the form of function calls, and the functions involved in the Windows API call are usually provided by the system's dynamic link library. Correspondingly, optionally, the process of simulating the execution of the Windows API includes the following steps (1) to (3).

Step (1) The detection device obtains the corresponding function from the DLL file according to each API in the first API sequence.

The DLL file is provided by the virtual runtime environment. For example, the dynamic link library of the virtual runtime environment includes the DLL file. In a possible implementation, when packaging the image of the container, the software provider encapsulates the DLL file in the image. After the detection device generates an instance of the container based on the image, the instance of the container provides the DLL file to the PE file running in it, so that the PE file can call the first function sequence in the DLL file.

In some embodiments, optionally, the Windows operating system includes multiple DLL files, and the process of accessing the DLL file includes a process of sequentially accessing the multiple DLL files in a certain order. Among them, optionally, the last DLL file accessed is the ntdll.dll file.

Step (2) The detection device obtains the mapped function from the SO file according to each function in the first function sequence and the mapping relationship between the functions.

The SO file is a file containing a dynamic link library in the Linux operating system. The SO file runs on the ARM platform. The SO file is provided by the virtual operating environment. For example, the dynamic link library of the virtual runtime environment includes SO files. In a possible implementation, in the process of packaging the image of the container, the software provider encapsulates the SO file in the image. After the detection device generates an instance of the container based on the image, if the first function sequence is called, the detection device can access the SO file to obtain the second function sequence.

Step (3) In the kernel of the Linux operating system, an operation is performed according to the second function sequence.

Exemplarily, under the Windows operating system, when the PE file calls the API WriteFile(), it calls the dynamic link library kernel32.dll, then further calls the function NtWriteFile() in the dynamic link library ntdll.dll, and finally the file writing operation is performed in the kernel of the Windows system. In this embodiment, in the process of running the test file under the Linux operating system, when the PE file calls the API WriteFile(), the detection device optionally calls the function fwrite() in the SO file, and the file writing operation is performed in the Linux kernel according to the function fwrite().

By calling the SO file of the Linux operating system, the calling process of the DLL file on the Windows operating system can be simulated. By performing operations based on functions in the Linux kernel, the process of performing operations based on functions in the Windows kernel can be simulated. For example, in the example in the previous paragraph, performing the file writing operation in the kernel of the Linux system according to fwrite() simulates performing the file writing operation in the kernel of the Windows operating system according to NtWriteFile().
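Steps (1) to (3) can be pictured as two table lookups followed by a call into the host system. The following is a minimal sketch under stated assumptions, not the implementation of the embodiment: the tables `DLL_EXPORTS`, `FUNCTION_MAP`, and `SO_FUNCTIONS` and the helper `simulate_api_call` are hypothetical names, and a Python stream write stands in for the function provided by the SO file.

```python
import io

# Hypothetical table for step (1): Windows API name -> function found in the DLL files.
DLL_EXPORTS = {"WriteFile": ("ntdll.dll", "NtWriteFile")}

# Hypothetical table for step (2): DLL function -> mapped function in the SO file.
FUNCTION_MAP = {"NtWriteFile": "fwrite"}


def so_fwrite(stream, data):
    """Stand-in for the SO function; it writes bytes to an open binary stream."""
    return stream.write(data)


# Hypothetical stand-ins for the functions exported by the SO file.
SO_FUNCTIONS = {"fwrite": so_fwrite}


def simulate_api_call(api_name, *args):
    """Resolve a Windows API name to a host function and execute it (step (3))."""
    _dll_name, dll_function = DLL_EXPORTS[api_name]
    so_function = SO_FUNCTIONS[FUNCTION_MAP[dll_function]]
    return so_function(*args)


# Usage: an in-memory stream keeps the example free of side effects.
buffer = io.BytesIO()
written = simulate_api_call("WriteFile", buffer, b"payload")
```

In this sketch the caller only ever names the Windows API; which host function actually runs is decided entirely by the two tables.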

In some embodiments, optionally, by performing instruction conversion, the instructions triggered by the test file are converted into instructions executable by the Linux operating system. Optionally, the instruction conversion process includes the following steps A to B.

Step A. The detection device obtains the first instruction sequence triggered during the running of the test file. Each instruction in the first instruction sequence is an X86 instruction, and each X86 instruction in the first instruction sequence is used to instruct a call to one of the Windows APIs in the first API sequence.

Step B: The detection device converts each X86 instruction in the first instruction sequence into an ARM instruction, and obtains a second instruction sequence according to the converted ARM instruction, and the second instruction sequence includes at least one ARM instruction. Each ARM instruction in the second instruction sequence is used to instruct to call a Linux API in the second API sequence. Each instruction in the second instruction sequence belongs to the ARM instruction set.

After that, the detection device executes each ARM instruction in the second instruction sequence through the ARM CPU to implement the operation corresponding to each Linux API in the second API sequence, thereby simulating the Windows API through the Linux API.

In some embodiments, optionally, after the detection device executes the second API sequence in the Linux operating system, the detection device obtains a third instruction sequence, where the third instruction sequence represents the result obtained after the second API sequence is executed, and the instructions in the third instruction sequence are ARM instructions. The detection device converts each ARM instruction in the third instruction sequence into an X86 instruction, and obtains a fourth instruction sequence according to the converted X86 instructions, where the instructions in the fourth instruction sequence belong to the X86 instruction set. The detection device inputs the fourth instruction sequence into the virtual operating environment.
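The conversion in steps A and B, and the conversion in the opposite direction when results are fed back, can be pictured with a deliberately simplified model in which every instruction is a single call to a named API. Real binary translation between X86 and ARM operates on machine code and is far more involved; the `Instruction` type, `API_MAP`, and `convert` below are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    isa: str       # "x86" or "arm"
    mnemonic: str  # call mnemonic, e.g. "call" on x86 or "bl" on ARM
    target: str    # name of the API that the instruction calls


# Hypothetical mapping between the first and second API sequences.
API_MAP = {"WriteFile": "fwrite"}

# Call mnemonics used for illustration: x86 `call`, ARM `bl` (branch with link).
CALL_MNEMONIC = {"x86": "call", "arm": "bl"}


def convert(sequence, source_isa, target_isa, api_map):
    """Translate each call instruction from source_isa to target_isa."""
    converted = []
    for instruction in sequence:
        if instruction.isa != source_isa:
            raise ValueError(
                f"expected {source_isa} instruction, got {instruction.isa}"
            )
        converted.append(
            Instruction(
                target_isa,
                CALL_MNEMONIC[target_isa],
                api_map[instruction.target],
            )
        )
    return converted


# Steps A and B: X86 instructions that call Windows APIs become ARM instructions
# that call the mapped Linux APIs.
first_sequence = [Instruction("x86", "call", "WriteFile")]
second_sequence = convert(first_sequence, "x86", "arm", API_MAP)

# The same helper applied in the opposite direction (ARM to X86).
REVERSE_API_MAP = {linux: windows for windows, linux in API_MAP.items()}
round_trip = convert(second_sequence, "arm", "x86", REVERSE_API_MAP)
```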

Refer to Figure 3. Based on the above method flow, when an X86 application (APP) is obtained, the X86 APP triggers a call to the DLL file in the virtual operating environment and generates X86 instructions. The X86 instructions are converted to ARM instructions, and the ARM instructions are executed through an ARM-based operating system. The X86 APP is an application developed based on the X86 instruction set, and the X86 APP can be packaged as a PE file.

Step 505: The detection device determines whether the PE file is a malicious file based on the behavior characteristics of the PE file in the process in which the first API sequence is called.

Optionally, the malicious file detection method provided in this embodiment is implemented by software alone, for example, all are implemented in the form of a computer program product. The software can be provided to users by software providers. The software provider can be different from the manufacturer of the detection device. For example, the hardware of the detection device is provided by the manufacturer of the network device separately, and the software for detecting malicious files running on the detection device is provided by the service provider of the Internet application separately. When the software provider designs the program code for implementing the virtual operating environment based on the container technology, it adopts a variety of methods to enable the virtual operating environment to output the behavior characteristics of the test file in the process of calling the first API sequence.

For example, when a software provider designs the first API set to support the virtual operating environment, a piece of program code for outputting information is embedded in some or all of the APIs in the first API set. The function of the program code is to output, when it is executed, the related information about the call to the API in which it is embedded. The related information about the call includes, but is not limited to, the identifier of the API, the parameters passed in, and the time of the call. Optionally, when designing the first API set, the software provider embeds the above-mentioned program code for outputting information only in the APIs that are of interest and that are more effective at distinguishing normal behavior from abnormal behavior.

In other words, the feature of the API in the first API set is that on the one hand, the identifier of the API is the same as that of the Windows API in the Windows API set, and on the other hand, it can output the related information that is called when it is called.

For example, the API for writing files in the first API set is WriteFile(), which has the same name as the file writing API of the Windows operating system. When WriteFile() in the first API set is called, it also outputs the related information about the call to a log file. The related information includes the parameters passed in when WriteFile() is called, such as the file name, the file storage location, the content of the string to be written to the file, and the offset of the written data relative to the file header.
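One way to embed such output code in an API is a wrapper that records each call before forwarding it. The sketch below is an assumption for illustration: the decorator `logged_api`, the in-memory `CALL_LOG` that stands in for the log file, and the simplified `WriteFile` signature are all hypothetical and do not reproduce the Windows API signature.

```python
import functools
import time

CALL_LOG = []  # stand-in for the log file


def logged_api(func):
    """Wrap an API so each call records its identifier, parameters, and call time."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALL_LOG.append(
            {
                "api": func.__name__,
                "args": args,
                "kwargs": kwargs,
                "time": time.time(),
            }
        )
        return func(*args, **kwargs)

    return wrapper


@logged_api
def WriteFile(file_name, data, offset=0):
    """Hypothetical stand-in; a real implementation would forward to the host OS."""
    return len(data)


# Usage: the call behaves as before, and one record is appended to the log.
WriteFile("C:\\Users\\test\\note.txt", b"hello", offset=0)
```

Because the wrapper keeps the wrapped function's name, the test file still sees an API called WriteFile, while the detection side gains one log record per call.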

After the detection device obtains the behavior characteristics of the PE file in the above manner, it confirms whether the PE file is a malicious file according to the obtained behavior characteristics. For example, the detection device matches the behavior characteristics of the PE file against predetermined rules and confirms whether the file is malicious according to the matching result; or it inputs the behavior characteristics of the PE file into a classification model trained in advance by a machine learning algorithm and confirms whether the file is malicious according to the output of the classification model; or the behavior characteristics of the PE file are analyzed manually to confirm whether the file is malicious.

For example, the detection device obtains the behavior characteristics of the PE file in the above manner as {RegDeleteValue (parameter A), RegSetValue (parameter B), SetWindowsHook (parameter C)}. These behavior characteristics indicate that the PE file first deletes the startup item of the antivirus software through RegDeleteValue, then adds itself to the startup items through the registry setting function RegSetValue to remain resident in the system, and then sets a global hook through the SetWindowsHook function to intercept user input data and steal sensitive information. Based on this series of behavior characteristics, the detection device judges that the PE file serving as the test file is a malicious file.
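Matching against a predetermined rule can be as simple as checking that certain API calls occur in a given order. The following sketch assumes that behavior characteristics are available as a list of (API name, parameters) pairs; the rule name `PERSISTENCE_AND_HOOK_RULE` and the helper `matches_rule` are hypothetical.

```python
# Hypothetical rule: the three API calls from the example above, in order.
PERSISTENCE_AND_HOOK_RULE = ("RegDeleteValue", "RegSetValue", "SetWindowsHook")


def matches_rule(behavior, rule):
    """Return True if the API names in `rule` occur in `behavior` in that order.

    Other calls may appear between the matched ones.
    """
    called = iter(name for name, _params in behavior)
    # `wanted in called` advances the shared iterator, so order is enforced.
    return all(wanted in called for wanted in rule)


behavior = [
    ("RegDeleteValue", "parameter A"),
    ("RegSetValue", "parameter B"),
    ("SetWindowsHook", "parameter C"),
]
is_suspicious = matches_rule(behavior, PERSISTENCE_AND_HOOK_RULE)
```

A production rule set would also inspect the parameters, for example which registry key is deleted; this sketch checks call order only.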

It should be understood that the process provided in the embodiment of FIG. 5 is an exemplary illustration of a solution in which a detection device based on a non-X86 platform detects Windows executable files, and is not the only possible implementation of such detection. Optionally, in other embodiments, the Linux operating system of the detection device in the embodiment of FIG. 5 may be replaced with another operating system based on the ARM instruction set architecture. In the case where the Linux operating system in the embodiment of FIG. 5 is replaced with another operating system based on the ARM instruction set architecture, the detection device needs to replace the SO file in the embodiment of FIG. 5 with the dynamic link library of the other operating system. These implementation manners belong to a specific situation of the embodiment in FIG. 4, and should also be covered within the protection scope of the embodiment of the present application.

Currently, there are more and more computer devices based on non-X86 platforms in existing networks, and a large number of test files are still executable files running based on the Windows operating system. When the existing non-X86 platform detection equipment detects the test file under the Windows operating system, the existing technology will have a natural obstacle because the test file cannot be executed.

Through the method provided in the embodiments of this application, the detection device based on the non-X86 platform simulates an operating environment similar to that of the Windows operating system through the virtual operating environment generated based on the container technology, so that the virtual operating environment is compatible with the normal execution of Windows executable files. In this way, even if the test file is an executable file compatible with the Windows operating system, such as a PE file, and the operating system of the detection device is the Linux operating system, the method provided by the embodiments of the present application can be used to dynamically detect the PE file under the Linux operating system. Therefore, the detection device based on the non-X86 platform can dynamically detect test files based on the Windows system, without requiring the detection device to be based on the X86 platform compatible with the Windows operating system. This overcomes the limitation on the use of the detection device and expands the use scenarios of malicious file detection technology.

Hereinafter, the malicious file detection method described in FIG. 4 and FIG. 5 of the embodiment of the present application will be described by using the embodiment of FIG. 6 as an example. In the embodiment shown in FIG. 6, the detection device is a computer device based on the ARM platform. The test file is a Windows EXE file. In other words, the method flow described in FIG. 6 relates to how the detection device based on the ARM platform detects whether the Windows EXE file is a malicious file. The method includes the following steps 601 to 607.

Step 601: The detection device obtains a Windows EXE file as a test file.

Step 602: The detection device obtains multiple Windows APIs called by the Windows EXE file in the virtual running environment.

Step 603: The detection device sequentially accesses the dynamic link library gdi32.dll, user32.dll, or kernel32.dll according to the multiple APIs called in step 602. The dynamic link library gdi32.dll, user32.dll, or kernel32.dll contains the execution functions corresponding to the multiple APIs that are called.

Step 604: According to the function hit in gdi32.dll, user32.dll or kernel32.dll (that is, the execution function corresponding to the multiple APIs called in step 603 above), the detection device accesses the kernel-level dynamic link library ntdll.dll, which contains the basic function corresponding to the hit function.

Step 605: The system simulation process run by the detection device determines the Linux API that is mapped to the basic function in ntdll.dll corresponding to the function hit in step 604.

Step 606: The detection device accesses the Linux library running on the device and obtains the function corresponding to the above-mentioned Linux API.

Step 607: The Unix kernel running on the detection device controls the Unix device to perform operations through the Unix device driver according to the function corresponding to the Linux API in step 606.
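The lookups in steps 603 to 606 can be sketched as an ordered search followed by two table lookups. Everything below is an illustrative assumption: the export tables are hypothetical and nearly empty (real DLLs export many more functions), only the WriteFile entry follows the example given earlier in this description, and `resolve` is a hypothetical helper name.

```python
# Hypothetical export tables, searched in the order given in step 603.
USER_LEVEL_DLLS = [
    ("gdi32.dll", {}),
    ("user32.dll", {}),
    ("kernel32.dll", {"WriteFile": "NtWriteFile"}),
]

# Hypothetical mapping used in step 605: basic function in ntdll.dll -> Linux API.
NTDLL_TO_LINUX_API = {"NtWriteFile": "fwrite"}

# Hypothetical mapping used in step 606: Linux API -> library that provides it.
LINUX_LIBRARIES = {"fwrite": "libc.so.6"}


def resolve(api_name):
    """Follow steps 603 to 606 and return the trace of lookups for one API."""
    for dll_name, exports in USER_LEVEL_DLLS:  # step 603: access DLLs in order
        if api_name in exports:
            ntdll_function = exports[api_name]  # step 604: function in ntdll.dll
            linux_api = NTDLL_TO_LINUX_API[ntdll_function]  # step 605
            library = LINUX_LIBRARIES[linux_api]  # step 606
            return [dll_name, "ntdll.dll:" + ntdll_function, linux_api, library]
    raise KeyError(api_name)


trace = resolve("WriteFile")
```

Returning the whole trace rather than only the final function makes it easy to log which DLL satisfied each call during detection.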

Hereinafter, the malicious file detection method described in FIG. 6 of the embodiment of the present application will be illustrated by using the embodiment of FIG. 7 as an example. In the embodiment shown in FIG. 7, the test file is malware.exe. The word malware is a blend of the two words malicious and software. It is a term for malicious software and refers to software programs that can threaten computers, such as viruses, worms, Trojan horses, and spyware. Malware.exe is a PE file. Under the X86 platform, during the process of running malware.exe, different DLL files are called according to different function calls. All DLL calls eventually reach the ntdll.dll file, which enters the kernel of the Windows system to carry out the function call. In this embodiment, the detection device based on the ARM platform executes the following steps 701 to 705 to run malware.exe and detect the malicious files contained in malware.exe.

Step 701: The detection device starts malware.exe.

For example, a Docker instance includes a system simulation process, where the Docker instance is a parent process and the system simulation process is a child process. The system simulation process is a process used to simulate the operating system of the detection device in the Docker instance. For example, the system simulation process can access the DLL file, obtain the API encapsulated in the DLL file, and perform API conversion.

The Docker instance can start malware.exe through the system simulation process. The Docker instance loads the binary image of malware.exe into the memory space of the detection device through the system simulation process, and starts the binary image. In addition, the system simulation process is used to access the DLL files and SO files required by malware.exe to ensure the normal execution of DLL calls and functions during the running of malware.exe. Among them, the memory space is pre-applied by the Docker instance to the real operating system of the detection device.

Step 702: The detection device calls the function in the DLL file.

The functions in the DLL file can be used to compose the first API sequence. During the running of malware.exe, requests for the registry, files, and system IO of the Windows operating system are generated. These requests are notified to the Windows operating system in the form of function calls. The called functions can be located in the DLL file, and one or more called functions can form an API.

Optionally, the system simulation process can use the resources of the Linux operating system to simulate the resources of the Windows operating system. For example, the file system of Windows is mapped to a certain directory of Linux, so as to simulate the file system of Windows through the Linux directory. Optionally, the Windows network system is implemented through Linux-based protocol stack simulation.
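Mapping the Windows file system onto a Linux directory can be sketched with the standard `pathlib` module. The directory `/var/lib/sandbox/drives` and the helper `to_linux_path` are hypothetical choices for illustration; the embodiment does not specify a layout.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical Linux directory that holds the simulated Windows drives.
SIMULATED_ROOT = PurePosixPath("/var/lib/sandbox/drives")


def to_linux_path(windows_path, root=SIMULATED_ROOT):
    """Map an absolute Windows path with a drive letter to a path under `root`."""
    path = PureWindowsPath(windows_path)
    if len(path.drive) != 2 or not path.root:
        raise ValueError("expected an absolute path with a drive letter")
    if ".." in path.parts:
        # Reject parent references so a test file cannot escape the mapped root.
        raise ValueError("parent directory references are not allowed")
    drive_letter = path.drive.rstrip(":").lower()
    # path.parts[0] is the anchor (drive plus root); the rest are components.
    return root.joinpath(drive_letter, *path.parts[1:])


mapped = to_linux_path("C:\\Users\\test\\note.txt")
```

Pure path classes are used so the mapping can be computed and tested on any host without touching the real file system.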

Step 703: The detection device performs instruction conversion.

The instructions generated by malware.exe running in the Docker instance are still X86 instructions. Optionally, the instruction conversion process converts X86 instructions into ARM instructions.

Step 704: The detection device executes the ARM instruction on the Linux operating system to implement the operation indicated by the ARM instruction.

Optionally, the Linux operating system of the detection device executes the ARM instructions through a CPU based on the ARM architecture. In the process of executing the ARM instructions, the CPU can control other computer hardware (such as peripheral input and output devices) to execute the operations corresponding to the ARM instructions. The computer hardware can return the execution result generated by the operation to the Linux operating system, and the Linux operating system returns the execution result to the instruction conversion process. The instruction conversion process converts the execution result from ARM instructions to X86 instructions, and feeds back the X86 instructions to the Docker instance process to ensure that malware.exe continues to execute.

Step 705: The detection device makes a threat judgment based on the dynamic behavior in the calling process.

For example, the detection device monitors the simulated Windows API, abstracts the behavior characteristics of malware.exe based on the called functions and parameters, and performs malicious behavior determination based on the behavior characteristics of malware.exe, thereby completing the dynamic detection of malware.exe.
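Abstracting behavior characteristics from the monitored calls can be sketched as a mapping from API names to coarser behavior categories. The category names in `API_CATEGORIES` and the helper `abstract_behavior` are hypothetical and serve only to show the shape of such a step.

```python
# Hypothetical grouping of monitored API names into coarse behavior categories.
API_CATEGORIES = {
    "RegDeleteValue": "registry_delete",
    "RegSetValue": "registry_write",
    "SetWindowsHook": "hook_install",
    "WriteFile": "file_write",
}


def abstract_behavior(call_records):
    """Turn raw (api_name, parameters) records into an ordered list of features."""
    features = []
    for api_name, parameters in call_records:
        category = API_CATEGORIES.get(api_name)
        if category is not None:  # APIs outside the table are ignored
            features.append((category, parameters))
    return features


records = [
    ("GetTickCount", ()),
    ("RegSetValue", ("parameter B",)),
    ("WriteFile", ("note.txt", b"hello")),
]
features = abstract_behavior(records)
```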

Through the above process, the detection device based on the ARM platform realizes the dynamic detection of X86 platform files on the ARM platform by means of operating system simulation and instruction conversion. In addition, malicious files can be run at the process level, which avoids traditional operations such as virtual machine creation and scheduling, occupies fewer resources, and runs fast, finally achieving the purpose of cross-platform detection of malicious files.

The following illustrates the malicious file detection method described in FIG. 4 of the embodiment of the present application by using the embodiment of FIG. 8 as an example. In the embodiment shown in FIG. 8, the detection device is a computer device based on an X86 platform. The test file is an ELF file. In other words, the method flow described in FIG. 8 relates to how a detection device based on the X86 platform detects malicious files that use the Linux operating system to perform malicious operations. It should be understood that, for the steps in the embodiment of FIG. 8 that are the same as those in the embodiment of FIG. 4, refer to the embodiment of FIG. 4; they are not repeated in the embodiment of FIG. 8.

Referring to FIG. 8, FIG. 8 is a flowchart of a method for detecting malicious files according to an embodiment of the present application. The method includes the following steps 801 to 805.

Step 801: The detection device obtains an ELF file, which is a test file in the embodiment of the present application.

The detection device is a specific case of the detection device in the foregoing embodiment. Optionally, the operating system of the detection device is a Windows operating system. The ELF file is a specific case of the test file in the above embodiment. ELF files are executable files that run based on the Linux operating system. The format of the ELF file is ELF format.

Step 802: The detection device runs the ELF file in the virtual operating environment.

The virtual operating environment is generated based on container technology and provides a first API set. The first API set includes multiple APIs required for software operation provided by the virtual operating environment. The identifier of each API in the first API set is the same as the identifier of a Linux API in the Linux API set. The Linux API set includes multiple Linux APIs, provided by the Linux operating system, that are required for running the test file.

Step 803: The detection device obtains the first API sequence called during the running of the ELF file, the first API sequence includes at least one API, and the APIs in the first API sequence are APIs in the first API set.

Step 804: The detection device executes a second API sequence in the Windows operating system, the second API sequence includes at least one Windows API, and the first Windows API in the second API sequence has a mapping relationship with the first API in the first API sequence.

In this step, the Windows API is executed to achieve the effect of simulating the execution of the Linux API, thereby simulating the operating environment provided by the Linux system for the test file, and achieving the purpose of operating system simulation. Optionally, the process of simulating the execution of the Linux API includes the following steps 8041 to 8043.

Step 8041. The detection device obtains the corresponding function from the SO file according to each API in the first API sequence.

The SO file is provided by the virtual operating environment. For example, the dynamic link library of the virtual runtime environment includes SO files. In a possible implementation, in the process of packaging the image of the container, the software provider encapsulates the SO file in the image. The detection device generates an instance of the container based on the image, and the instance of the container provides the SO file to the ELF file running in it, so that the ELF file can call the first function sequence in the SO file.

Step 8042. The detection device obtains the mapped function from the DLL file according to each function in the first function sequence and the mapping relationship between the functions.

The DLL file is provided by the virtual runtime environment. For example, the dynamic link library of the virtual runtime environment includes DLL files. In a possible implementation, in the process of packaging the image of the container, the software provider encapsulates the DLL file in the image. The detection device generates an instance of the container based on the image, and the instance of the container provides the DLL file, so that after the first function sequence is called by the ELF file running in the instance, the detection device can access the DLL file to obtain the second function sequence.

Step 8043: In the kernel of the Windows operating system, perform an operation according to the second function sequence.

Exemplarily, under the Linux operating system, when the ELF file calls the API fwrite(), the function fwrite() in the SO file is called to perform the file writing operation in the kernel of the Linux system. In this embodiment, in the process of running the ELF file under the Windows operating system, when the ELF file calls the API fwrite(), the detection device calls kernel32.dll, then further calls the function NtWriteFile() in ntdll.dll, and finally the file writing operation is performed in the kernel of the Windows system.

Through the DLL file of the Windows operating system, the calling process of the SO file on the Linux operating system can be simulated, and the process of performing operations according to the function in the Linux kernel can be simulated by performing operations according to functions in the Windows kernel.

In some embodiments, optionally, by performing instruction conversion, the instructions triggered by the test file are converted into instructions executable by the Windows operating system. Optionally, the instruction conversion process includes the following steps one to two.

Step 1. The detection device obtains the first instruction sequence triggered during the running of the test file. Each instruction in the first instruction sequence is an ARM instruction, and each ARM instruction in the first instruction sequence is used to instruct a call to one of the Linux APIs in the first API sequence.

Step 2: The detection device converts each ARM instruction in the first instruction sequence into an X86 instruction, and obtains a second instruction sequence according to the converted X86 instruction, and the second instruction sequence includes at least one X86 instruction. Each X86 instruction in the second instruction sequence is used to instruct to call a Windows API in the second API sequence. Each instruction in the second instruction sequence belongs to the X86 instruction set.

After that, the detection device executes each X86 instruction in the second instruction sequence through the X86 CPU to implement the operation corresponding to each Windows API in the second API sequence, thereby simulating the Linux API through the Windows API.

In some embodiments, optionally, after the detection device executes the second API sequence in the Windows operating system, the detection device obtains a third instruction sequence, where the third instruction sequence represents the result obtained after the second API sequence is executed, and the instructions in the third instruction sequence are X86 instructions. The detection device converts each X86 instruction in the third instruction sequence into an ARM instruction, and obtains a fourth instruction sequence according to the converted ARM instructions, where the instructions in the fourth instruction sequence belong to the ARM instruction set. The detection device inputs the fourth instruction sequence into the virtual operating environment.

For example, based on the above method flow, when the ARM APP is obtained, the ARM APP will trigger the call to the SO file in the virtual operating environment, generate ARM instructions, convert the ARM instructions into X86 instructions, and execute the X86 instructions through the X86-based operating system. Among them, ARM APP is an application developed based on the ARM instruction set. Optionally, the ARM APP is packaged as an ELF file.

Step 805: The detection device judges whether the ELF file is a malicious file based on the behavior characteristics of the ELF file during the calling process of the first API sequence.

Optionally, when the software provider designs the program code for realizing the virtual operating environment based on the container technology, it adopts multiple methods to enable the virtual operating environment to output the behavior characteristics of the test file in the process of calling the first API sequence.

In other words, the feature of the APIs in the first API set is that on the one hand, the identification of the API is the same as the identification of the Linux API in the Linux API set, and on the other hand, it can output the related information that is called when it is called.

For example, the API for writing files in the first API set is fwrite(), which has the same name as the file writing API of the Linux operating system. When fwrite() in the first API set is called, it also outputs the related information about the call to a log file. The related information includes the parameters passed in when fwrite() is called, such as the file name, the file storage location, the content of the string to be written to the file, and the offset of the written data relative to the file header.

After the detection device obtains the behavior characteristics of the ELF file in the above manner, it confirms whether the ELF file is a malicious file according to the obtained behavior characteristics. For example, the detection device matches the behavior characteristics of the ELF file against predetermined rules and confirms whether the file is malicious according to the matching result; or it inputs the behavior characteristics of the ELF file into a classification model trained in advance by a machine learning algorithm and confirms whether the file is malicious according to the output of the classification model; or the behavior characteristics of the ELF file are analyzed manually to confirm whether the file is malicious.

For example, the detection device obtains the behavior characteristics of the ELF file in the above manner as {RegDeleteValue (parameter A), RegSetValue (parameter B), SetLinuxHook (parameter C)}. These behavior characteristics indicate that the ELF file first deletes the startup item of the antivirus software through RegDeleteValue, then adds itself to the startup items through the registry setting function RegSetValue to remain resident in the system, and then sets a global hook through the SetLinuxHook function to intercept user input data and steal sensitive information. Based on this series of behavior characteristics, the detection device judges that the ELF file serving as the test file is a malicious file.

For another example, the ELF file calls the file-opening API during the running process and passes the parameter value representing the Linux kernel symbol table to that API, so that the behavior characteristics include fopen, proc/kallsyms, and r. Here, fopen means to open a file, proc/kallsyms denotes the Linux kernel symbol table, and r denotes that the file is opened for reading. Since malicious files usually obtain ROOT permissions by opening the Linux kernel symbol table, when the behavior characteristics include fopen, proc/kallsyms, and r, the ELF file is judged to be a malicious file.
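A parameter-based rule of this kind can be sketched as follows. The sketch assumes that each logged call is available as an (API name, parameter tuple) pair; the helper name `opens_kernel_symbol_table` is hypothetical.

```python
def opens_kernel_symbol_table(call_records):
    """Return True if any fopen call targets proc/kallsyms in a read mode."""
    for api_name, parameters in call_records:
        if api_name != "fopen" or len(parameters) < 2:
            continue
        path, mode = parameters[0], parameters[1]
        # Accept the path with or without the leading slash.
        if path.lstrip("/") == "proc/kallsyms" and mode.startswith("r"):
            return True
    return False


records = [
    ("fopen", ("/tmp/output.txt", "w")),
    ("fopen", ("/proc/kallsyms", "r")),
]
flagged = opens_kernel_symbol_table(records)
```

Unlike a rule on call order alone, this check inspects the arguments, so an ordinary fopen of another file does not trigger it.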

It should be understood that the process provided in the embodiment of FIG. 8 is an exemplary description of a solution for detecting a Linux executable file by a detection device based on the X86 platform, and is not the only required implementation method for a detection device based on the X86 platform to detect a Linux executable file. In other embodiments, optionally, the Windows operating system in the embodiment of FIG. 8 is replaced with another operating system based on the X86 instruction set architecture. When the Windows operating system in the embodiment of FIG. 8 is replaced with another operating system based on the X86 instruction set architecture, the detection device needs to replace the DLL file in the embodiment of FIG. 8 with the dynamic link library of the other operating system. These implementation manners belong to a specific situation of the embodiment in FIG. 4, and should also be covered within the protection scope of the embodiment of the present application.

At present, there are more and more computer devices based on the X86 platform in the existing network, and a large number of test files are still executable files running based on the Linux operating system. When the existing X86 platform detection device detects the test file under the Linux operating system, the existing technology will have a natural obstacle due to the inability to execute the test file.

Through the method provided by the embodiments of this application, the detection device based on the X86 platform can simulate an operating environment similar to that provided by the Linux operating system through the virtual operating environment generated based on container technology, so that the virtual operating environment is compatible with the normal execution of Linux executable files. In this way, even if the test file is an executable file compatible with the Linux operating system, such as an ELF file, and the operating system of the detection device is a Windows operating system, the ELF file can be dynamically detected under the Windows operating system using the method provided by the embodiment of the present application. This removes the dependence of test file detection on the Linux operating system: the detection device based on the X86 platform can dynamically detect test files based on the Linux system, without requiring the detection device to be based on the ARM platform compatible with the Linux operating system. This overcomes the limitation on the use of the detection device and expands the use scenarios of malicious file detection technology.

In some possible embodiments, optionally, the product form of the foregoing malicious file detection method is a container application that provides the function of detecting malicious files. For example, the software provider is a cloud computing service provider that provides container applications for enterprise networks. A cloud server deploys the container application in the enterprise network, and a detection device in the enterprise network runs the container application to implement the method provided in the foregoing embodiments. In addition, optionally, container clusters are deployed in the enterprise network through the Cloud Container Engine (CCE), and life cycle management of container applications, such as deployment, management, scaling, upgrade, uninstallation, service discovery, and load balancing, is performed in the cloud. Users of the enterprise network can use CCE to conveniently manage the container applications deployed in the enterprise network according to their malicious file detection needs.

Hereinafter, the malicious file detection method described in FIG. 4 of the embodiment of the present application will be illustrated by using the embodiment of FIG. 9 as an example. In the embodiment shown in FIG. 9, the form of the container is a container application, and the cloud computing service provider obtains the image of the container application by operating the container service management entity. Referring to FIG. 9, the method includes the following steps 901 to 907.

Step 901: The container service management entity creates an image of the container application.

In some possible embodiments, optionally, the container service management entity is a container as a service (container as a service, CaaS) manager. CaaS is a platform as a service (Platform as a Service, PaaS) for providing container services. CaaS is located at the bottom of the PaaS layer and integrates the service capabilities of the PaaS layer and the IaaS layer. For example, optionally, CaaS includes container applications at the PaaS layer and container resources at the IaaS layer. The CaaS manager is an entity used to manage container services in CaaS, and the container service management entity is used to manage and orchestrate CaaS. Of course, the name CaaS Manager is just an example, and other names can also be used to refer to the entity used to manage container services in CaaS.

In some embodiments, optionally, the container service management entity encapsulates the resources of the virtual operating environment based on Docker technology to obtain a Docker image. For example, various registry entries, DLL calls, services, and the like are packaged to obtain the Docker image.

Step 902: The container service management entity sends the image of the container application to the detection device.

Optionally, the detection device receives the image, creates a container application based on the image, and runs the container instance.

In some possible embodiments, optionally, the Docker image library (also called the Docker registry) in Docker technology is used to send the image to the detection device. For example, the container service management entity sends a Docker image to the Docker image library, and the detection device downloads the Docker image from the Docker image library, thereby obtaining the image sent by the container service management entity. Specifically, the container service management entity sends the Docker image to the Docker image library through an image sending instruction (for example, docker push); the detection device sends an image download instruction (for example, the docker pull command) to the Docker image library; and in response to the image download instruction, the Docker image library sends the Docker image to the detection device, thereby deploying the Docker image on the detection device.

The Docker image library is a node device used to store and distribute Docker images in the cluster, and it stores a large number of Docker images. Optionally, the Docker image library is implemented based on the Docker registry protocol or the Docker hub protocol. Optionally, a Docker image is stored in the Docker image library in the form of multiple image layers and one piece of image description information.
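The storage form described above (multiple image layers plus one piece of image description information) can be illustrated with a minimal in-memory sketch. The class `ToyImageLibrary`, its `push` and `pull` methods, and the JSON description layout are hypothetical names chosen for illustration; they are not the Docker registry protocol and are not taken from this application.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class ToyImageLibrary:
    """In-memory stand-in for an image library (NOT the real Docker registry protocol)."""
    layers: dict = field(default_factory=dict)        # digest -> layer bytes
    descriptions: dict = field(default_factory=dict)  # image name -> description (JSON text)

    def push(self, name, layer_blobs):
        """Store each layer under its SHA-256 digest, plus one description per image."""
        digests = []
        for blob in layer_blobs:
            digest = "sha256:" + hashlib.sha256(blob).hexdigest()
            self.layers[digest] = blob  # identical layers are stored only once
            digests.append(digest)
        self.descriptions[name] = json.dumps({"name": name, "layers": digests})

    def pull(self, name):
        """Return the layer bytes of an image in the order listed in its description."""
        description = json.loads(self.descriptions[name])
        return [self.layers[d] for d in description["layers"]]
```

In this sketch, two images that share a base layer occupy only three stored layers in total, which is one reason a layered storage form is convenient for distributing many similar images.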

Step 903: The detection device obtains the test file.

Step 904: The detection device runs the test file in the virtual operating environment.

Step 905: The detection device obtains the first API sequence called during the running of the test file.

Step 906: The detection device executes the second API sequence in the second operating system.

Step 907: The detection device determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.
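Steps 905 to 907 can be sketched as follows, as a simplified illustration only. The conversion table `API_MAP`, the pattern list `SUSPICIOUS_PATTERNS`, and the function names are assumptions made for this example; the application does not specify a concrete mapping or judgment rule, and a real detection device would execute the converted APIs rather than only list their names.

```python
# Hypothetical conversion table: API name in the first operating system
# -> API name in the second operating system (illustrative pairs only).
API_MAP = {
    "open": "CreateFileW",
    "read": "ReadFile",
    "write": "WriteFile",
    "unlink": "DeleteFileW",
}

# Hypothetical behavior rule: call runs treated as suspicious in this toy example.
SUSPICIOUS_PATTERNS = [("open", "write", "unlink")]


def convert_sequence(first_api_sequence):
    """Map each API in the first API sequence to its counterpart (second API sequence)."""
    return [API_MAP[name] for name in first_api_sequence]


def contains_pattern(sequence, pattern):
    """Return True if `pattern` occurs as a contiguous run inside `sequence`."""
    n = len(pattern)
    return any(tuple(sequence[i:i + n]) == pattern
               for i in range(len(sequence) - n + 1))


def is_malicious(first_api_sequence):
    """Judge the test file from the behavior recorded while the first API sequence is called."""
    return any(contains_pattern(first_api_sequence, p) for p in SUSPICIOUS_PATTERNS)
```

The judgment is made on the first API sequence, matching step 907, while the converted sequence is what the second operating system would actually run in step 906.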

Referring to FIG. 3, the process of implementing operating system simulation and the process of behavior monitoring described in the embodiment of FIG. 9 are executed in a container application. In some embodiments, optionally, each container application starts one executable file and detects that file. For example, as shown in FIG. 3, three container instances are created, namely container Docker_1, container Docker_2, and container Docker_n. APP1 is started through the container Docker_1 and detected in the container Docker_1. APP2 is started through the container Docker_2 and detected in the container Docker_2. APPn is started through the container Docker_n and detected in the container Docker_n, so that different test files are detected in parallel through different containers.
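The one-container-per-test-file arrangement can be sketched with a thread pool, under stated assumptions. The function `detect_in_container` is a hypothetical placeholder that only returns a record; a real system would start the file inside the named container and monitor its behavior. The container naming follows the Docker_1, Docker_2 pattern of the example above.

```python
from concurrent.futures import ThreadPoolExecutor


def detect_in_container(container_name, test_file):
    """Hypothetical placeholder: records which container handled which test file."""
    return {"container": container_name, "file": test_file, "checked": True}


def detect_in_parallel(test_files):
    """Assign one container per test file and run the detections concurrently."""
    jobs = [(f"Docker_{i}", f) for i, f in enumerate(test_files, start=1)]
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        # pool.map returns results in the same order as the submitted jobs
        return list(pool.map(lambda job: detect_in_container(*job), jobs))
```

Because each job touches only its own container, the detections do not need to coordinate with each other, which is what makes parallel execution straightforward.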

The following describes the malicious file detection method described in FIG. 4 of the embodiment of the present application by using the embodiment of FIG. 10 as an example. In the embodiment shown in FIG. 10, the method for detecting malicious files is applied in the field of network security. The detection device is specifically a network security device such as a firewall, a router, a security gateway, or an intrusion detection device. The network security device ensures the security of the network by detecting malicious files spreading on the network. It should be understood that, for the steps in the embodiment of FIG. 10 that are the same as those in the embodiment of FIG. 4, reference is made to the embodiment of FIG. 4, and they are not repeated here.

Refer to FIG. 10, which is a flowchart of a network security protection method provided by an embodiment of the present application. The method includes the following steps 1001 to 1007.

Step 1001: The network security device obtains the data stream transmitted in the network.

In this field, a data stream (or message stream) refers to a series of messages from a source host to a destination, where the destination can be another host, a multicast group containing multiple hosts, or a broadcast domain.

Optionally, the network security device is an IDS-type device and obtains the data stream in bypass mode. That is, the network security device does not block the transmission of packets in the network, but uses port mirroring to copy the packets flowing through the mirrored port, and parses the mirrored packets to obtain the test file. Optionally, the network security device is an IPS-type device and checks each passing packet in real time in in-line mode, so that when the test file in a packet is a malicious file, the transmission of the packet in the network is blocked.

Step 1002: The network security device obtains a test file from the data stream. Optionally, after obtaining the payload data of each message in the data stream, the network security device reassembles the payload data of all the messages in the data stream according to the sequence numbers of the messages, thereby obtaining the test file.
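The reassembly in step 1002 can be sketched as sorting by sequence number and concatenating the payloads. This is a minimal illustration that assumes each message is represented as a hypothetical `(sequence_number, payload)` pair and that no message is missing or duplicated; handling of loss, retransmission, and overlapping segments is omitted.

```python
def reassemble(messages):
    """Rebuild a file from (sequence_number, payload) pairs that may arrive out of order."""
    ordered = sorted(messages, key=lambda m: m[0])
    return b"".join(payload for _, payload in ordered)
```

For example, `reassemble([(2, b"cd"), (1, b"ab"), (3, b"ef")])` yields `b"abcdef"`.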

Step 1003: The network security device runs the test file in the virtual operating environment.

Step 1004: The network security device obtains the first API sequence called during the running of the test file.

Step 1005: The network security device executes the second API sequence in the second operating system.

Step 1006: The network security device judges whether the test file is a malicious file based on the behavioral characteristics of the test file in the process in which the first API sequence is called.

Step 1007: The network security device performs intrusion prevention on the network according to the detection result.

For example, the network security device is an IDS device, and if the network security device determines that the test file is a malicious file, the network security device sends an alarm message. In another example, the network security device is an IPS device. If the network security device determines that the test file is a malicious file, the network security device discards the message, thereby blocking the transmission of the message and sending an alarm message.
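The difference between the two responses in step 1007 can be sketched as below. The `Mode` enumeration, the `respond` function, and the action strings are hypothetical names used only for illustration; they summarize the IDS and IPS examples above and do not describe an interface defined by this application.

```python
from enum import Enum


class Mode(Enum):
    IDS = "bypass"   # works on mirrored copies, so it cannot block traffic
    IPS = "in-line"  # sits on the forwarding path, so it can drop messages


def respond(mode, is_malicious):
    """Return the actions taken for one message that carries a test file."""
    if not is_malicious:
        # An in-line device forwards benign traffic; a bypass device takes no action.
        return ["forward"] if mode is Mode.IPS else []
    if mode is Mode.IPS:
        return ["drop", "alarm"]
    return ["alarm"]
```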

With the method provided in the embodiments of this application, the network security device detects the test file carried in a message by performing cross-platform dynamic detection of malicious files, and performs intrusion prevention based on the detection result, so that messages carrying malicious files are detected in time and the security of the network is improved. In particular, the network security device removes the dependence of the detection process on the operating system by means of operating system simulation. When packets transmitted on the network carry malicious files that use the Windows operating system to perform malicious operations, the virtual operating environment is used to simulate the Windows operating system, and the malicious files are run in the virtual operating environment so that such malicious packets are detected. When packets transmitted on the network carry malicious files that use the Linux operating system to perform malicious operations, the virtual operating environment is used to simulate the Linux operating system, and the malicious files are run in the virtual operating environment so that such malicious packets are detected. Therefore, the network security device can dynamically detect both Windows-based and Linux-based test files, without being required to be based on an X86 platform compatible with the Windows system or an ARM platform compatible with the Linux system. This overcomes the usage limitations of the network security device, greatly expands the application scenarios of the network intrusion prevention method, and improves the security of the network system.

Optionally, the network security device provided in the embodiment of FIG. 10 is applied in an enterprise network, and is deployed at the gateway device and the cloud platform entrance of the enterprise network. By executing the method provided by the embodiments of this application, the network security device can dynamically detect malicious files, providing a solution for the network security of the enterprise network.

Figure 11 shows examples of several possible deployment scenarios for network security devices. Illustratively, referring to FIG. 11, the enterprise network includes a headquarters local area network and several local area networks of branch offices. The headquarters LAN includes the data center 1102, the core office area, office area A, and office area B's respective LANs. The respective local area networks of the data center 1102, the core office area, the office area A, and the office area B are connected to the firewall 1105 through a switch. The firewall 1105 is further connected to the wide area network or the Internet through a router 1101, a NAT device (not shown in the figure), a gateway device (not shown in the figure), and so on. The firewall 1105 is used to isolate the headquarters local area network from the wide area network or the Internet, and to protect the data exchanged between the headquarters local area network and the wide area network or the Internet. Optionally, the headquarters local area network is connected to the local area network 1104 of each branch through a VPN, and the branch offices are branch A, branch B, and branch C as shown in FIG. 11.

Optionally, the network security device provided in FIG. 10 is deployed in the enterprise network shown in FIG. 11. For example, referring to FIG. 11, the network security device is a first network security device, a second network security device, a third network security device, or a fourth network security device.

The first network security device is deployed at the network exit of the headquarters LAN, that is, between the firewall 1105 and the router 1101. For example, the first network security device is integrated in an exit firewall, an exit router, or a bypass firewall. The first network security device is used to prevent malicious test files from the Internet and malicious web traffic.

The second network security device is deployed on the border of the data center 1102 of the headquarters LAN. For example, it is an independent device deployed in-line between the data center 1102 and the firewall 1105 to protect the core assets of the servers and to discover hidden attacks, malicious scanning, penetration, and the like on the internal network.

The third network security device is deployed on the border of the core department 1103 of the headquarters LAN. For example, it is an independent device deployed in-line between the switch in the core office area and the firewall 1105, to prevent suspicious test files from spreading on the intranet and laterally infecting the core department.

The fourth network security device is deployed on the boundary of the branch LAN 1104. For example, it is an independent device deployed in-line between the branch LAN and the WAN routing device, to prevent malicious test files and unknown threats from spreading freely between the branch LAN and the headquarters LAN.

The following uses the embodiment of FIG. 12 to illustrate the malicious file detection method described in FIG. 10 of the embodiment of the present application. In the embodiment shown in FIG. 12, the network security device is the first network security device in FIG. 11. The test file is the file carried in the packets entering or leaving the network exit of the headquarters LAN. In other words, the method flow described in FIG. 12 relates to how the network security device deployed at the network exit of the headquarters LAN protects the network security of the headquarters LAN. It should be understood that, for the steps in the embodiment of FIG. 12 that are the same as those in the embodiment of FIG. 10, reference is made to the embodiment of FIG. 10, and they are not repeated here.

Referring to FIG. 12, FIG. 12 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 12, the method may include the following steps 1201 to 1206.

Step 1201. The first network security device collects a message that enters or exits from the network exit of the headquarters LAN, and obtains a test file carried in the message, and the test file is an executable file of the first operating system.

Step 1202: The first network security device runs the test file in the virtual operating environment.

Step 1203: The first network security device obtains the first API sequence called during the running of the test file.

Step 1204: The first network security device executes the second API sequence in the second operating system.

Step 1205: The first network security device judges whether the test file is a malicious file based on the behavioral characteristics of the test file in the process in which the first API sequence is called.

For specific details of steps 1202, 1203, 1204, and 1205 in the embodiment of FIG. 12, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.

Step 1206: If the test file is a malicious file, the first network security device reports that malicious traffic is detected at the network exit of the headquarters LAN.

With the method provided by the embodiments of the present application, a network security device is deployed at the network exit of the headquarters LAN. The network security device collects messages entering and leaving the network exit, performs cross-platform dynamic detection of malicious files on the test files carried in the messages, and performs intrusion prevention based on the detection results. With this method, if a message transmitted from the Internet to the headquarters LAN carries a malicious file that uses the Windows operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. If a message transmitted from the Internet to the headquarters LAN carries a malicious file that uses the Linux operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. It can be seen that this method effectively defends the headquarters LAN against malicious traffic from the Internet and improves the network security of the headquarters LAN.

Hereinafter, the malicious file detection method described in FIG. 10 of the embodiment of the present application will be illustrated by using the embodiment of FIG. 13. In the embodiment shown in FIG. 13, the network security device is the second network security device in FIG. 11. The test file is the file carried in the packets entering or leaving the boundary of the data center. In other words, the method flow described in FIG. 13 relates to how the network security device deployed at the boundary of the data center protects the network security of the data center. It should be understood that, for the steps in the embodiment of FIG. 13 that are the same as those in the embodiment of FIG. 10, reference is made to the embodiment of FIG. 10, and they are not repeated here.

Referring to FIG. 13, FIG. 13 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 13, the method may include the following steps 1301 to 1306.

Step 1301: The second network security device collects a message entering or exiting from the boundary of the data center, and obtains a test file carried by the message, and the test file is an executable file of the first operating system.

Step 1302: The second network security device runs the test file in the virtual operating environment.

Step 1303: The second network security device obtains the first API sequence called during the running of the test file.

Step 1304: The second network security device executes the second API sequence in the second operating system.

Step 1305: The second network security device judges whether the test file is a malicious file based on the behavior characteristics of the test file during the calling process of the first API sequence.

For specific details of steps 1302, 1303, 1304, and 1305 in the embodiment of FIG. 13, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.

Step 1306: If the test file is a malicious file, the second network security device reports that malicious traffic is detected at the boundary of the data center.

With the method provided by the embodiments of the present application, a network security device is deployed at the border of the data center. The network security device collects messages entering and leaving the boundary of the data center, performs cross-platform dynamic detection of malicious files on the test files carried in the messages, and performs intrusion prevention based on the detection results. With this method, if a message transmitted inside the data center carries a malicious file that uses the Windows operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. If a message transmitted from the Internet to the data center carries a malicious file that uses the Linux operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. It can be seen that this method helps to find malicious files spreading in the intranet of the data center, helps protect the core assets of the servers, and helps discover potential attacks, malicious scans, and penetration in the intranet.

The following uses the embodiment of FIG. 14 to illustrate the malicious file detection method described in FIG. 10 of the embodiment of the present application. In the embodiment shown in FIG. 14, the network security device is the third network security device in FIG. 11. The test file is the file carried in the messages transmitted internally by the core department. In other words, the method flow described in FIG. 14 relates to how the network security device deployed at the boundary of the core department protects the network security of the core department. It should be understood that, for the steps in the embodiment of FIG. 14 that are the same as those in the embodiment of FIG. 10, reference is made to the embodiment of FIG. 10, and they are not repeated here.

Referring to FIG. 14, FIG. 14 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 14, the method may include the following steps 1401 to 1406.

Step 1401. The third network security device collects the message transmitted internally by the core department to obtain a test file carried by the message, and the test file is an executable file of the first operating system.

Step 1402: The third network security device runs the test file in the virtual operating environment.

Step 1403: The third network security device obtains the first API sequence called during the running of the test file.

Step 1404: The third network security device executes the second API sequence in the second operating system.

Step 1405: The third network security device judges whether the test file is a malicious file based on the behavior characteristics of the test file during the calling process of the first API sequence.

For specific details of steps 1402, 1403, 1404, and 1405 in the embodiment of FIG. 14, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.

Step 1406: If the test file is a malicious file, the third network security device reports that malicious traffic is detected in the core department.

With the method provided by the embodiments of this application, a network security device is deployed at the border of the core department. The network security device collects messages entering and leaving the core department, performs cross-platform dynamic detection of malicious files on the test files carried in the messages, and performs intrusion prevention based on the detection results. With this method, if a message transmitted internally by the core department carries a malicious file that uses the Windows operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. If a message transmitted from the Internet to the core department carries a malicious file that uses the Linux operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. It can be seen that this method helps to discover malicious files spreading on the intranet of the core department, and helps prevent suspicious test files from spreading on the intranet and laterally infecting the core department.

The following uses the embodiment of FIG. 15 to illustrate the malicious file detection method described in FIG. 10 of the embodiment of the present application. In the embodiment shown in FIG. 15, the network security device is the fourth network security device in FIG. 11. The test file is the file carried in the packets entering or leaving the boundary of the branch LAN. In other words, the method flow described in FIG. 15 relates to how the network security device deployed at the boundary of the branch LAN protects the network security of the branch LAN. It should be understood that, for the steps in the embodiment of FIG. 15 that are the same as those in the embodiment of FIG. 10, reference is made to the embodiment of FIG. 10, and they are not repeated here.

Referring to FIG. 15, FIG. 15 is a flowchart of a network security protection method provided by an embodiment of the present application. As shown in FIG. 15, the method may include the following steps 1501 to 1506.

Step 1501: The fourth network security device collects a message that enters or exits from the boundary of the branch LAN, and obtains a test file carried by the message.

For example, optionally, the message includes at least one of the following: a message transmitted within the branch LAN, a message transmitted between the corporate headquarters and the branch LAN, a message flowing from the external network to the branch LAN, or a message flowing from the branch LAN to the external network.

Step 1502: The fourth network security device runs the test file in the virtual operating environment.

Step 1503: The fourth network security device obtains the first API sequence called during the running of the test file.

Step 1504: The fourth network security device executes the second API sequence in the second operating system.

Step 1505: The fourth network security device determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.

For specific details of steps 1502, 1503, 1504, and 1505 in the embodiment of FIG. 15, refer to steps 403, 404, 405, and 406 in the embodiment of FIG. 4, respectively.

Step 1506: If the test file is a malicious file, the fourth network security device reports that malicious traffic is detected at the boundary of the branch LAN.

With the method provided in the embodiments of the present application, a network security device is deployed at the boundary of the branch LAN. The network security device collects messages entering and leaving the boundary, performs cross-platform dynamic detection of malicious files on the test files carried in the messages, and performs intrusion prevention based on the detection results. With this method, if a message transmitted from the headquarters LAN to the branch LAN carries a malicious file that uses the Windows operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Windows operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. If a message transmitted from the headquarters LAN to the branch LAN carries a malicious file that uses the Linux operating system to perform malicious operations, the network security device uses the virtual operating environment to simulate the Linux operating system and runs the malicious file in the virtual operating environment, thereby detecting the malicious message. It can be seen that this method effectively defends the branch LAN against malicious traffic from the headquarters LAN, prevents malicious test files and unknown threats from spreading between the branch LAN and the headquarters LAN, and improves the network security of the branch LAN.

In some possible embodiments, optionally, the above method embodiments are applied in a virtualization architecture, and the execution subject of the method embodiments is an entity corresponding to a network element in the virtualization architecture.

For example, the virtualization architecture is the NFV architecture.

Referring to FIG. 16, the NFV architecture includes NFV MANO and VNF. NFV MANO has three main functional blocks, namely the NFV orchestrator, the VNF manager, and the virtualized infrastructure manager (VIM). Simply put, the NFV orchestrator orchestrates services and resources, controls new network services, and integrates VNFs into the virtual architecture. The NFV orchestrator can also verify and authorize resource requests from the NFV infrastructure. The VNF manager manages the life cycle of the VNF. The VIM controls and manages the NFV infrastructure, including computing resources, storage resources, and network resources. For NFV MANO to be effective, it must be integrated with the APIs of existing systems so that technologies from multiple vendors can be used across multiple network domains. Similarly, the operator's operation support system (OSS) and business support system (BSS) also need to interoperate with the NFV MANO system.

For example, optionally, the function of each component in FIG. 16 is as follows.

The network function virtualization orchestrator (NFVO) is used to manage and process network service descriptors (NSD) and virtual network function forwarding graphs (VNFFG), to manage the life cycle of network services, and to cooperate with the virtual network function manager (VNFM) to manage the life cycle of virtual network functions (VNF) and provide a global view of virtual resources.

VNFM is used to manage the life cycle of VNFs, including VNF descriptor (VNFD) management, VNF instantiation, elastic scaling of VNF instances (for example, scaling out/up and/or scaling in/down), healing of VNF instances, and termination of VNF instances. VNFM also supports receiving elastic scaling policies issued by NFVO to implement automated elastic scaling of VNFs.

The virtualized infrastructure manager (VIM) is mainly responsible for the management (including reservation and allocation) of the hardware resources and virtualized resources of the infrastructure layer, as well as the monitoring and fault reporting of virtual resource status, and provides a virtualized resource pool for upper-layer applications.

Operation and business support systems (OSS/BSS) refer to the existing operation and maintenance systems of operators.

The element manager (EM) performs traditional fault, configuration, accounting, performance, and security (FCAPS) management functions for the VNF.

The virtualized network function (VNF) corresponds to the physical network function (PNF) in a traditional non-virtualized network, for example, the mobility management entity (MME), serving gateway (SGW), packet data network gateway (PGW), and other nodes of a virtualized evolved packet core (EPC). The functional behavior and state of a network function are independent of whether it is virtualized. NFV technical requirements expect the VNF and the PNF to have the same functional behavior and external interfaces. Optionally, the VNF includes one or more lower-level VNF components (VNFC).

The NFV infrastructure (NFVI) includes hardware resources, virtual resources, and a virtualization layer. From the perspective of the VNF, the virtualization layer and the hardware resources appear to be a single entity that can provide the required virtual resources.

In some embodiments, optionally, the hardware resource of the NFVI is a heterogeneous system. The heterogeneous system includes hardware using different types of instruction sets and architectures. The hardware includes computing hardware, storage hardware, network hardware, and the like. For example, as shown in Figure 16, the heterogeneous system includes X86 CPU and ARM CPU. The virtualization layer of NFVI is used to implement the function of operating system simulation and the function of instruction conversion described in the foregoing method embodiment. The virtual resource of the NFVI includes a container, which is used to provide the virtual operating environment described in the above method embodiment, and the container is provided as a VNF.
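As a rough illustration of how the two functions of the virtualization layer relate to a heterogeneous system, the following minimal sketch decides which of them a given test file would need on a given host. The names `Platform` and `required_layer_functions`, and the rule that each function is needed only when the corresponding property differs, are assumptions made for illustration and are not taken from the embodiments.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Platform:
    """Hypothetical description of an operating system and instruction set pair."""
    os_name: str  # for example "windows" or "linux"
    isa: str      # for example "x86" or "arm"


def required_layer_functions(target: Platform, host: Platform) -> List[str]:
    """Return the virtualization-layer functions assumed to be needed.

    target: the platform the test file was built for (the first operating system).
    host:   the platform of the detection device (the second operating system).
    """
    functions: List[str] = []
    # Assumption: operating system simulation is needed when the OS differs.
    if target.os_name != host.os_name:
        functions.append("operating_system_simulation")
    # Assumption: instruction conversion is needed when the instruction set differs.
    if target.isa != host.isa:
        functions.append("instruction_conversion")
    return functions


if __name__ == "__main__":
    windows_x86 = Platform("windows", "x86")
    linux_arm = Platform("linux", "arm")
    print(required_layer_functions(windows_x86, linux_arm))
```

Under these assumptions, a Windows X86 test file on a Linux ARM host would need both functions, while a file built for the host's own platform would need neither.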

Hereinafter, the malicious file detection method described in FIG. 4 of the embodiment of the present application is illustrated by using the embodiment of FIG. 17 as an example. In the embodiment shown in FIG. 17, the malicious file detection method is applied in the NFV architecture, and the detection device is a VNF. The software provider issues test files to the detection device through the NFVO, VNFM, or VIM in the NFV MANO, and the detection results of the detection device on the test files can be returned to the NFV MANO.

Optionally, if the malicious file detection method is implemented by a container, the VNF runs the container to perform malicious file detection, and the container is delivered to the VNF by the NFV MANO. Optionally, a CaaS manager is deployed in the NFV MANO, and the CaaS manager delivers the container to the VNF. Optionally, other network elements in the NFV MANO deliver the container to the VNF. In this way, a containerized VNF is realized, and the containerized VNF performs malicious file detection on the test file by running the container. Here, a containerized VNF refers to a VNF created on a container, and an instance of the containerized VNF includes one or more VNFC instances. Optionally, one VNFC is mapped to one container application in the CaaS service, or one VNF is mapped to one container application in the CaaS service.

The following describes the process of the method for detecting malicious files based on the NFV architecture with reference to FIG. 17. The method may include the following steps 1701 to 1706.

Step 1701: The NFV MANO sends a test file to the VNF, where the test file is an executable file of the first operating system.

Step 1702: The VNF receives the test file and runs the test file in the virtual operating environment.

Step 1703: The VNF obtains the first API sequence called during the running of the test file.

Step 1704: The VNF executes the second API sequence in the second operating system.

Step 1705: The VNF determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.

Step 1706: The VNF sends the detection result to the NFV MANO.
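Steps 1701 to 1706 can be pictured with the following minimal sketch. It is illustrative only: `SampleFile`, `DetectionVnf`, and `mano_submit` are hypothetical names, the recorded call list stands in for running the file in a container, the API mapping table is an example, and the judgment rule (flagging any call found in a given set) is a toy placeholder for the behavior analysis described in the method embodiments.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class SampleFile:
    """Hypothetical test file; recorded_calls stands in for a real run."""
    name: str
    recorded_calls: List[str]


@dataclass
class DetectionResult:
    file_name: str
    malicious: bool
    first_api_sequence: List[str]
    second_api_sequence: List[str]


class DetectionVnf:
    """Hypothetical VNF acting as the detection device."""

    def __init__(self, api_mapping: Dict[str, str], suspicious_apis: Set[str]) -> None:
        self.api_mapping = api_mapping          # first-OS API name -> second-OS API name
        self.suspicious_apis = suspicious_apis  # toy rule, an assumption

    def handle(self, sample: SampleFile) -> DetectionResult:
        # Steps 1702 and 1703: run the file and obtain the first API sequence.
        # A real VNF would run the file in the virtual operating environment.
        first_seq = list(sample.recorded_calls)
        # Step 1704: map each call to the second operating system.
        # Actual execution is omitted; only the mapped names are recorded.
        second_seq = [self.api_mapping[api] for api in first_seq]
        # Step 1705: judge based on behavior during the calls (placeholder rule).
        malicious = any(api in self.suspicious_apis for api in first_seq)
        # Step 1706: the result is returned to the caller (the NFV MANO).
        return DetectionResult(sample.name, malicious, first_seq, second_seq)


def mano_submit(vnf: DetectionVnf, sample: SampleFile) -> DetectionResult:
    """Step 1701: the NFV MANO sends the test file and receives the result."""
    return vnf.handle(sample)


if __name__ == "__main__":
    vnf = DetectionVnf(
        api_mapping={"CreateFileW": "open", "WriteFile": "write"},
        suspicious_apis={"WriteFile"},
    )
    print(mano_submit(vnf, SampleFile("sample.exe", ["CreateFileW", "WriteFile"])))
```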

In the NFV architecture, the functions of each network element are usually no longer dependent on dedicated hardware. Instead, each network element in the telecommunications network is virtualized into software, and each piece of software is deployed on common hardware, so as to realize the decoupling of software and hardware.

Through the method provided by the embodiments of the present application, the function of detecting malicious files no longer depends on specific hardware, so dedicated hardware is not needed to implement the detection process. This matches the fundamental goal of decoupling software and hardware in NFV. The function of detecting malicious files is virtualized as a VNF and applied under the virtualized architecture of NFV, thereby extending malicious file detection to NFV application scenarios.

The malicious file detection method of the embodiment of the present application is introduced above, and the malicious file detection device of the embodiment of the present application is introduced below. It should be understood that the malicious file detection device described below has any function of the execution subject of the above method embodiment.

FIG. 18 is a schematic structural diagram of a malicious file detection device provided by an embodiment of the present application. As shown in FIG. 18, the malicious file detection device includes an obtaining module 1801, a running module 1802, an execution module 1803, and a judging module 1804.

The obtaining module 1801 is used to obtain a test file. For example, it can be used to execute step 402, step 501, step 801, step 903, step 1002, step 1201, step 1301, step 1401, step 1501, or step 1702 in the above method embodiment;

The running module 1802 is used to run the test file. For example, it can be used to execute step 403, step 502, step 802, step 904, step 1003, step 1202, step 1302, step 1402, step 1502, or step 1703 in the above method embodiment;

The obtaining module 1801 is also used to obtain the first API sequence. For example, it can be used to execute step 404, step 503, step 803, step 905, step 1004, step 1203, step 1303, step 1403, step 1503, or step 1703 in the above method embodiment;

The execution module 1803 is used to execute the second API sequence. For example, it can be used to execute step 405, step 504, step 804, step 906, step 1005, step 1204, step 1304, step 1404, step 1504, or step 1704 in the above method embodiment;

The judging module 1804 is used to judge whether the test file is a malicious file. For example, it can be used to execute step 406, step 505, step 805, step 907, step 1006, step 1205, step 1305, step 1405, step 1505 or step 1705.

Optionally, the execution module 1803 is configured to execute step one to step three in step 405.

Optionally, the execution module 1803 is configured to execute step (1) to step (3) in step 504.

Optionally, the execution module 1803 is configured to execute step 8041 to step 8043.

Optionally, the execution module 1803 is configured to execute step a to step b in step 405.

It should be understood that the device for detecting malicious files provided in the embodiment of FIG. 18 corresponds to the device for detecting malicious files in the foregoing method embodiments. The modules in the device and the other operations and/or functions described above are respectively used to implement the steps and methods implemented by the malicious file detection device in the method embodiments. For specific details, refer to the foregoing method embodiments; for brevity, they are not repeated here.

It should be understood that when the device for detecting malicious files provided in the embodiment of FIG. 18 detects malicious files, the division into the above functional modules is used only as an example. In actual applications, the above functions can be allocated to different functional modules as required; that is, the internal structure of the malicious file detection device can be divided into different functional modules to complete all or part of the functions described above. In addition, the malicious file detection device provided in the foregoing embodiment belongs to the same concept as the foregoing malicious file detection method embodiment. For the specific implementation process, refer to the method embodiment, which is not repeated here.
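The logical division into modules 1801 to 1804 could be sketched as follows. This is a minimal illustration under stated assumptions, not the device itself: `MaliciousFileDetector` and its field names are hypothetical, and each module is modeled as a replaceable callable to reflect that the division is only one possible arrangement.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MaliciousFileDetector:
    """Hypothetical composition of the modules shown in FIG. 18."""
    obtain_file: Callable[[str], bytes]                   # obtaining module 1801
    run: Callable[[bytes], List[str]]                     # running module 1802
    obtain_api_sequence: Callable[[List[str]], List[str]] # obtaining module 1801 (second role)
    execute: Callable[[List[str]], List[str]]             # execution module 1803
    judge: Callable[[List[str]], bool]                    # judging module 1804

    def detect(self, path: str) -> bool:
        data = self.obtain_file(path)               # obtain the test file
        trace = self.run(data)                      # run it in the virtual operating environment
        first_seq = self.obtain_api_sequence(trace) # obtain the first API sequence
        self.execute(first_seq)                     # execute the mapped second API sequence
        return self.judge(first_seq)                # judge whether the file is malicious


if __name__ == "__main__":
    # Stub callables stand in for real module implementations.
    detector = MaliciousFileDetector(
        obtain_file=lambda path: b"CreateFileW\nWriteFile",
        run=lambda data: data.decode().splitlines(),
        obtain_api_sequence=lambda trace: list(trace),
        execute=lambda seq: [name.lower() for name in seq],
        judge=lambda seq: "WriteFile" in seq,
    )
    print(detector.detect("sample.exe"))
```

Because each module is just a callable field, two modules can be merged into one function or one module split into several without changing `detect`, which mirrors the statement that the functions can be allocated to different functional modules as required.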

The embodiments of the present application also provide a computer program product which, when run on a detection device, causes the detection device to execute the malicious file detection method provided in the foregoing method embodiment.

The embodiment of the present application also provides a chip which, when run on a detection device, causes the detection device to execute the malicious file detection method provided by the foregoing method embodiment. The chip may be a general-purpose processor. The general-purpose processor includes a processing circuit, and an input interface and an output interface that are internally connected to and communicate with the processing circuit. The processing circuit is used to execute, through the input interface, the steps of obtaining the test file in the foregoing method embodiments. The processing circuit is also used to execute the steps of running the test file, obtaining the first API sequence, executing the second API sequence, and judging whether the test file is a malicious file in the foregoing method embodiments. Optionally, the general-purpose processor may further include a storage medium, and the processing circuit is configured to execute, through the storage medium, the storage steps in each of the foregoing method embodiments. The storage medium may store instructions executed by the processing circuit, and the processing circuit is configured to execute the instructions stored in the storage medium to execute the foregoing method embodiments.

A person of ordinary skill in the art may realize that the method steps and units described in combination with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the steps and components of the embodiments have been described above generally in terms of their functions. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. A person of ordinary skill in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered as going beyond the scope of the present application.

Those skilled in the art can clearly understand that, for the convenience and conciseness of description, the specific working process of the above-described system, device, and unit can be referred to the corresponding process in the foregoing method embodiment, which will not be repeated here.

In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method can be implemented in other ways. For example, the device embodiments described above are only illustrative. The division into units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the displayed or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may also be an electrical, mechanical, or other form of connection.

The unit described as a separate component may or may not be physically separated, and the component displayed as a unit may or may not be a physical unit, that is, it may be located in one place, or may also be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present application.

In addition, the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions to make a computer device (which may be a personal computer, a server, a network device, or the like) execute all or part of the steps of the methods in the various embodiments of the present application. The aforementioned storage media include a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.

The above descriptions are only specific implementations of this application, but the protection scope of this application is not limited to this. Any person skilled in the art can easily think of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements should be covered within the scope of protection of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented by software, they can be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer program instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions in the embodiments of the present application are generated in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions can be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer program instructions can be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center integrated with one or more available media. The available medium may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, a digital video disc (DVD)), or a semiconductor medium (for example, a solid state drive).

Those of ordinary skill in the art can understand that all or part of the steps in the foregoing embodiments can be implemented by hardware, or by a program instructing the related hardware. The program can be stored in a computer-readable storage medium. The storage medium mentioned above can be a read-only memory, a magnetic disk, an optical disk, or the like.

The above descriptions are only optional embodiments of this application and are not intended to limit this application. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of this application shall be included within the protection scope of this application.

Claims

A method for detecting malicious files, characterized in that the method includes:

The detection device obtains a test file, where the test file is an executable file running based on the first operating system;

The detection device runs the test file in a virtual operating environment, and the virtual operating environment is generated based on container technology;

The detection device obtains a first API sequence called during the running of the test file, where the first API sequence includes at least one API, each API included in the first API sequence is an API in a first API set, the first API set includes multiple APIs required for software operation provided by the virtual operating environment, the identifiers of the APIs in the first API set are the same as those of the APIs in a second API set, and the second API set includes multiple APIs required for software operation provided by the first operating system;

The detection device executes a second API sequence in a second operating system, where the second API sequence includes at least one API, each API included in the second API sequence is an API in the second operating system, the first API in the second API sequence has a mapping relationship with the first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device;

The detection device determines whether the test file is a malicious file based on the behavior characteristics of the test file in the process in which the first API sequence is called.
The method according to claim 1, wherein the execution of the second API sequence in the second operating system by the detection device comprises:

The detection device obtains, according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment, thereby obtaining a first function sequence, where the functions included in the first function sequence are used to implement the APIs included in the first API sequence;

The detection device obtains, according to each function in the first function sequence, the mapped function from the dynamic link library of the second operating system, thereby generating a second function sequence, where the functions included in the second function sequence are used to implement the APIs included in the second API sequence, and the first function in the second function sequence has a mapping relationship with the first function in the first function sequence;

The detection device performs operations in the kernel of the second operating system according to the second function sequence.
The method according to claim 2, wherein the first operating system is a Windows operating system, the second operating system is a Linux operating system, and the detection device obtaining, according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment comprises:

The detection device obtains a corresponding function from a dynamic link library DLL file according to each API in the first API sequence;

The detection device separately obtains the mapped function from the dynamic link library of the second operating system according to each function in the first function sequence, including:

The detection device separately obtains the mapped function from the shared object SO file according to each function in the first function sequence and the mapping relationship between the functions.
The method according to claim 2, wherein the first operating system is a Linux operating system, the second operating system is a Windows operating system, and the detection device obtaining, according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment comprises:

The detection device obtains the corresponding function from the SO file according to each API in the first API sequence;

The detection device separately obtains the mapped function from the dynamic link library of the second operating system according to each function in the first function sequence, including:

The detection device separately obtains the mapped function from the DLL file according to each function in the first function sequence and the mapping relationship between the functions.
The method according to any one of claims 1 to 4, wherein the execution of the second API sequence in the second operating system by the detection device comprises:

Acquiring, by the detection device, a first-type parameter called in the first API sequence, and the parameters included in the first-type parameter are input parameters of the API in the first API sequence;

The detection device executes the second API sequence in the second operating system according to a second type of parameters, where the parameters included in the second type of parameters are the input parameters of the APIs in the second API sequence, and the first parameter in the second type of parameters has a mapping relationship with the first parameter in the first type of parameters.
The method according to any one of claims 1 to 5, wherein the execution of the second API sequence in the second operating system by the detection device comprises:

The detection device acquires a first instruction sequence triggered during the running of the test file, where the first instruction sequence includes at least one instruction, and each instruction in the first instruction sequence is used to instruct calling an API in the first API sequence;

The detection device performs a first instruction conversion on the instructions in the first instruction sequence, and obtains a second instruction sequence according to the result of the first instruction conversion, where the second instruction sequence includes at least one instruction, each instruction in the second instruction sequence is used to instruct calling an API in the second API sequence, and the first instruction conversion is used to convert instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device;

The detection device executes the second instruction sequence to implement operations corresponding to the second API sequence.
The method according to claim 6, wherein the first operating system is a Windows operating system, the computer instruction set architecture of the detection device is an advanced RISC machine (ARM) architecture, and the detection device performing the first instruction conversion on the instructions in the first instruction sequence and obtaining the second instruction sequence according to the result of the first instruction conversion comprises:

The detection device converts each X86 instruction in the first instruction sequence into an ARM instruction, and obtains the second instruction sequence according to the converted ARM instruction.
The method according to claim 6, wherein the first operating system is a Linux operating system, the computer instruction set architecture of the detection device is an X86 architecture, and the detection device performing the first instruction conversion on the instructions in the first instruction sequence and obtaining the second instruction sequence according to the result of the first instruction conversion comprises:

The detection device converts each ARM instruction in the first instruction sequence into an X86 instruction, and obtains the second instruction sequence according to the converted X86 instruction.
The method according to claim 1, wherein after the detection device executes the second API sequence in the second operating system, the method further comprises:

Acquiring, by the detection device, a third instruction sequence, where the third instruction sequence represents a result obtained after executing the second API sequence, and the instructions in the third instruction sequence belong to the computer instruction set of the detection device;

The detection device performs a second instruction conversion on each instruction in the third instruction sequence, and obtains a fourth instruction sequence according to the result of the second instruction conversion, where the instructions in the fourth instruction sequence belong to the computer instruction set of the virtual operating environment, and the second instruction conversion is used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based;

The detection device inputs the fourth instruction sequence into the virtual operating environment.
The method according to any one of claims 1 to 9, wherein the virtual operating environment is generated based on a container image, and the image encapsulates the first API set.
The method according to any one of claims 1 to 10, wherein the container technology includes Docker container technology, the virtual operating environment is started by a Docker daemon, and the Docker daemon is a process run by the detection device based on the second operating system.
A detection device for malicious files, characterized in that the device includes:

An obtaining module for obtaining a test file, the test file being an executable file running based on the first operating system;

The running module is used to run the test file in a virtual operating environment, where the virtual operating environment is generated based on container technology, and to obtain a first API sequence called during the running of the test file, where the first API sequence includes at least one API, each API included in the first API sequence is an API in a first API set, the first API set includes multiple APIs required for software operation provided by the virtual operating environment, the identifiers of the APIs in the first API set are the same as those of the APIs in a second API set, and the second API set includes multiple APIs required for software operation provided by the first operating system;

The execution module is used to execute a second API sequence in a second operating system, where the second API sequence includes at least one API, each API included in the second API sequence is an API in the second operating system, the first API in the second API sequence has a mapping relationship with the first API in the first API sequence, and the second operating system is an operating system based on a computer instruction set architecture of the detection device;

The judging module is used to judge whether the test file is a malicious file based on the behavior characteristics of the test file during the calling process of the first API sequence.
The apparatus according to claim 12, wherein the execution module is configured to: obtain, according to each API in the first API sequence, the corresponding function from the dynamic link library of the virtual operating environment, thereby obtaining a first function sequence, where the functions included in the first function sequence are used to implement the APIs included in the first API sequence; obtain, according to each function in the first function sequence, the mapped function from the dynamic link library of the second operating system, thereby generating a second function sequence, where the functions included in the second function sequence are used to implement the APIs included in the second API sequence, and the first function in the second function sequence has a mapping relationship with the first function in the first function sequence; and perform operations in the kernel of the second operating system according to the second function sequence.
The device according to claim 13, wherein the first operating system is a Windows operating system, the second operating system is a Linux operating system, and the execution module is configured to obtain, according to each API in the first API sequence, the corresponding function from a dynamic link library DLL file, and to obtain the mapped function from a shared object SO file according to each function in the first function sequence and the mapping relationship between the functions.
The apparatus according to claim 13, wherein the first operating system is a Linux operating system, the second operating system is a Windows operating system, and the execution module is configured to obtain, according to each API in the first API sequence, the corresponding function from an SO file, and to obtain the mapped function from a DLL file according to each function in the first function sequence and the mapping relationship between the functions.
The device according to any one of claims 12 to 15, wherein the execution module is configured to: obtain a first type of parameters called in the first API sequence, where the parameters included in the first type of parameters are the input parameters of the APIs in the first API sequence; and execute the second API sequence in the second operating system according to a second type of parameters, where the parameters included in the second type of parameters are the input parameters of the APIs in the second API sequence, and the first parameter in the second type of parameters has a mapping relationship with the first parameter in the first type of parameters.
The device according to any one of claims 12 to 16, wherein the execution module is configured to: obtain a first instruction sequence triggered during the running of the test file, where the first instruction sequence includes at least one instruction, and each instruction in the first instruction sequence is used to instruct calling an API in the first API sequence; perform a first instruction conversion on the instructions in the first instruction sequence, and obtain a second instruction sequence according to the result of the first instruction conversion, where the second instruction sequence includes at least one instruction, each instruction in the second instruction sequence is used to instruct calling an API in the second API sequence, and the first instruction conversion is used to convert instructions in the instruction set on which the first operating system is based into instructions in the computer instruction set of the detection device; and execute the second instruction sequence to implement the operations corresponding to the second API sequence.
The apparatus according to claim 17, wherein the first operating system is a Windows operating system, the computer instruction set architecture of the detection device is an advanced RISC machine (ARM) architecture, and the execution module is configured to convert each X86 instruction in the first instruction sequence into an ARM instruction, and to obtain the second instruction sequence according to the converted ARM instructions.
The apparatus according to claim 17, wherein the first operating system is a Linux operating system, the computer instruction set architecture of the detection device is an X86 architecture, and the execution module is configured to convert each ARM instruction in the first instruction sequence into an X86 instruction, and obtain the second instruction sequence according to the converted X86 instructions.
The apparatus according to claim 17, wherein the execution module is configured to: obtain a third instruction sequence, the third instruction sequence representing a result obtained after executing the second API sequence, and the instructions in the third instruction sequence belonging to the computer instruction set of the detection device; perform a second instruction conversion on each instruction in the third instruction sequence, and obtain a fourth instruction sequence according to the result of the second instruction conversion, the instructions in the fourth instruction sequence belonging to the computer instruction set of the virtual operating environment, and the second instruction conversion being used to convert instructions in the computer instruction set of the detection device into instructions in the instruction set on which the first operating system is based; and input the fourth instruction sequence into the virtual operating environment.
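The two instruction conversions recited above can be pictured with a toy sketch. The one-to-one mnemonic table, the opaque operand strings, and the sample sequences are assumptions made for illustration only; real binary translation must also handle registers, operand encodings, flags, and instructions with no direct equivalent.

```python
# Hypothetical one-to-one mnemonic table (X86 guest -> ARM host).
X86_TO_ARM = {"mov": "MOV", "add": "ADD", "call": "BL", "ret": "BX LR"}
# Reverse table used for the second instruction conversion (ARM -> X86).
ARM_TO_X86 = {arm: x86 for x86, arm in X86_TO_ARM.items()}

def convert(sequence, table):
    """Translate each (mnemonic, operands) pair; operands pass through unchanged."""
    return [(table[mnemonic], operands) for mnemonic, operands in sequence]

# First instruction sequence: triggered by the test file in the virtual environment.
first_sequence = [("mov", "r0, 1"), ("call", "CreateFileW")]
# First instruction conversion: guest instruction set -> detection device instruction set.
second_sequence = convert(first_sequence, X86_TO_ARM)

# Third instruction sequence: a stand-in for the result produced on the host.
third_sequence = [("MOV", "r0, 0"), ("BX LR", "")]
# Second instruction conversion: detection device instruction set -> guest instruction set.
fourth_sequence = convert(third_sequence, ARM_TO_X86)
```

In this sketch the fourth sequence is what would be fed back into the virtual operating environment, so the test file only ever sees instructions from the instruction set it was built for.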
A detection device, characterized by comprising a network interface, a memory, and a processor connected to the memory,

The network interface is used to obtain a test file, and the test file is an executable file running based on a first operating system;

The memory is used to store program instructions;

The processor is configured to execute the program instructions, so that the detection device performs the following operations:

Running the test file in a virtual operating environment, the virtual operating environment being generated based on container technology;

Obtaining a first API sequence called during the running of the test file, wherein the first API sequence includes at least one API, each API included in the first API sequence is an API in a first API set, the first API set includes multiple APIs required for software operation that are provided by the virtual operating environment, the identifiers of the APIs in the first API set are the same as those of the APIs in a second API set, and the second API set includes multiple APIs required for software operation that are provided by the first operating system;

Executing a second API sequence in a second operating system, wherein the second API sequence includes at least one API, each API included in the second API sequence is an API in the second operating system, a first API in the second API sequence has a mapping relationship with a first API in the first API sequence, and the second operating system is an operating system based on the computer instruction set architecture of the detection device;

Determining, based on the behavior characteristics of the test file when the first API sequence is called, whether the test file is a malicious file.
A computer-readable storage medium, characterized in that at least one instruction is stored in the storage medium, and the instruction is read by a processor to cause a detection device to perform the method according to any one of claims 1 to 11.
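As a rough illustration of the overall flow recited in the claims (record the first API sequence, map it to a second API sequence, and judge the file from its behavior), consider the following sketch. The API identifiers, the host-side names, and the suspicious pattern are all hypothetical; the claims do not specify any particular mapping table or detection rule.

```python
# Hypothetical mapping: first-OS API identifier -> host (second OS) API identifier.
API_MAPPING = {
    "CreateFileW": "open",
    "WriteFile": "write",
    "RegSetValueExW": "config_write",
    "CreateRemoteThread": "spawn_thread",
}

# Hypothetical rule: an ordered subsequence of calls treated as suspicious behavior.
SUSPICIOUS_PATTERN = ["CreateFileW", "WriteFile", "CreateRemoteThread"]

def to_second_sequence(first_sequence):
    """Map each API in the first API sequence to its host counterpart."""
    return [API_MAPPING[name] for name in first_sequence]

def contains_in_order(sequence, pattern):
    """Return True if pattern occurs in sequence as an ordered subsequence."""
    remaining = iter(sequence)
    return all(step in remaining for step in pattern)

def is_malicious(first_sequence):
    """Judge the test file from the APIs it called in the virtual environment."""
    return contains_in_order(first_sequence, SUSPICIOUS_PATTERN)
```

Note that the judgment in this sketch is made on the first API sequence, matching the claim language that the behavior characteristics are those observed when the first API sequence is called; the second API sequence exists only so the calls can actually execute on the detection device.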