WO2024067479A1 - Container escape detection method, electronic device, and system - Google Patents

Container escape detection method, electronic device, and system

Publication number
WO2024067479A1
Authority
WO
WIPO (PCT)
Prior art keywords
namespace
address
host machine
identifier
cache area
Application number
PCT/CN2023/121088
Other languages
French (fr)
Chinese (zh)
Inventor
陈念
季冬
Original Assignee
华为技术有限公司
Application filed by 华为技术有限公司
Publication of WO2024067479A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/52 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow
    • G06F21/53 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems during program execution, e.g. stack integrity; Preventing unwanted data erasure; Buffer overflow by executing in a restricted environment, e.g. sandbox or secure virtual machine
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances
    • G06F2009/45575 Starting, stopping, suspending or resuming virtual machine instances
    • G06F2009/45587 Isolation or security of virtual machine instances
    • G06F2009/45591 Monitoring or debugging support

Definitions

  • Embodiments of the present application relate to virtualization technology, and more particularly to a method, electronic device, and system for detecting container escape.
  • Container technology packages applications into separate containers. It isolates each application and breaks the dependencies and connections between programs. In other words, with the support of container technology, a large service system can be composed of many containers that host different applications. Container technology divides the resources managed by a single operating system into isolated groups to better balance conflicting resource usage requirements between those groups. It is an operating-system-level virtualization technology and is widely used because it is lightweight.
  • Container escape: users can create and run containers based on container images on a host machine, where the host machine can be a physical machine or a virtual machine.
  • Each container has an independent process running space.
  • Processes in a container can only run in the process running space of that container.
  • A malicious process in a container, however, is likely to escape from the process running space of the container and then attack the host machine or other containers. This phenomenon is called container escape.
  • Container escapes can be divided into the following four types: (1) container escapes caused by insecure configuration; (2) container escapes caused by insecure mounting; (3) container escapes caused by related program vulnerabilities; and (4) container escapes caused by kernel vulnerabilities. How to detect container escapes caused by kernel vulnerabilities is an urgent problem that the industry needs to solve.
  • The present application provides a method, electronic device, and system for detecting container escape.
  • The host machine can issue an alarm message when the address of the cache area to which a process belongs is not equal to the address in the namespace where the process is located.
  • The alarm message indicates that a container escape has occurred for the process.
  • The method has high accuracy and can effectively detect container escape.
  • In a first aspect, an embodiment of the present application provides a method for detecting container escape, the method comprising:
  • When the host determines that the namespace where the process is located is the initial namespace, it obtains the address of the cache area to which the process belongs; when the host determines that this address is not equal to the address in the namespace where the process is located, it issues an alarm message.
  • If the namespace where the process is located is the initial namespace, the process can access the namespace indicated by init_nsproxy. To ensure the security of that namespace, the host checks the process based on the address slab_cache of the cache area to which the process belongs. Because slab_cache has high credibility, it is compared with the address pid_cache in the namespace where the process is located. When the two are not equal, it can be determined that the process should not have the authority to access the namespace indicated by init_nsproxy; that is, the process has been maliciously tampered with, resulting in a container escape.
  • This method has high accuracy and can effectively detect container escapes.
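The comparison described above can be sketched in a few lines of Python. This is a minimal user-space model under stated assumptions, not kernel code: the `Process` record, the `INIT_NSPROXY` constant, the function name `escape_alarm`, and the sample addresses are hypothetical stand-ins for the values the text calls nsproxy, slab_cache, and pid_cache.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the identifier of the initial namespace.
INIT_NSPROXY = "init_nsproxy"


@dataclass
class Process:
    pid: int
    nsproxy: str     # identifier of the namespace the process is located in
    slab_cache: int  # address of the cache area the process belongs to
    pid_cache: int   # address recorded in the namespace the process is located in


def escape_alarm(proc: Process) -> bool:
    """Return True when the host should issue a container-escape alarm."""
    if proc.nsproxy != INIT_NSPROXY:
        return False  # the check only applies to the initial namespace
    return proc.slab_cache != proc.pid_cache


# Sample addresses are made up for illustration.
host_proc = Process(pid=1, nsproxy=INIT_NSPROXY, slab_cache=0x1000, pid_cache=0x1000)
tampered = Process(pid=4242, nsproxy=INIT_NSPROXY, slab_cache=0x2000, pid_cache=0x1000)
contained = Process(pid=7, nsproxy="container_ns", slab_cache=0x2000, pid_cache=0x2000)

for p in (host_proc, tampered, contained):
    print(p.pid, "ALARM" if escape_alarm(p) else "ok")
```

In this model only `tampered` raises the alarm: it claims the initial namespace, but the cache area it belongs to differs from the one recorded for that namespace.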
  • The method includes:
  • The host machine obtains the address of the cache area corresponding to the process based on the address space of the process.
  • The address space pid_cachep of the process is not easily tampered with and has high credibility. Therefore, the address slab_cache of the cache area corresponding to the process, determined based on pid_cachep, also has high credibility.
  • Obtaining the address of the cache area to which the process belongs includes:
  • When the host machine determines that the namespace where the process is located is the initial namespace, it obtains the user identifier of the process;
  • When the host machine determines that the user identifier is the identifier of the root user, it obtains the address of the cache area to which the process belongs.
  • The user identifier is at least one of a user identifier UID and a user group identifier GID, and the identifier of the root user is zero.
  • Obtaining the address of the cache area to which the process belongs includes:
  • When the host determines that the namespace where the process is located is the initial namespace, it obtains the level of the namespace where the process is located;
  • When the host determines that the level is zero, it obtains the address of the cache area to which the process belongs.
  • Before obtaining the address of the cache area to which the process belongs, the method further includes:
  • The host obtains the data structure of the process; the data structure includes the identifier of the namespace where the process is located;
  • When the host determines that the identifier nsproxy of the namespace where the process is located is equal to the identifier init_nsproxy of the initial namespace, it determines that the namespace where the process is located is the initial namespace.
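The optional preconditions in the implementations above (root user identifier, namespace level zero) can be pictured as gates placed in front of the cache comparison. The sketch below is a hedged illustration only: the field names, the `check_user` and `check_level` flags, and the choice to treat "UID or GID is zero" as the root test are assumptions, since the text only says the user identifier is at least one of UID and GID.

```python
from dataclasses import dataclass

INIT_NSPROXY = "init_nsproxy"  # hypothetical identifier of the initial namespace
ROOT_ID = 0                    # identifier of the root user, per the text


@dataclass
class Process:
    nsproxy: str
    uid: int
    gid: int
    level: int       # level of the namespace the process is located in
    slab_cache: int  # address of the cache area the process belongs to
    pid_cache: int   # address in the namespace the process is located in


def should_compare_caches(proc, check_user=False, check_level=False):
    """Decide whether the cache-area address needs to be obtained at all."""
    if proc.nsproxy != INIT_NSPROXY:
        return False
    if check_user and ROOT_ID not in (proc.uid, proc.gid):
        return False  # assumption: either UID or GID equal to zero counts as root
    if check_level and proc.level != 0:
        return False
    return True


def escape_alarm(proc, check_user=False, check_level=False):
    """Alarm only when the gates pass and the two addresses differ."""
    if not should_compare_caches(proc, check_user, check_level):
        return False
    return proc.slab_cache != proc.pid_cache


suspect = Process(INIT_NSPROXY, uid=0, gid=0, level=0, slab_cache=0x2000, pid_cache=0x1000)
non_root = Process(INIT_NSPROXY, uid=1000, gid=1000, level=0, slab_cache=0x2000, pid_cache=0x1000)
```

With `check_user=True`, `non_root` is skipped before any address is compared, which shows how the gates narrow the set of processes the host inspects.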
  • In a second aspect, an embodiment of the present application provides a method for detecting container escape, the method comprising:
  • When the host determines that the mount point of the process is the root directory, it obtains the address of the cache area to which the process belongs;
  • If the mount point of the process is the root directory, the process can access all files under the root directory.
  • The process therefore needs to be checked, and the host machine checks it based on the address slab_cache of the cache area to which the process belongs. Because the address of the cache area to which the process belongs has high credibility, it is compared with the address pid_cache in the namespace where the process is located. When the two are not equal, it is considered that the process originally did not have the authority to access all files under the root directory and has been maliciously tampered with, resulting in a container escape.
  • The embodiment of the present application adds a mount point-based detection mechanism to the kernel code, which can detect the behavior of escaping to the root directory and effectively monitor container escapes that exploit kernel vulnerabilities, thereby improving system security.
  • The method includes:
  • The host machine obtains the address of the cache area corresponding to the process based on the address space of the process.
  • Obtaining the address of the cache area to which the process belongs includes:
  • When the host machine determines that the namespace where the process is located is the initial namespace, it obtains the user identifier of the process;
  • When the host machine determines that the user identifier of the process is the identifier of the root user, it obtains the address of the cache area to which the process belongs.
  • The user identifier is at least one of a user identifier UID and a user group identifier GID, and the identifier of the root user is zero.
  • Obtaining the address of the cache area to which the process belongs includes:
  • When the host determines that the namespace where the process is located is the initial namespace, it obtains the level of the namespace where the process is located;
  • When the host determines that the level is zero, it obtains the address of the cache area to which the process belongs.
  • Before obtaining the address of the cache area to which the process belongs, the method further includes:
  • The host obtains the data structure of the process;
  • The data structure includes the mount point identifier of the process;
  • When the host determines that the mount point identifier is the root directory identifier, it determines that the mount point of the process is the root directory.
  • In a third aspect, an embodiment of the present application provides a device for detecting container escape, the device comprising:
  • An acquisition unit, configured to acquire the address of the cache area to which the process belongs when it is determined that the namespace where the process is located is the initial namespace;
  • A prompt unit, configured to issue an alarm message when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, where the alarm message indicates that a container escape has occurred for the process.
  • If the namespace where the process is located is the initial namespace, the process can access the namespace indicated by init_nsproxy. To ensure the security of that namespace, the host machine checks the process based on the address slab_cache of the cache area to which the process belongs. Because slab_cache has high credibility, it is compared with the address pid_cache in the namespace where the process is located. When the two are not equal, it can be determined that the process should not have the authority to access the namespace indicated by init_nsproxy; that is, the process has been maliciously tampered with and a container escape has occurred.
  • This method has high accuracy and can effectively detect container escapes.
  • The embodiment of the present application adds a mount point-based detection mechanism to the kernel code, which can detect the behavior of escaping to the root directory and effectively monitor container escapes that exploit kernel vulnerabilities, thereby improving system security.
  • The acquisition unit is configured to:
  • The address space pid_cachep of the process is not easily tampered with and has high credibility. Therefore, the address slab_cache of the cache area corresponding to the process, determined based on pid_cachep, also has high credibility.
  • The acquisition unit is configured to acquire the user identifier of the process when it is determined that the namespace where the process is located is the initial namespace;
  • The acquisition unit is further configured to acquire the address of the cache area to which the process belongs when it is determined that the user identifier is the identifier of the root user.
  • The user identifier is at least one of a user identifier UID and a user group identifier GID, and the identifier of the root user is zero.
  • The acquisition unit is configured to:
  • Obtain the level of the namespace where the process is located.
  • The device further includes a determination unit.
  • The acquisition unit is configured to acquire the data structure of the process; the data structure includes the identifier of the namespace where the process is located;
  • The determination unit is configured to determine that the namespace where the process is located is the initial namespace when the identifier nsproxy of the namespace where the process is located is equal to the identifier init_nsproxy of the initial namespace.
  • In a fourth aspect, an embodiment of the present application provides a device for detecting container escape, the device comprising:
  • An acquisition unit, configured to acquire the address of the cache area to which the process belongs when it is determined that the mount point of the process is the root directory;
  • A prompt unit, configured to issue an alarm message when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, where the alarm message indicates that a container escape has occurred for the process.
  • If the namespace nsproxy where the process is located is the initial namespace init_nsproxy, the process can access the namespace indicated by init_nsproxy.
  • The host can check the process based on the address slab_cache of the cache area to which the process belongs. Because slab_cache has high credibility, it is compared with the address pid_cache in the namespace where the process is located. When the two are not equal, it can be determined that the process should not have the authority to access the namespace indicated by init_nsproxy; that is, the process has been maliciously tampered with, resulting in a container escape. This method has high accuracy and can effectively detect container escapes.
  • The acquisition unit is configured to:
  • The acquisition unit is configured to:
  • The user identifier is at least one of a user identifier UID and a user group identifier GID, and the identifier of the root user is zero.
  • The acquisition unit is configured to:
  • Obtain the level of the namespace where the process is located.
  • The device further includes a determination unit.
  • The acquisition unit is configured to acquire the data structure of the process; the data structure includes the mount point identifier of the process;
  • The determination unit is configured to determine that the mount point of the process is the root directory when the mount point identifier is the root directory identifier.
  • In a fifth aspect, an embodiment of the present application provides a method for detecting container escape, the method comprising:
  • When the host determines that the mount point of the process is the root directory, it obtains the target data from the data structure of the process;
  • When the target data meets the preset condition, the host issues an alarm message, which indicates that a container escape has occurred for the process.
  • A mount point-based detection mechanism is added to the kernel code, which can detect the behavior of escaping to the root directory and effectively monitor container escapes that exploit kernel vulnerabilities, thereby improving system security.
  • The target data is the level of the namespace in which the process is located; issuing an alarm message when the target data meets the preset condition includes:
  • When the host determines that the level of the namespace where the process is located is zero, it issues an alarm message.
  • The target data is the user identifier of the process; issuing an alarm message when the target data meets the preset condition includes:
  • When the host machine determines that the user identifier of the process is the identifier of the root user, it issues an alarm message.
  • The user identifier is at least one of a user identifier UID and a user group identifier GID, and the identifier of the root user is zero.
  • The method further includes:
  • The host obtains the data structure of the process;
  • The data structure includes the mount point identifier of the process;
  • When the host determines that the mount point identifier is the root directory identifier, it determines that the mount point of the process is the root directory.
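The mount point-based variant above does not compare cache addresses; it inspects target data once the mount point is found to be the root directory. A minimal sketch follows, assuming a hypothetical `Process` record and a hypothetical `target` selector ("level" or "user"). This passage does not say how the host selects which processes to examine, so the sketch leaves that to the caller.

```python
from dataclasses import dataclass

ROOT_DIR = "/"  # hypothetical root directory identifier
ROOT_ID = 0     # identifier of the root user, per the text


@dataclass
class Process:
    mount_point: str  # mount point identifier from the process data structure
    level: int        # level of the namespace the process is located in
    uid: int
    gid: int


def root_mount_alarm(proc: Process, target: str) -> bool:
    """Return True when the target data meets the preset condition."""
    if proc.mount_point != ROOT_DIR:
        return False  # mount point is not the root directory: nothing to check
    if target == "level":
        return proc.level == 0
    if target == "user":
        # Assumption: either UID or GID equal to zero counts as the root user.
        return ROOT_ID in (proc.uid, proc.gid)
    raise ValueError(f"unknown target data: {target!r}")


escaped = Process(mount_point="/", level=0, uid=0, gid=0)
confined = Process(mount_point="/var/lib/container/rootfs", level=1, uid=0, gid=0)
```

The path used for `confined` is made up for illustration; any mount point other than the root directory takes the early return.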
  • In a sixth aspect, an embodiment of the present application provides a device for detecting container escape, the device comprising:
  • An acquisition unit, configured to acquire target data from the data structure of the process when it is determined that the mount point of the process is the root directory;
  • A prompt unit, configured to issue an alarm message when the target data meets the preset condition.
  • The alarm message indicates that a container escape has occurred for the process.
  • A mount point-based detection mechanism is added to the kernel code, which can detect the behavior of escaping to the root directory and effectively monitor container escapes that exploit kernel vulnerabilities, thereby improving system security.
  • The target data is the level of the namespace in which the process is located, and the prompt unit is configured to:
  • The target data is the user identifier of the process, and the prompt unit is configured to:
  • Issue an alarm message when it is determined that the user identifier of the process is the identifier of the root user.
  • The user identifier is at least one of a user identifier UID and a user group identifier GID, and the identifier of the root user is zero.
  • The device further includes a determination unit.
  • The acquisition unit is configured to acquire the data structure of the process; the data structure includes the mount point identifier of the process;
  • The determination unit is configured to determine that the mount point of the process is the root directory when the mount point identifier is the root directory identifier.
  • In a seventh aspect, an embodiment of the present application provides an electronic device, comprising one or more functional modules, which can be used to execute the method for detecting container escape in any possible implementation of the first aspect described above.
  • In an eighth aspect, the present application provides a computer storage medium, comprising computer instructions which, when executed on an electronic device, cause the electronic device to execute the method for detecting container escape in any possible implementation of any of the above aspects.
  • In a ninth aspect, the present application provides a computer program product which, when executed on a computer, enables the computer to execute the method for detecting container escape in any possible implementation of any of the above aspects.
  • In a tenth aspect, the present application provides a chip, comprising a processor and an interface, where the processor and the interface cooperate with each other so that the chip executes the container escape detection method in any possible implementation of any of the above aspects.
  • The electronic device provided in the seventh aspect, the computer storage medium provided in the eighth aspect, the computer program product provided in the ninth aspect, and the chip provided in the tenth aspect are all used to execute the method provided in the embodiments of the present application. For their beneficial effects, refer to the beneficial effects of the corresponding method, which are not repeated here.
  • FIG. 1 is a hierarchical relationship diagram of a namespace provided in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a detection process provided in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a detection system provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a scenario of container escape provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of another scenario of container escape provided by an embodiment of the present application.
  • FIG. 6 is a flow chart of a method for detecting container escape provided in an embodiment of the present application.
  • FIG. 7 is a flow chart of another method for detecting container escape provided in an embodiment of the present application.
  • FIG. 8A is a flow chart of a method for detecting container escape provided in an embodiment of the present application.
  • FIG. 8B is a flow chart of another method for detecting container escape provided in an embodiment of the present application.
  • FIG. 9A is a flow chart of a method for detecting container escape provided in an embodiment of the present application.
  • FIG. 9B is a flow chart of another method for detecting container escape provided in an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a container escape detection device 100 provided in an embodiment of the present application.
  • FIG. 11 is a schematic structural diagram of a container escape detection device 110 provided in an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of another container escape detection device 120 provided in an embodiment of the present application.
  • The terms "first" and "second" are used for descriptive purposes only and are not to be understood as suggesting or implying relative importance or implicitly indicating the number of the indicated technical features.
  • A feature defined as "first" or "second" may explicitly or implicitly include one or more of the features. In the description of the embodiments of the present application, unless otherwise specified, "plurality" means two or more.
  • Container: an instance created based on an image.
  • The instance is an object that includes the user configuration and running configuration required to implement the functions of the image.
  • A container is a running instance created from an image. It can be started, stopped, and deleted. Containers are isolated from one another, which helps ensure security. A container provides an isolated environment for running programs and strictly controls the resources that the programs in it can access.
  • Containers provide isolated operating spaces for applications: each container contains a complete user environment space that is exclusive to it, and changes in one container do not affect the operating environment of other containers.
  • Container technology relies on a series of system-level mechanisms: Linux namespaces isolate spaces, file system mount points determine which files a container can access, and cgroups determine how many resources each container can use.
  • Containers share the same system kernel, so when the same library is used by multiple containers, memory usage efficiency is improved.
  • Container escape refers to the following process and result. First, the attacker obtains the ability to execute commands with certain permissions inside the container, either by hijacking the containerized business logic or by controlling the container directly (as in CaaS and other scenarios where container control is obtained legitimately). The attacker then uses this ability to obtain command execution with certain permissions on the direct host machine where the container is located. A common scenario is "physical machines running virtual machines, and virtual machines running containers"; the direct host machine in this scenario is the virtual machine outside the container.
  • A namespace is mainly used for resource isolation. Resources in different namespaces are not visible to each other. The main namespace types are IPC, Network, Mount, PID, User, UTS, and Cgroup.
  • A namespace provides an abstraction of global resources. Resources are placed in different namespaces, the resources in each namespace are isolated from each other, and they are used by different processes. Placing resources in different containers (different namespaces) isolates the containers from each other.
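On a Linux host, the namespaces a process belongs to are exposed as symbolic links under `/proc/<pid>/ns/`. The sketch below reads them with the standard library as a small illustration of the namespace types listed above. The helper names `namespace_ids` and `same_namespace` are hypothetical, and the functions return empty results on systems without `/proc`.

```python
import os


def namespace_ids(pid="self"):
    """Map namespace type to its identifier string for one process (Linux only)."""
    ns_dir = f"/proc/{pid}/ns"
    ids = {}
    if not os.path.isdir(ns_dir):
        return ids  # not Linux, /proc not mounted, or no such process
    try:
        for name in sorted(os.listdir(ns_dir)):
            ids[name] = os.readlink(os.path.join(ns_dir, name))
    except OSError:
        pass  # process exited or permission denied part-way through
    return ids


def same_namespace(pid_a, pid_b, kind="pid"):
    """True when both processes report the same identifier for `kind`."""
    ids_a, ids_b = namespace_ids(pid_a), namespace_ids(pid_b)
    return kind in ids_a and ids_a[kind] == ids_b.get(kind)


if __name__ == "__main__":
    for kind, ident in namespace_ids().items():
        print(kind, ident)
```

Two processes in the same container typically report identical identifiers for each type, while a process in another container reports different ones for the namespaces that are not shared.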
  • The PID number is the kernel's unique identifier for distinguishing each process.
  • In Linux, the PID number is a number assigned to a process to uniquely identify it within its namespace. It is called the process ID number, or PID for short.
  • Each newly generated process is assigned a new unique PID value by the kernel.
  • The level of the namespace where the process is located indicates the layer of that namespace in the namespace hierarchy.
  • The initial namespace has a level of 0, its sub-namespace has a level of 1, and so on.
  • A sub-namespace is visible to its parent namespace. From a given level, the kernel can infer how many IDs the process will be associated with.
  • Namespaces are hierarchical, and the level indicates which layer of the hierarchy a namespace is in. When a child process is created through the clone function or fork function, the caller can specify whether to create a new namespace. If not specified, the child inherits the namespace of the parent process by default; otherwise, a new namespace is created and the level in task_struct is increased by 1.
  • A higher-level namespace is visible to a lower-level namespace.
  • A process in a higher-level namespace has multiple PID numbers. For example, the system's default namespace is level0, a new namespace called level1 is created under level0, and a process is run in level1. The PID number of this process in level1 is 1. Because the higher-level PID namespace must be visible to the lower-level PID namespace, this process also has a PID number xxx in level0, assigned according to the PID sequence in level0.
  • FIG. 1 exemplarily shows a hierarchical relationship diagram of a namespace.
  • Each parent namespace derives two child namespaces; the parent namespace is at level 0, and each child namespace is at level 1.
  • Each namespace can have a process with PID number 1. However, because namespaces are hierarchical, the parent namespace knows of the existence of its child namespaces, and every process in a child namespace must be mapped into the parent namespace. Therefore, the six processes of the two child namespaces at level 1 in FIG. 1 are mapped to PID numbers 5 to 10 of their parent namespace.
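The mapping in FIG. 1 can be reproduced with a toy model in which a new process receives one PID in its own namespace and one in every ancestor namespace. This is a simplified sketch, not the kernel's allocator: the `PidNamespace` class, sequential numbering from 1, and the assumption that the parent namespace already runs four processes of its own are all illustrative choices.

```python
class PidNamespace:
    """Toy PID namespace that hands out PIDs sequentially, starting at 1."""

    def __init__(self, parent=None):
        self.parent = parent
        # The initial namespace has level 0; each child is one level deeper.
        self.level = 0 if parent is None else parent.level + 1
        self._next_pid = 1

    def _alloc(self):
        pid = self._next_pid
        self._next_pid += 1
        return pid

    def spawn(self):
        """Create a process here and return its PIDs indexed by level."""
        pids = []
        ns = self
        while ns is not None:  # allocate in this namespace and every ancestor
            pids.append(ns._alloc())
            ns = ns.parent
        return list(reversed(pids))


root = PidNamespace()
host_pids = [root.spawn() for _ in range(4)]  # assumed: four host processes

child_a = PidNamespace(parent=root)
child_b = PidNamespace(parent=root)
pids_a = [child_a.spawn() for _ in range(3)]
pids_b = [child_b.spawn() for _ in range(3)]

print(pids_a)  # each entry is [PID at level 0, PID at level 1]
print(pids_b)
```

Each child namespace has its own PID 1, while the same six processes occupy PIDs 5 to 10 when viewed from the parent namespace.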
  • a process is a program or command being executed. Each process is a running entity, has its own address space, and occupies certain system resources. Once a program is running, it is a process.
  • a process can be seen as an instance of program execution.
  • a process is the independent unit to which system resources are allocated, and each process has an independent address space.
  • a process cannot access the variables and data structures of another process. If you want a process to access the resources of another process, you need to use inter-process communication, such as pipes, files, sockets, etc.
  • a process is a dynamic entity consisting of a text segment, a user data segment, and a system data segment.
  • the system data segment stores the control information of the process, including the process control block (PCB).
  • PCB is a data structure used to describe and control the operation of a process. It is part of the process entity and the most important record-type data structure in the operating system. Generally, PCB contains the following:
  • Process identifier: used to uniquely identify a process;
  • Processor information: general registers, instruction counter, program status word (PSW), user stack pointer;
  • Process scheduling information: process status, process priority, other information required for process scheduling, events;
  • Process control information: address of program data, resource list, process synchronization and communication mechanism, link pointer.
  • the content defined in the data structure provides support for subsequent management, so different operating systems have made some adjustments to the content of the PCB according to their own characteristics. Different operating systems have different PCB structures.
  • the Linux process control block is a data structure defined by the structure task_struct, which includes various information required to manage the process. All process control blocks of the Linux system are organized into a structure array.
  • task_struct is the unique identifier of the process and the core of the Linux process entity.
  • the Linux kernel uses the task_struct data structure to associate all process-related data and structures. All algorithms in the Linux kernel involving processes and programs are built around this data structure, which is one of the most important data structures in the kernel.
  • Mount namespace is used to isolate the mount points of the file system, so that different mount namespaces have their own independent mount point information, and different namespaces will not affect each other.
  • Mounting: usually refers to attaching a storage device to an existing directory. Accessing this directory then means accessing the contents of the storage device.
  • All files are placed in a tree-like directory structure starting from the root directory.
  • Any hardware device is also represented as a file. For example, for Linux to use a USB flash drive, the Linux directory tree and the device's file directory must be combined into one. This process is called mounting.
  • Mount point: the mount operation hides the files already present in the target Linux directory, so rather than choosing an existing, populated Linux directory, it is best to create a new empty directory for mounting. After mounting, this directory is called a mount point.
  • the root directory of the Linux system (/): The file system of Linux and UNIX is a hierarchical tree file structure with "/" as the root, so "/" is called the root directory. Every file and directory starts from here.
  • This directory is different from the /root directory, which is the home directory of the root user.
  • the root directory “/” is located at the top level of the file system directory structure. It is the top-level directory. All files and directories are placed under the root directory “/”; under the root directory “/” there are also subdirectories such as “/bin”, “/home”, and “/usr”.
  • the system manager authorizes each process to use a given user identifier (UID).
  • Each started process has a UID of the user who started it.
  • a child process has the same UID as the parent process.
  • a user can be a member of a group, and each group also has a group identifier (GID).
  • User identifier (UID): an integer used by the system to distinguish different users.
  • Linux uses a 32-bit integer to record and distinguish different users. This number that distinguishes different users is called User ID, or UID for short. Users in the Linux system are divided into three categories, namely ordinary users, root users, and system users.
  • the root user is also called the administrator account, and its UID is usually 0.
  • the root user can manage ordinary users and the entire system.
  • Ordinary users refer to all real users who use the Linux system.
  • the UID of ordinary users can be specified by the administrator when they are created. If not specified, the UID of ordinary users can be greater than 500; the user's UID can also be numbered sequentially from 1000 to 60000 by default.
  • System users refer to users who are required for the system to run, but are not real users. In other words, system users are automatically created during the installation process and do not have the ability to log in to the system.
  • User group identifier (GID): an integer used by the system to distinguish different user groups.
  • UID and GID are managed by the Linux kernel and are used by kernel-level system calls to determine whether a request should be granted privileges. For example, when a process attempts to write to a file, the kernel checks the UID and GID of the creating process to determine whether it has sufficient permissions to modify the file.
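  • The write check mentioned above can be illustrated with a deliberately simplified sketch. The function `may_write` is a hypothetical name, and the model ignores supplementary groups, capabilities, and access control lists; it only shows how the UID and GID select which permission bits apply, and why a UID of 0 is so valuable to an attacker.

```python
import stat


def may_write(proc_uid: int, proc_gid: int,
              file_uid: int, file_gid: int, mode: int) -> bool:
    """Simplified owner/group/other write check (illustrative only)."""
    if proc_uid == 0:
        # Simplification: the root user is not restricted by the mode bits.
        return True
    if proc_uid == file_uid:
        return bool(mode & stat.S_IWUSR)   # owner write bit
    if proc_gid == file_gid:
        return bool(mode & stat.S_IWGRP)   # group write bit
    return bool(mode & stat.S_IWOTH)       # "other" write bit


owner_can_write = may_write(1000, 1000, 1000, 1000, 0o644)
group_member_blocked = not may_write(1001, 1000, 1000, 1000, 0o644)
root_can_write = may_write(0, 0, 1000, 1000, 0o444)
```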
  • (VI) Slab is a memory allocation mechanism of the Linux operating system.
  • the slab allocation algorithm uses cache to store kernel objects.
  • the slab allocator is managed based on objects. Objects of the same type are classified into one category (such as process descriptors). Whenever such an object is requested, the slab allocator allocates a unit of this size from a slab list, and when it is released, it is saved in the list again instead of being returned directly to the buddy system, thus avoiding internal fragmentation. The slab allocator does not discard the allocated objects, but releases and stores them in memory. When a new object is requested in the future, it can be obtained directly from the slab without repeated initialization.
  • each slab consists of one or more consecutive page frames, which contain both allocated objects and free objects.
  • When a cache is created, it initially contains a number of objects, all marked as free. The number of objects depends on the size of the slab.
  • When an object of a kernel data structure is needed, it can be obtained directly from the cache and initialized for use.
  • struct task_struct is about 1.7KB in size.
  • When the Linux kernel creates a new task, it obtains the memory required for the struct task_struct object from the cache. The cache will already hold a struct task_struct object that is allocated and marked as free, which satisfies the request.
  • Linux slabs can have three states:
  • Full: all objects in the slab are marked as used.
  • Empty: all objects in the slab are marked as free.
  • Partial: some objects in the slab are marked as used, and some are marked as free.
  • the slab allocator first tries to allocate from a partial slab. If there is none, it allocates from an empty slab. If there is no empty slab either, it allocates a new slab from physically contiguous pages, assigns it to a cache, and then allocates space from the new slab.
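  • The allocation order (partial slab, then empty slab, then a newly created slab) can be sketched with a small model. The classes `Slab` and `Cache` and their methods are hypothetical and track slot indices only; no real memory is managed.

```python
from typing import List, Tuple


class Slab:
    """A slab holding a fixed number of equally sized object slots."""

    def __init__(self, num_objects: int) -> None:
        self.free: List[int] = list(range(num_objects))  # all slots start free
        self.used: set = set()

    @property
    def state(self) -> str:
        if not self.used:
            return "empty"
        if not self.free:
            return "full"
        return "partial"

    def take(self) -> int:
        slot = self.free.pop()
        self.used.add(slot)
        return slot

    def give_back(self, slot: int) -> None:
        # A released object stays in the slab instead of going back to the
        # page allocator, so it can be handed out again later.
        self.used.remove(slot)
        self.free.append(slot)


class Cache:
    """A cache of slabs for one object type."""

    def __init__(self, objects_per_slab: int) -> None:
        self.objects_per_slab = objects_per_slab
        self.slabs: List[Slab] = []

    def alloc(self) -> Tuple[Slab, int]:
        # Preference order: partial slab, then empty slab, then a new slab.
        for wanted in ("partial", "empty"):
            for slab in self.slabs:
                if slab.state == wanted:
                    return slab, slab.take()
        slab = Slab(self.objects_per_slab)
        self.slabs.append(slab)
        return slab, slab.take()


cache = Cache(objects_per_slab=2)
s0, a = cache.alloc()     # no slab exists yet: a new slab is created
_, b = cache.alloc()      # served from the partial slab s0, which becomes full
s1, c = cache.alloc()     # s0 is full: a second slab is created
s0.give_back(a)           # s0 drops back to the partial state
s2, d = cache.alloc()     # the freed slot is reused; the cache does not grow
```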
  • each process has a process control block (PCB) in the kernel, which contains all the information of the process, that is, a data structure (task_struct).
  • the task_struct of a process includes the following information:
  • Process identifiers: including the PID, UID, and GID;
  • Mount point identifier: indicates the file system fs that the process can operate on, including reading and writing files;
  • the namespace identifier of the process (nsproxy): used to identify the namespace to which the process belongs.
  • because the Linux kernel provides multiple namespaces, such as the PID namespace, a process may belong to multiple namespaces.
  • the kernel introduces nsproxy to uniformly manage the namespaces to which processes belong.
  • nsproxy stores a set of pointers to various namespace types, acting as a proxy for processes to access various namespaces. Since multiple processes may have exactly the same namespace, nsproxy can be shared between processes. The count field in nsproxy is responsible for recording the number of references to the structure.
  • Initial namespace identifier (init_nsproxy): the system predefines an init_nsproxy, which is used as the default nsproxy.
  • init_nsproxy defines the initial global namespace, which stores pointers to the initial namespace objects of each subsystem and has higher permissions.
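  • The sharing of nsproxy between processes, and its reference count, can be sketched as below. The classes and the helper `fork_task` are hypothetical simplifications; the namespace members are plain strings standing in for the pointers the real structure holds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class NsProxy:
    """Stand-in for nsproxy: one reference per namespace type plus a count."""
    pid_ns: str
    net_ns: str
    mnt_ns: str
    count: int = 1        # number of tasks sharing this structure


@dataclass(eq=False)
class Task:
    nsproxy: NsProxy


INIT_NSPROXY = NsProxy("init_pid_ns", "init_net_ns", "init_mnt_ns")


def fork_task(parent: Task, new_pid_ns: Optional[str] = None) -> Task:
    """Create a child task, optionally placing it in a new PID namespace."""
    if new_pid_ns is None:
        # No new namespace requested: share the parent's nsproxy.
        parent.nsproxy.count += 1
        return Task(parent.nsproxy)
    # New namespace requested: the child gets its own nsproxy that differs
    # only in the PID namespace member.
    old = parent.nsproxy
    return Task(NsProxy(new_pid_ns, old.net_ns, old.mnt_ns))


init_task = Task(INIT_NSPROXY)
sibling = fork_task(init_task)
contained = fork_task(init_task, new_pid_ns="container_pid_ns")
```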
  • an attacker can exploit certain kernel vulnerabilities to tamper with the nsproxy in the process task_struct into init_nsproxy, thereby achieving privilege escalation.
  • a preliminary detection mechanism was released to the public. It detects the escape to the init namespace by detecting whether the nsproxy of the process is init_nsproxy, whether the current process is a root process, and whether the level of the current process is 0.
  • the root process refers to a process running with root privileges. The specific detection process can be seen in Figure 2.
  • the host machine can first obtain the task_struct of the current process and obtain nsproxy from the task_struct. When nsproxy is not init_nsproxy, the detection ends. When nsproxy is init_nsproxy, the host machine checks whether the UID or GID of the current process is the UID or GID of the root user; when it is not, the detection ends. When it is, the host machine checks whether the level of the namespace of the current process is 0; when the level is 0, the detection ends; when the level is not 0, it is determined that the process has a container escape, and a warning message (i.e., an alarm) can be prompted.
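  • The Figure 2 flow can be sketched as a single function. This is an illustrative model rather than the kernel implementation: `Task`, `INIT_NSPROXY`, and `figure2_alarm` are hypothetical names, and "UID or GID is that of the root user" is read here as "either value is 0", which is an assumption.

```python
from dataclasses import dataclass

INIT_NSPROXY = object()   # stand-in for the kernel's init_nsproxy
ROOT_ID = 0


@dataclass
class Task:
    """Only the task_struct fields that the Figure 2 check reads."""
    nsproxy: object
    uid: int
    gid: int
    level: int


def figure2_alarm(task: Task) -> bool:
    """Return True when the Figure 2 flow would prompt an alarm."""
    if task.nsproxy is not INIT_NSPROXY:
        return False      # not on the initial namespaces: end the detection
    if task.uid != ROOT_ID and task.gid != ROOT_ID:
        return False      # not a root process: end the detection
    if task.level == 0:
        return False      # level 0: treated as legitimate, end the detection
    return True           # root on init_nsproxy with a non-zero level


container_nsproxy = object()
ordinary_container_task = Task(container_nsproxy, uid=0, gid=0, level=1)
tampered_task = Task(INIT_NSPROXY, uid=0, gid=0, level=1)
host_root_task = Task(INIT_NSPROXY, uid=0, gid=0, level=0)
```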
  • the detection principle of Figure 2 is as follows: if the nsproxy of the process is init_nsproxy and the process is a root process with a level of 0, it is treated as a legitimate root process of the initial namespace. Since the level is 0, the namespaces of all child nodes can be operated on, but the namespace of the mount point is separate from the nsproxy data structure, so all namespaces except fs are obtained here.
  • level indicates the level of different namespaces.
  • Level indicates which layer the namespace is in.
  • the method shown in Figure 2 cannot deal with container escapes caused by other kernel vulnerabilities. For example, if an attacker changes the UID and GID in the task_struct structure to the UID and GID of the root user and changes the level of the current process to 0, the detection mechanism shown in Figure 2 can be bypassed; for another example, when an attacker exploits a kernel vulnerability to tamper with the operating directory (i.e., the mount point) of the attack process to the root directory, the detection mechanism shown in Figure 2 can be bypassed to achieve the behavior of obtaining the namespace of fs.
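  • The two bypasses described above can be reproduced against a compact model of the Figure 2 check. All names (`Task`, `INIT_NSPROXY`, `INIT_FS`, `figure2_alarm`) are hypothetical, and the check is written under the same reading of the flow as the text: alarm only for a root process on init_nsproxy whose level is not 0.

```python
from dataclasses import dataclass

INIT_NSPROXY = object()   # stand-in for init_nsproxy
INIT_FS = object()        # stand-in for the host root directory (init_task->fs)


@dataclass
class Task:
    nsproxy: object
    fs: object
    uid: int
    gid: int
    level: int


def figure2_alarm(task: Task) -> bool:
    """Figure 2 check: note that it never looks at task.fs."""
    return (task.nsproxy is INIT_NSPROXY
            and (task.uid == 0 or task.gid == 0)
            and task.level != 0)


container_ns, container_fs = object(), object()

# Bypass 1: nsproxy, UID/GID and the level are all rewritten together.
forged_level = Task(INIT_NSPROXY, container_fs, uid=0, gid=0, level=0)

# Bypass 2: nsproxy is left alone and only fs is redirected to the root directory.
forged_fs = Task(container_ns, INIT_FS, uid=0, gid=0, level=1)

# For comparison: a tampering that leaves the level untouched is still caught.
caught = Task(INIT_NSPROXY, container_fs, uid=0, gid=0, level=1)
```

  • Both forged tasks pass the check silently, which is why the methods described later add a comparison that does not rely on fields the attacker can overwrite in task_struct.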
  • Figure 3 is a schematic diagram of a detection system provided in an embodiment of the present application.
  • the system includes a physical machine and one or more virtual machines (VMs) running on an operating system of the physical machine (only virtual machine 1, virtual machine 2, and virtual machine 3 are shown in FIG. 3).
  • the physical machine is responsible for the management and allocation of hardware resources, and presents a virtual hardware platform to the virtual machine, for example, providing the virtual machine with a virtual CPU, memory, virtual disk, virtual network card, etc.
  • One or more containers can be created in a virtual machine.
  • FIG. 3 exemplarily shows two containers in virtual machine 1, container 1 and container 2; two containers in virtual machine 2, container 3 and container 4; and two containers in virtual machine 3, container 5 and container 6.
  • Virtual machines can use containers to provide relatively independent and isolated operating environments for processes. For example, container 1 supports the operation of process 1, and container 2 supports the operation of process 2.
  • Virtual machine: refers to one or more virtual computers simulated on a physical computer. These virtual machines can work like real physical computers.
  • Process: an entity that executes instructions. A process can be used to run a program to execute various instructions.
  • Container: used to provide a relatively independent and isolated operating environment for a process.
  • a container includes an independent file system, namespace, resource view, etc.
  • Container instance: after a process runs in the environment provided by a container, the container can be called a container instance.
  • the physical machine may include: a central processing unit (CPU), memory, hard disk, motherboard, and 3D processing graphics card, etc., and based on these hardware, the physical machine may include a virtual machine manager (VMM) module and at least one virtual machine, and the VMM and VM are software modules in the physical machine, wherein:
  • the CPU is used to execute various logical calls; the VMM is used to create at least one virtual machine and virtualize the physical resources in the physical machine into multiple virtual resources for use by the virtual machines; and each virtual machine has independent storage and computing units, and the functions and structures of each virtual machine are similar.
  • the container escape detection method provided in the embodiment of the present application is executed by a host machine, which may be the above-mentioned physical machine or the above-mentioned virtual machine.
  • the detection system shown in FIG. 3 is only an exemplary implementation of the embodiment of the present application, and the system architecture in the embodiment of the present application includes but is not limited to the above architecture.
  • privileged containers refer to some namespaces with higher permissions, such as init_nsproxy.
  • the method for detecting container escape provided in the embodiment of the present application can detect that a process in the following three scenarios has a container escape.
  • the detection program display message, dmesg
  • the detection shown in Figure 2 can be successfully bypassed; then, the attacker can obtain other namespaces except the mount namespace through the process.
  • the container escape can be detected by the container escape detection method shown in FIG. 6 or FIG. 7.
  • an attacker can exploit a kernel vulnerability to modify the user ID or GID of a process to 0, and modify the mount point (tsk->fs) to the root directory (init_task->fs), thus successfully bypassing the detection shown in Figure 2. Then, the attacker can obtain the namespace of the mounted file system through the process. For example, as shown in Figure 4, assuming that the process before modification can only obtain the contents of the rop directory, when the attacker modifies its UID or GID to 0 and changes the mount point to the root directory, the modified process can obtain the mounted file system fs, that is, the namespace of fs.
  • the container escape can be detected by the container escape detection method shown in Figures 8A to 9B.
  • the modified process can obtain the root user's permissions; further, obtain the host name (hostname) of the host machine; obtain the host machine's network card information; and obtain the host machine's file mount.
  • the container escape can be detected by the container escape detection method shown in FIG. 6 or FIG. 7.
  • Figure 6 is a method for detecting container escape provided by an embodiment of the present application.
  • the method may include some or all of the following steps:
  • S101 The host machine detects whether the namespace where the process is located is an initial namespace.
  • the host machine may start to detect whether the current process has escaped when the system calls the current process, or when the current process performs operations related to process scheduling, process exit, or namespace copying (copy namespace); that is, it starts to execute step S101. It should be noted that the present application can be applied to scenarios with high real-time requirements for escape detection.
  • nsproxy stores a set of pointers to various types of namespaces, acting as a proxy for processes to access various namespaces. Since multiple processes may have exactly the same namespace, nsproxy can be shared between processes; init_nsproxy stores pointers to the initial namespace objects of each subsystem and has higher permissions. The process can access all namespaces except the mounted file system fs. To ensure the security of the namespace, the embodiment of the present application can execute step S102 to continue to detect the process.
  • S102 The host machine detects whether the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located.
  • the host machine can obtain the address space pid_cachep of the process; then, based on pid_cachep, determine the address slab_cache of the cache area corresponding to the process; when slab_cache is equal to the address pid_cache in the namespace where the process is located, determine that the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located.
  • the host executes step S104 when the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located; and executes step S103 when the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located.
  • S103 The host machine prompts an alarm message, where the alarm message is used to indicate that a container escape has occurred in the process.
  • the host machine may display a warning message on the screen, where the warning message is used to indicate that a container escape has occurred in the current process.
  • the warning message may include information about the current process.
  • the host machine may also use other methods to prompt warning information, such as recording the escape behavior of the current process through the kernel dmesg, or restarting the host machine.
  • the embodiment of the present application does not limit the method of prompting warning information by the host machine.
  • S104 The host machine ends the detection.
  • the host machine ends the detection when it determines that the namespace where the process is located is not the initial namespace. It should be noted that if the nsproxy of the process is init_nsproxy, the process can access all namespaces except the mounted file system fs. To ensure the security of the namespace, the embodiment of the present application needs to detect the process; if the nsproxy of the process is not init_nsproxy, the host machine can end the detection of the process.
  • the host machine ends the detection when it determines that the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located. It should be noted that the address of the cache area to which the process belongs has a high credibility. When the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located, it can be proved that the process does have the permission to access the namespace where the process is located, so the detection of the process can be ended.
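  • Steps S101 to S104 can be sketched as follows. The model is hypothetical throughout: `Pid.slab_cache` stands for the cache the process's pid object was actually allocated from, `PidNamespace.pid_cachep` stands for the cache address recorded in the namespace, and `detect_fig6` is an illustrative name. The sketch assumes that an attacker can overwrite the nsproxy pointer of a task but cannot change which cache its pid object came from.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class KmemCache:
    name: str


@dataclass(eq=False)
class PidNamespace:
    pid_cachep: KmemCache     # cache address recorded in the namespace


@dataclass(eq=False)
class NsProxy:
    pid_ns: PidNamespace


@dataclass(eq=False)
class Pid:
    slab_cache: KmemCache     # cache this pid object was really allocated from


@dataclass(eq=False)
class Task:
    nsproxy: NsProxy
    pid: Pid


def detect_fig6(task: Task, init_nsproxy: NsProxy) -> str:
    # S101: is the process in the initial namespace?
    if task.nsproxy is not init_nsproxy:
        return "end"          # S104
    # S102: does the cache the process belongs to match the namespace's cache?
    if task.pid.slab_cache is task.nsproxy.pid_ns.pid_cachep:
        return "end"          # S104
    return "alarm"            # S103


init_nsproxy = NsProxy(PidNamespace(KmemCache("pid_host")))
container_nsproxy = NsProxy(PidNamespace(KmemCache("pid_container")))

host_task = Task(init_nsproxy, Pid(init_nsproxy.pid_ns.pid_cachep))
container_task = Task(container_nsproxy,
                      Pid(container_nsproxy.pid_ns.pid_cachep))

# An escaping task: allocated inside the container, nsproxy overwritten later.
escaped_task = Task(container_nsproxy,
                    Pid(container_nsproxy.pid_ns.pid_cachep))
escaped_task.nsproxy = init_nsproxy
```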
  • Figure 7 is a flow chart of another method for detecting container escape provided by an embodiment of the present application.
  • the method may include some or all of the following steps:
  • S201 The host machine obtains the data structure of the process.
  • the data structure may include an identifier nsproxy of the namespace where the process is located, a user identifier UID or GID, and a level of the namespace where the process is located.
  • the host machine can detect whether the current process has escaped when the system calls the current process, or when the current process performs operations related to process scheduling, process exit, or namespace copying; that is, it starts executing step S201. It should be noted that the present application can be applied to scenarios with high real-time requirements for escape detection.
  • task_struct is the process control block PCB under Linux, which contains all the information of a process, including the UID of the process.
  • the Linux process control block is a data structure defined by the task_struct structure, which includes various information needed to manage the process, such as the nsproxy mentioned above.
  • S202 The host machine detects whether the namespace where the process is located is an initial namespace.
  • S203 The host machine detects whether the user identifier is the root user's identifier.
  • the user identifier may be at least one of a user identifier UID and a user group identifier GID; the root user identifier may be zero.
  • when it is determined that the user identifier is the identifier of the root user, the host executes step S204; when it is determined that the user identifier is not an identifier of the root user, the host executes step S207.
  • S204 The host machine detects whether the level of the namespace where the process is located is zero.
  • the host machine detects whether the process level is zero; when the level of the namespace where the process is located is zero, step S205 is executed; when it is determined that the level of the namespace where the process is located is not zero, step S206 is executed.
  • the level of the namespace where the process is located represents the level of the current namespace.
  • the initial namespace has a level of 0, and its sub-namespace has a level of 1, and they increase in sequence.
  • the sub-namespace is visible to the parent namespace. From a given level setting, the kernel can infer how many IDs the process will be associated with.
  • S205 The host machine detects whether the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located.
  • page is the virtual address space of the object
  • slab_cache is the slab manager pointed to by the current page, and subsequent memory allocations are all based on this cache
  • pid_cachep is the address of the slab that points to the allocated pid.
  • when the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located, the host executes step S207; when the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, the host executes step S206.
  • S205 is a detection for slab_cache.
  • the principle is: (1) the no-merge attribute is set when the namespace slab_cache is created, so caches of similar sizes are not allowed to merge; (2) when the process pid_cachep is allocated, a new slab cache is directly created (the pid_cachep of each process is different).
  • merge is a feature of the slab allocator.
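  • Why the no-merge attribute matters for the address comparison can be shown with a toy allocator. `SlabAllocator`, `create_cache`, and the `no_merge` parameter are hypothetical names, and merging is reduced here to "reuse an existing mergeable cache with the same object size", which is a simplification of the real merge criteria.

```python
from dataclasses import dataclass
from typing import List


@dataclass(eq=False)
class KmemCache:
    name: str
    object_size: int
    no_merge: bool


class SlabAllocator:
    def __init__(self) -> None:
        self.caches: List[KmemCache] = []

    def create_cache(self, name: str, object_size: int,
                     no_merge: bool = False) -> KmemCache:
        if not no_merge:
            # Merging: an existing mergeable cache of the same object size is
            # reused, so two logical caches end up at one address.
            for cache in self.caches:
                if not cache.no_merge and cache.object_size == object_size:
                    return cache
        cache = KmemCache(name, object_size, no_merge)
        self.caches.append(cache)
        return cache


allocator = SlabAllocator()

# With merging allowed, the two caches are indistinguishable by address.
merged_host = allocator.create_cache("pid_host", 128)
merged_container = allocator.create_cache("pid_container", 128)

# With no_merge set, each namespace keeps a cache of its own.
own_host = allocator.create_cache("pid_host_nomerge", 128, no_merge=True)
own_container = allocator.create_cache("pid_container_nomerge", 128,
                                       no_merge=True)
```

  • Only in the second case does comparing cache addresses say anything about which namespace an object was allocated for.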
  • S206 The host machine prompts an alarm message, where the alarm message is used to indicate that a container escape has occurred in the process.
  • the host machine may display a warning message on the screen, where the warning message is used to indicate that a container escape has occurred in the current process.
  • the warning message may include information about the current process.
  • when the host machine determines that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, it determines that the process has escaped and then prompts an alarm message.
  • the host machine may also use other methods to prompt warning information, such as recording the escape behavior of the current process through the kernel dmesg, or restarting the host machine.
  • the embodiment of the present application does not limit the method of prompting warning information by the host machine.
  • S207 The host machine ends the detection.
  • the host machine ends the detection when it determines that the namespace where the process is located is not the initial namespace. It should be noted that if the nsproxy of the process is init_nsproxy, the process can access all namespaces except the mounted file system fs. To ensure the security of the namespace, the embodiment of the present application needs to detect the process; if the nsproxy of the process is not init_nsproxy, the host machine can end the detection of the process.
  • the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. It should be noted that when the user identifier of the process is not the identifier of the root user, the process has low authority and the possibility of container escape is low, so the detection is ended.
  • the host machine ends the detection when it determines that the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located. It should be noted that the address of the cache area to which the process belongs has a high credibility. When the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located, it can be proved that the process does have the permission to access the namespace where the process is located, so the detection of the process can be ended.
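  • The full Figure 7 flow, S201 to S207, can be sketched as a function that returns the steps it visits. The model is hypothetical: `Task.slab_cache` stands for the cache the process's pid object was really allocated from, `NsProxy.pid_cachep` for the cache address recorded in the namespace, and "root user" is read as "UID or GID equal to 0".

```python
from dataclasses import dataclass
from typing import List


@dataclass(eq=False)
class NsProxy:
    pid_cachep: object        # cache address recorded in the namespace


@dataclass(eq=False)
class Task:
    nsproxy: NsProxy
    uid: int
    gid: int
    level: int
    slab_cache: object        # cache the pid object was really allocated from


def detect_fig7(task: Task, init_nsproxy: NsProxy) -> List[str]:
    """Return the visited steps; the last one is S206 (alarm) or S207 (end)."""
    trace = ["S201", "S202"]                  # read the data structure, then
    if task.nsproxy is not init_nsproxy:      # check the namespace
        return trace + ["S207"]
    trace.append("S203")                      # root user?
    if task.uid != 0 and task.gid != 0:
        return trace + ["S207"]
    trace.append("S204")                      # level zero?
    if task.level != 0:
        return trace + ["S206"]
    trace.append("S205")                      # cache address comparison
    if task.slab_cache is task.nsproxy.pid_cachep:
        return trace + ["S207"]
    return trace + ["S206"]


host_cache, container_cache = object(), object()
init_nsproxy = NsProxy(host_cache)
container_nsproxy = NsProxy(container_cache)

host_root = Task(init_nsproxy, 0, 0, 0, host_cache)
fully_forged = Task(init_nsproxy, 0, 0, 0, container_cache)
partly_forged = Task(init_nsproxy, 0, 0, 1, container_cache)
```

  • The fully forged task, which passes every field-based check, is stopped only at S205; the partly forged one is already stopped at S204.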
  • Figure 8A is a flow chart of a method for detecting container escape provided by an embodiment of the present application.
  • the method may include some or all of the following steps:
  • S301 The host machine detects whether the mount point of the process is a root directory.
  • the host machine can detect whether the current process has escaped when the system calls the current process, or when the current process performs operations related to process scheduling, process exit, or namespace copying; that is, it starts to execute step S301. It should be noted that the present application can be applied to scenarios with high real-time requirements for escape detection.
  • the mount point is an operation directory. It should be noted that the mount operation hides the files already present in the target Linux directory, so rather than choosing an existing, populated Linux directory, it is best to create an empty operation directory for mounting. After mounting, this operation directory is called a mount point.
  • the mount point identifier is used to indicate the file system fs that the process can operate, including reading and writing files.
  • S301 is a bypass detection for init_fs, that is, adding an fs-based detection mechanism in the kernel code to detect the escape behavior of escaping to the root directory.
  • S302 The host machine detects whether the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located.
  • the host machine can obtain the address space pid_cachep of the process; then, based on pid_cachep, determine the address slab_cache of the cache area corresponding to the process; when slab_cache is equal to the address pid_cache in the namespace where the process is located, determine that the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located.
  • the host executes step S304 when the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located; and executes step S303 when the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located.
  • S303 The host machine prompts an alarm message, where the alarm message is used to indicate that a container escape has occurred in the process.
  • the host machine may display a warning message on the screen, where the warning message is used to indicate that a container escape has occurred in the current process.
  • the warning message may include information about the current process.
  • the host machine may also use other methods to prompt warning information, such as recording the escape behavior of the current process through the kernel dmesg, or restarting the host machine.
  • the embodiment of the present application does not limit the method of prompting warning information by the host machine.
  • S304 The host machine ends the detection.
  • the host machine ends the detection when it determines that the mount point of the process is not the root directory. It should be noted that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs. To ensure the security of all files of the mounted file system fs, the embodiment of the present application needs to detect the process; if the mount point of the process is not the root directory, the host machine can end the detection of the process.
  • the host machine ends the detection when it determines that the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located. It should be noted that the address of the cache area to which the process belongs has a high credibility. When the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located, it can be proved that the process does have the permission to access the namespace where the process is located, so the detection of the process can be ended.
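  • Steps S301 to S304 can be sketched in the same style. The names are hypothetical, and one point is an explicit assumption of this sketch: the cache address the process is compared against is passed in as `reference_cache`, because which namespace object supplies that address for a process whose mount point is the root directory is not modelled here.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Task:
    fs: object            # mount point identifier of the process
    slab_cache: object    # cache the process's pid object was allocated from


def detect_fig8a(task: Task, init_fs: object, reference_cache: object) -> str:
    # S301: is the mount point of the process the root directory?
    if task.fs is not init_fs:
        return "end"      # S304
    # S302: compare the process's cache address with the reference address.
    if task.slab_cache is reference_cache:
        return "end"      # S304
    return "alarm"        # S303


init_fs, container_fs = object(), object()
host_cache, container_cache = object(), object()

host_task = Task(init_fs, host_cache)
container_task = Task(container_fs, container_cache)

# An escaping task: created in the container, fs later redirected to the root.
escaped_task = Task(container_fs, container_cache)
escaped_task.fs = init_fs
```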
  • The method embodiment shown in FIG. 8A above includes many possible implementation schemes. Some of the implementation schemes are illustrated below in conjunction with FIG. 8B. It should be noted that related concepts, operations or logical relationships not explained in FIG. 8B can refer to the corresponding descriptions in the embodiment shown in FIG. 8A.
  • Figure 8B is a flow chart of another method for detecting container escape provided by an embodiment of the present application.
  • the method may include some or all of the following steps:
  • S401 The host machine obtains the data structure of the process.
  • the data structure may include a mount point identifier of the process, a user identifier UID or GID, and a level of the namespace where the process is located.
  • the host machine can detect whether the current process has escaped when the system calls the current process, or when the current process performs operations related to process scheduling, process exit, or namespace copying; that is, it starts to execute step S401. It should be noted that the present application can be applied to scenarios with high real-time requirements for escape detection.
  • S402 The host machine detects whether the mount point of the process is a root directory.
  • S403 The host machine detects whether the user identifier is the root user identifier.
  • the host machine checks whether the user identifier is the root user's identifier, that is, checks the UID/GID of the process to see whether the current process is a ROOT process.
  • the user identifier may be at least one of a user identifier UID and a user group identifier GID; the root user identifier may be zero.
  • when it is determined that the user identifier is the identifier of the root user, the host executes step S404; when it is determined that the user identifier is not the identifier of the root user, the host executes step S407.
  • S404 The host machine detects whether the level of the namespace where the process is located is zero.
  • the host machine detects whether the level of the namespace where the process is located is zero, that is, checks whether the nsproxy level of the process is the level (0) of init_nsproxy; when the level of the namespace where the process is located is zero, step S405 is executed; when it is determined that the level of the namespace where the process is located is not zero, step S406 is executed.
  • S405 The host machine detects whether the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located.
  • page is the virtual address space of the object
  • slab_cache is the slab manager pointed to by the current page, and subsequent memory allocations are all based on this cache
  • pid_cachep is the address of the slab that points to the allocated pid.
  • when the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located, the host executes step S407; when the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, the host executes step S406.
  • S405 is a detection for slab_cache.
  • the principle is: (1) the no-merge attribute is set when the namespace slab_cache is created, so caches of similar sizes are not allowed to merge; (2) when the pid_cachep of a process is allocated, a new slab cache is created directly, so the pid_cachep of each process is different.
  • merge is a feature of the slab allocator.
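  • The S405 comparison can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the classes `SlabCache`, `PidNamespace`, `Page` and `Pid`, and the functions `allocate_pid` and `cache_matches_namespace`, are hypothetical stand-ins for the objects the text calls slab_cache, pid_cachep and page, not the kernel's actual definitions. Object identity plays the role of an address, and each namespace is assumed to own its own unmerged cache, as described above.

```python
from dataclasses import dataclass


class SlabCache:
    """Hypothetical stand-in for a slab cache; object identity acts as its address."""

    def __init__(self, name: str):
        self.name = name


@dataclass
class PidNamespace:
    level: int
    pid_cachep: SlabCache  # cache created for this namespace and never merged


@dataclass
class Page:
    slab_cache: SlabCache  # cache that actually backs the page holding the pid object


@dataclass
class Pid:
    page: Page        # page on which the pid object was allocated
    ns: PidNamespace  # namespace the process currently claims to be in


def allocate_pid(ns: PidNamespace) -> Pid:
    """Allocate a pid from the namespace's own cache, as a legitimate process would."""
    return Pid(page=Page(slab_cache=ns.pid_cachep), ns=ns)


def cache_matches_namespace(pid: Pid) -> bool:
    """S405: compare the cache backing the pid with the cache recorded in its namespace."""
    return pid.page.slab_cache is pid.ns.pid_cachep


init_ns = PidNamespace(level=0, pid_cachep=SlabCache("pid_init"))
container_ns = PidNamespace(level=1, pid_cachep=SlabCache("pid_container"))

host_pid = allocate_pid(init_ns)
escaped_pid = allocate_pid(container_ns)
escaped_pid.ns = init_ns  # simulate tampering: the namespace reference is overwritten
```

  • In this model the comparison holds for `host_pid` but fails for `escaped_pid`, because the tampered process still has memory that came from the container's cache.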
  • S406 The host prompts an alarm message, where the alarm message is used to indicate that a container escape has occurred in the process.
  • the host machine may display a warning message on the screen, where the warning message is used to indicate that a container escape has occurred in the current process.
  • the warning message may include information about the current process.
  • the host can also use other methods to prompt warning information, such as recording the escape behavior of the current process through the kernel dmesg.
  • the embodiment of the present application does not limit the manner in which the host prompts the warning information.
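  • As an illustration of S406, the sketch below builds a warning message that carries information about the current process and records it in a log. It is a user-space analogue only: Python's standard `logging` module stands in for the kernel dmesg mentioned above, and the logger name, the message layout and the functions `format_alarm` and `prompt_alarm` are assumptions made for this example.

```python
import logging

logger = logging.getLogger("container_escape")  # hypothetical logger name


def format_alarm(pid: int, comm: str, reason: str) -> str:
    """Build a warning message that includes information about the current process."""
    return f"container escape detected: pid={pid} comm={comm} reason={reason}"


def prompt_alarm(pid: int, comm: str, reason: str) -> str:
    """Record the alarm in a log (the role dmesg plays in the text) and return it."""
    message = format_alarm(pid, comm, reason)
    logger.warning(message)
    return message
```

  • A caller could invoke `prompt_alarm(1234, "sh", "slab_cache mismatch")` at the point where the address comparison fails.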
  • the host machine ends the detection when it determines that the mount point of the process is not the root directory. It should be noted that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs. To ensure the security of all files of the mounted file system fs, the embodiment of the present application needs to detect the process; if the mount point of the process is not the root directory, the host machine can end the detection of the process.
  • the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. It should be noted that when the user identifier of the process is not the identifier of the root user, the process has low authority and the possibility of container escape is low, so the detection is ended.
  • the host machine ends the detection when it determines that the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located. It should be noted that the address of the cache area to which the process belongs has a high credibility. When the address of the cache area to which the process belongs is equal to the address in the namespace where the process is located, it can be proved that the process does have the permission to access the namespace where the process is located, so the detection of the process can be ended.
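  • The branching of steps S401 to S407 can be summarized as a decision function. This is a minimal sketch, not the claimed implementation: `ProcessInfo`, `Result` and `detect_fig_8b` are hypothetical names, only the UID is checked (the text allows UID, GID or both), and S407 is taken to be the step that ends detection, which the text implies but does not list explicitly.

```python
from dataclasses import dataclass
from enum import Enum


class Result(Enum):
    ALARM = "S406: prompt alarm"
    END = "S407: end detection"


@dataclass
class ProcessInfo:
    """Hypothetical summary of the fields read from the process data structure (S401)."""
    mount_is_root: bool     # the mount point is the root directory
    uid: int                # user identifier; zero means root
    ns_level: int           # level of the namespace where the process is located
    cache_matches_ns: bool  # outcome of the S405 address comparison


def detect_fig_8b(p: ProcessInfo) -> Result:
    if not p.mount_is_root:   # S402: mount point is not the root directory
        return Result.END
    if p.uid != 0:            # S403: not the root user
        return Result.END
    if p.ns_level != 0:       # S404: namespace level is not zero
        return Result.ALARM
    if p.cache_matches_ns:    # S405: cache address equals the namespace address
        return Result.END
    return Result.ALARM
```

  • For example, a root process whose mount point is the root directory, whose namespace level is zero, and whose cache address does not match yields `Result.ALARM`.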
  • the host machine can execute the methods of Figures 7 and 8A at the same time, that is, after obtaining the data structure of the process, the host machine can execute steps S202 and S402, and when the process satisfies either condition, namely that the namespace of the process is the initial namespace or that the mount point of the process is the root directory, execute steps S203 to S207 or S403 to S407.
  • the above method includes detection mechanisms for namespace escapes to fs and for slab_cache, and can detect escape behavior in which a process obtains all namespaces by modifying its namespace.
  • Figure 9A is a flow chart of a method for detecting container escape provided by an embodiment of the present application.
  • the method may include some or all of the following steps:
  • S501 The host machine detects whether the mount point of the process is a root directory.
  • the host machine can detect whether the current process has escaped, that is, start to execute step S501, when the system calls the current process, or when the current process performs process scheduling, process exit or namespace copy related operations. It should be noted that the present application can be applied to scenarios that require highly real-time escape detection.
  • the mount point is an operation directory. It should be noted that the mount operation will hide the files in the original Linux directory, so it is best to select the Linux directory itself and create an empty operation directory for mounting. After mounting, this operation directory is called a mount point.
  • the mount point identifier is used to indicate the file system fs that the process can operate, including reading and writing files.
  • S501 is a bypass detection for init_fs, that is, adding an fs-based detection mechanism in the kernel code to detect the escape behavior of escaping to the root directory.
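  • The idea behind S501, checking whether a process's root is the host root directory, can be approximated from user space. The sketch below is an analogue under stated assumptions, not the in-kernel fs-based mechanism described above: it relies on the Linux procfs link `/proc/<pid>/root`, and the helper names `same_directory` and `process_root_is_host_root` are hypothetical.

```python
import os


def same_directory(path_a: str, path_b: str) -> bool:
    """Return True when both paths resolve to the same directory (same device and inode)."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def process_root_is_host_root(pid: int) -> bool:
    """Check whether the root directory of process `pid` is the caller's root directory.

    Reads /proc/<pid>/root, which on Linux links to that process's root directory;
    inspecting a process owned by another user typically needs elevated privileges.
    """
    return same_directory(f"/proc/{pid}/root", "/")
```

  • Run on the host, a containerized process would normally be expected to return False here, so a True result for such a process would be one signal worth checking further.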
  • S502 The host machine detects whether the target data in the data structure of the process meets the preset conditions.
  • the host machine may obtain target data from the data structure of the process; and then detect whether the target data meets a preset condition.
  • the target data is the level of the namespace where the process is located; the host determines whether the level of the namespace where the process is located is zero; when it is determined that the level of the namespace where the process is located is zero, it is determined that the target data in the data structure of the process meets the preset conditions, and step S503 is executed. It should be noted that since the level of the namespace where the process is located is zero, it proves that the process has a high authority, so an alarm is issued.
  • the target data is the user ID of the process; the host machine can detect whether the user ID of the process is the ID of the root user; when it is determined that the user ID of the process is the ID of the root user, it is determined that the target data in the data structure of the process meets the preset conditions, and step S503 is executed. It should be noted that since the user ID of the process is the ID of the root user, it proves that the process has higher authority, so an alarm is issued.
  • the user identifier may be at least one of a user identifier UID and a user group identifier GID; the root user identifier may be zero.
  • when the host machine determines that the target data in the data structure of the process meets the preset condition, step S504 is executed; otherwise, step S503 is executed.
  • S503 The host machine prompts an alarm message, where the alarm message is used to indicate that a container escape has occurred in the process.
  • the host machine may display a warning message on the screen, where the warning message is used to indicate that a container escape has occurred in the current process.
  • the warning message may include information about the current process.
  • the host machine may also use other methods to prompt warning information, such as recording the escape behavior of the current process through the kernel dmesg, or restarting the host machine.
  • the embodiment of the present application does not limit the method of prompting warning information by the host machine.
  • the host machine ends the detection when it determines that the mount point of the process is not the root directory. It should be noted that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs. To ensure the security of all files of the mounted file system fs, the embodiment of the present application needs to detect the process; if the mount point of the process is not the root directory, the host machine can end the detection of the process.
  • the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. It should be noted that when the user identifier of the process is not the identifier of the root user, the process has low authority and the possibility of container escape is low, so the detection is ended.
  • The method embodiment shown in FIG. 9A above includes many possible implementation schemes. Some of the implementation schemes are illustrated below in conjunction with FIG. 9B. It should be noted that related concepts, operations or logical relationships not explained in FIG. 9B can refer to the corresponding descriptions in the embodiment shown in FIG. 9A.
  • Figure 9B is a flow chart of another method for detecting container escape provided by an embodiment of the present application.
  • the method may include some or all of the following steps:
  • S601 The host machine obtains the data structure of the process.
  • the data structure may include a mount point identifier of the process, a user identifier UID or GID, and a level of the namespace where the process is located.
  • the host machine can detect whether the current process has escaped, that is, start to execute step S601, when the system calls the current process, or when the current process performs process scheduling, process exit or namespace copy related operations. It should be noted that the present application can be applied to scenarios that require highly real-time escape detection.
  • S602 The host machine detects whether the mount point of the process is a root directory.
  • S603 The host machine detects whether the user identifier is the root user's identifier.
  • the user identifier may be at least one of a user identifier UID and a user group identifier GID; the root user identifier may be zero.
  • when it is determined that the user identifier is the identifier of the root user, the host executes step S604; when it is determined that the user identifier is not the identifier of the root user, the host executes step S606.
  • S604 The host machine detects whether the level of the namespace where the process is located is zero.
  • the host machine detects whether the level of the namespace where the process is located is zero; when the level of the namespace where the process is located is not zero, step S605 is executed; when it is determined that the level of the namespace where the process is located is zero, step S606 is executed.
  • S605 The host prompts an alarm message, where the alarm message is used to indicate that a container escape has occurred in the process.
  • the host machine may display a warning message on the screen, where the warning message is used to indicate that a container escape has occurred in the current process.
  • the warning message may include information about the current process.
  • the host machine may also use other methods to prompt warning information, such as recording the escape behavior of the current process through the kernel dmesg, or restarting the host machine.
  • the embodiment of the present application does not limit the method of prompting warning information by the host machine.
  • S606 The host machine ends the detection.
  • the host machine ends the detection when it determines that the mount point of the process is not the root directory. It should be noted that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs. To ensure the security of all files of the mounted file system fs, the embodiment of the present application needs to detect the process; if the mount point of the process is not the root directory, the host machine can end the detection of the process.
  • the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. It should be noted that when the user identifier of the process is not the identifier of the root user, the process has low authority and the possibility of container escape is low, so the detection is ended.
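  • The flow of FIG. 9B, as described in steps S601 to S606, can be traced with a short function that returns the steps visited. This is a sketch under stated assumptions: the function name `detect_fig_9b` and its parameters are hypothetical, and only the UID is checked although the text also allows the GID.

```python
from typing import List


def detect_fig_9b(mount_is_root: bool, uid: int, ns_level: int) -> List[str]:
    """Return the steps visited, ending in S605 (alarm) or S606 (end detection)."""
    steps = ["S601", "S602"]            # obtain data structure, check mount point
    if not mount_is_root:
        return steps + ["S606"]         # mount point is not the root directory
    steps.append("S603")                # check the user identifier
    if uid != 0:
        return steps + ["S606"]         # not the root user
    steps.append("S604")                # check the namespace level
    return steps + (["S605"] if ns_level != 0 else ["S606"])
```

  • For a root process whose mount point is the root directory and whose namespace level is 1, the trace ends in S605, the alarm step.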
  • FIG. 10 is a schematic diagram of the structure of a container escape detection device 100 provided in an embodiment of the present application. It may include an acquisition unit 1001 and a prompt unit 1002, and may also include a determination unit 1003.
  • the container escape detection device 100 is used to implement the above-mentioned container escape detection method, such as the container escape detection method of any embodiment shown in FIG. 6 or FIG. 7.
  • the container escape detection device 100 comprises:
  • the acquisition unit 1001 is used to acquire the address of the cache area to which the process belongs when it is determined that the namespace where the process is located is the initial namespace;
  • the prompt unit 1002 is used to prompt an alarm message when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, and the alarm message is used to indicate that a container escape has occurred in the process.
  • the namespace where the process is located is the initial namespace, which means that the process can access the namespace indicated by init_nsproxy. To ensure the security of the namespace, the host machine can detect the process based on the address slab_cache of the cache area to which the process belongs. Since slab_cache has a high credibility, slab_cache is compared with the address pid_cache in the namespace where the process is located; when the two are not equal, it can be determined that the process does not have the authority to access the namespace indicated by init_nsproxy, that is, the process has been maliciously tampered with and a container escape has occurred.
  • This method has high accuracy and can effectively detect container escapes.
  • the embodiment of the present application adds a mount point-based detection mechanism in the kernel code, which can detect the behavior of escaping to the root directory and effectively monitor the container escape behavior that occurs by exploiting kernel vulnerabilities, thereby improving system security.
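  • the acquiring unit 1001 is used to: obtain the address of the cache area corresponding to the process based on the address space of the process.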
  • the address space pid_cachep of the process is not easily tampered with and has a high credibility. Therefore, the address slab_cache of the cache area corresponding to the process determined based on the address space pid_cachep of the process has a high credibility.
  • An acquisition unit 1001 is used to acquire a user identifier of the process when it is determined that the namespace where the process is located is an initial namespace;
  • the acquisition unit 1001 is used to acquire the address of the cache area to which the process belongs when it is determined that the user identifier is the identifier of the root user.
  • the user identifier is at least one of a user identifier UID and a user group identifier GID; the identifier of the root user is zero.
  • the acquiring unit 1001 is configured to:
  • the level of the namespace where the process is located is obtained
  • the device further includes a determining unit 1003,
  • the acquisition unit 1001 is used to acquire the data structure of the process; the data structure includes the identifier of the namespace where the process is located;
  • the determining unit 1003 is configured to determine that the namespace where the process is located is the initial namespace when the identifier nsproxy of the namespace where the process is located is equal to the identifier init_nsproxy of the initial namespace.
  • each unit may also correspond to the corresponding description of the embodiment shown in FIG. 6 or FIG. 7.
  • each unit corresponds to its own program code (or program instruction), and when the program codes corresponding to each of these units are run on the processor, the corresponding process of the unit is implemented to achieve the corresponding function.
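  • The division of device 100 into units can be pictured as one class whose methods play the roles of the determining, acquisition and prompt units. This is a sketch under stated assumptions: `Task` and `EscapeDetector` are hypothetical names, the fields follow the identifiers used in the text (nsproxy, slab_cache, pid_cache), and Python object identity stands in for kernel addresses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    """Hypothetical process record; object identity stands in for kernel addresses."""
    name: str
    nsproxy: object     # identifier of the namespace where the process is located
    slab_cache: object  # address of the cache area to which the process belongs
    pid_cache: object   # address recorded in the namespace where the process is located


class EscapeDetector:
    """Sketch of device 100 with its units modelled as methods."""

    def __init__(self, init_nsproxy: object):
        self.init_nsproxy = init_nsproxy
        self.alarms: List[str] = []

    def in_initial_namespace(self, task: Task) -> bool:
        """Determining unit 1003: nsproxy equals init_nsproxy."""
        return task.nsproxy is self.init_nsproxy

    def acquire_cache_address(self, task: Task) -> Optional[object]:
        """Acquisition unit 1001: fetch the cache address only for the initial namespace."""
        return task.slab_cache if self.in_initial_namespace(task) else None

    def check(self, task: Task) -> bool:
        """Prompt unit 1002: record an alarm and return True when the addresses differ."""
        cache = self.acquire_cache_address(task)
        if cache is None or cache is task.pid_cache:
            return False
        self.alarms.append(f"container escape suspected in process {task.name}")
        return True


init_nsproxy, host_cache, container_cache = object(), object(), object()
detector = EscapeDetector(init_nsproxy)
ok = Task("systemd", init_nsproxy, host_cache, host_cache)
bad = Task("sh", init_nsproxy, container_cache, host_cache)
```

  • Calling `detector.check(ok)` records nothing, while `detector.check(bad)` appends one alarm, because the process claims the initial namespace but its cache address does not match.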
  • FIG 11 is a schematic diagram of the structure of a container escape detection device 110 provided in an embodiment of the present application.
  • the device 110 may include an acquisition unit 1101 and a prompt unit 1102, and may also include a determination unit 1103.
  • the container escape detection device 110 is used to implement the aforementioned container escape detection method, such as the container escape detection method of any one of the embodiments shown in Figure 8A or Figure 8B.
  • the container escape detection device 110 comprises:
  • the acquisition unit 1101 is used to acquire the address of the cache area to which the process belongs when it is determined that the mount point of the process is the root directory;
  • the prompt unit 1102 is used to prompt an alarm message when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, and the alarm message is used to indicate that a container escape has occurred in the process.
  • the mount point of the process is the root directory, which means that the process can access all files under the root directory.
  • the process needs to be detected; therefore, the host machine can detect the process based on the address slab_cache of the cache area to which the process belongs; since the address of the cache area to which the process belongs is more credible, the address of the cache area to which the process belongs is compared with the address pid_cache in the namespace where the process is located. When the two are not equal, it is considered that the process originally did not have the authority to access all files under the root directory, and the process has been maliciously tampered with, resulting in a container escape.
  • the embodiment of the present application adds a mount point-based detection mechanism in the kernel code, which can detect the behavior of escaping to the root directory and effectively monitor the container escape behavior that occurs by exploiting kernel vulnerabilities, thereby improving system security.
  • the acquiring unit 1101 is used to:
  • the acquiring unit 1101 is used to:
  • the user identifier is at least one of a user identifier UID and a user group identifier GID; the identifier of the root user is zero.
  • the acquiring unit 1101 is used to:
  • the level of the namespace where the process is located is obtained
  • the device further includes a determining unit 1103,
  • the acquisition unit 1101 is used to acquire the data structure of the process; the data structure includes the mount point identifier of the process;
  • the determining unit 1103 is configured to determine that the mount point of the process is a root directory when the mount point identifier is a root directory identifier.
  • each unit may also correspond to the corresponding description of the embodiment shown in FIG. 8A or FIG. 8B.
  • each unit corresponds to its own program code (or program instruction), and when the program codes corresponding to each of these units are run on the processor, the corresponding process of the unit is implemented to achieve the corresponding function.
  • FIG 12 is a schematic diagram of the structure of another container escape detection device 120 provided in an embodiment of the present application.
  • the device 120 includes at least one processor 1201, at least one memory 1202, and at least one communication interface 1203.
  • the processor 1201, the memory 1202, and the communication interface 1203 are connected through the communication bus and communicate with each other.
  • Processor 1201 can be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the above program.
  • Communication interface 1203 is used to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), a wireless local area network (WLAN), etc.
  • the memory 1202 may be a read-only memory (ROM) or other types of static storage devices that can store static information and instructions, a random access memory (RAM) or other types of dynamic storage devices that can store information and instructions, or an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store the desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto.
  • the memory may exist independently and be connected to the processor through a bus. The memory may also be integrated with the processor.
  • the memory 1202 is used to store application code for executing the above solution, and the execution is controlled by the processor 1201.
  • the processor 1201 is used to execute the application code stored in the memory 1202.
  • the code stored in the memory 1202 may execute any of the container escape detection methods provided above, such as:
  • when it is determined that the namespace where the process is located is the initial namespace, the address of the cache area to which the process belongs is obtained; when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, an alarm message is prompted, and the alarm message is used to indicate that a container escape has occurred in the process.
  • An embodiment of the present application also provides an electronic device, which includes one or more processors and one or more memories; wherein the one or more memories are coupled to the one or more processors, and the one or more memories are used to store computer program codes, and the computer program codes include computer instructions, and when the one or more processors execute the computer instructions, the electronic device executes the method described in the above embodiment.
  • the present application also provides a computer program product including instructions.
  • when the computer program product is run on an electronic device, the electronic device is enabled to execute the method described in the above embodiment.
  • An embodiment of the present application further provides a computer storage medium, wherein the computer storage medium may store a program, and when the program is executed, part or all of the steps of any one of the container escape detection methods recorded in the above method embodiments are performed.
  • the disclosed device can be implemented in other ways.
  • the device embodiments described above are only schematic, such as the division of the units, which is only a logical function division. There may be other division methods in actual implementation, such as multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • Another point is that the mutual coupling or direct coupling or communication connection shown or discussed can be through some interfaces, and the indirect coupling or communication connection of the device or unit can be electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable memory.
  • the technical solution of the present application, or the part that contributes to the prior art, or all or part of the technical solution can be embodied in the form of a software product, which is stored in a memory and includes several instructions for a computer device (which can be a personal computer, server or network device, etc.) to execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned memory includes: USB flash drive, read-only memory (ROM), random access memory (RAM), removable hard disk, magnetic disk, optical disk and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a container escape detection method, an electronic device, and a system. The method comprises: when determining that a namespace where a process is located is an initial namespace, a host machine acquires the address of a cache region to which the process belongs; and when determining that the address of the cache region to which the process belongs is not equivalent to the address in the namespace where the process is located, the host machine prompts alarm information, the alarm information being used for prompting the occurrence of container escape in the process. By implementing embodiments of the present application, the accuracy is high, and container escape can be effectively detected.

Description

Container escape detection method, electronic device and system
This application claims priority to Chinese patent application No. 202211200843.7, entitled "A method, electronic device and system for detecting container escape", filed with the China National Intellectual Property Administration on September 29, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present application relate to virtualization technology, and more particularly to a method, electronic device, and system for detecting container escape.
Background
Container technology packages applications into separate containers. It isolates each application and breaks the dependencies and connections between programs; in other words, with the support of container technology, a large service system can be composed of containers that host many different applications. Container technology effectively divides the resources managed by a single operating system into isolated groups, to better balance conflicting resource usage requirements among those groups. It is an operating-system-level virtualization technology and is widely used because it is lightweight.
Users can create and run containers on a host machine based on container images, where the host machine can be a physical machine or a virtual machine. Each container has an independent process running space; ideally, processes in a container can only run in that container's process running space. However, when a malicious process exists in a container, it is likely to break out of the container's process running space and then attack the host machine or other containers. This phenomenon is called container escape.
Currently, container escapes can be divided into the following four types: (1) container escapes caused by insecure configuration; (2) container escapes caused by insecure mounting; (3) container escapes caused by vulnerabilities in related programs; and (4) container escapes caused by kernel vulnerabilities. How to detect container escapes caused by kernel vulnerabilities is a problem that the industry urgently needs to solve.
Summary of the Invention
The present application provides a method, electronic device and system for detecting container escape. In the method, the host machine can prompt an alarm message when the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located; the alarm message is used to indicate that a container escape has occurred in the process. The method has high accuracy and can effectively detect container escape.
In a first aspect, an embodiment of the present application provides a method for detecting container escape, the method comprising:
When the host machine determines that the namespace where the process is located is the initial namespace, it obtains the address of the cache area to which the process belongs;
When the host machine determines that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, it prompts an alarm message, and the alarm message is used to indicate that a container escape has occurred in the process.
In the embodiment of the present application, the namespace where the process is located is the initial namespace, which means that the process can access the namespace indicated by init_nsproxy. To ensure the security of the namespace, the host machine can detect the process based on the address slab_cache of the cache area to which the process belongs. Since slab_cache has a high credibility, slab_cache is compared with the address pid_cache in the namespace where the process is located; when the two are not equal, it can be determined that the process does not have the authority to access the namespace indicated by init_nsproxy, that is, the process has been maliciously tampered with and a container escape has occurred. This method has high accuracy and can effectively detect container escapes.
With reference to the first aspect, in a possible implementation, the method comprises:
the host machine obtaining, based on the address space of the process, the address of the cache area corresponding to the process.
In this embodiment, the address space pid_cachep of the process is not easily tampered with and is therefore highly trustworthy, so the cache-area address slab_cache determined from pid_cachep is also highly trustworthy.
With reference to the first aspect, in a possible implementation, obtaining the address of the cache area to which the process belongs when the host machine determines that the namespace where the process is located is the initial namespace comprises:
when the host machine determines that the namespace where the process is located is the initial namespace, obtaining the user identifier of the process;
when the host machine determines that the user identifier is the identifier of the root user, obtaining the address of the cache area to which the process belongs.
With reference to the first aspect, in a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
With reference to the first aspect, in a possible implementation, obtaining the address of the cache area to which the process belongs when the host machine determines that the namespace where the process is located is the initial namespace comprises:
when the host machine determines that the namespace where the process is located is the initial namespace, obtaining the level of the namespace where the process is located;
when the host machine determines that the level is zero, obtaining the address of the cache area to which the process belongs.
With reference to the first aspect, in a possible implementation, before the address of the cache area to which the process belongs is obtained, the method further comprises:
the host machine obtaining the data structure of the process, the data structure including the identifier of the namespace where the process is located;
when the identifier nsproxy of the namespace where the process is located is equal to the identifier init_nsproxy of the initial namespace, the host machine determining that the namespace where the process is located is the initial namespace.
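The preconditions in the implementations above (namespace identity, root user, level zero) can be read as a staged filter that decides whether the cache-area address needs to be fetched at all. The following is a minimal sketch under stated assumptions: the `Task` record, the function `should_fetch_cache_address`, and the parameters `check_ids` and `require_level_zero` are hypothetical, and requiring every selected identifier to be zero is only one reading of "at least one of UID and GID".

```python
from dataclasses import dataclass

ROOT_ID = 0  # the identifier of the root user is zero


class NsProxy:
    """Hypothetical namespace-proxy object; compared by identity, like a pointer."""


INIT_NSPROXY = NsProxy()  # models init_nsproxy


@dataclass
class Task:
    nsproxy: NsProxy
    uid: int
    gid: int
    level: int  # level of the namespace where the process is located


def should_fetch_cache_address(task, check_ids=("uid",), require_level_zero=False):
    """Decide whether the cache-area address of the task should be fetched."""
    # Stage 1: the task's nsproxy must be the initial one.
    if task.nsproxy is not INIT_NSPROXY:
        return False
    # Stage 2 (optional): every selected identifier must equal the root identifier.
    if any(getattr(task, name) != ROOT_ID for name in check_ids):
        return False
    # Stage 3 (optional): the namespace level must be zero.
    if require_level_zero and task.level != 0:
        return False
    return True


root_task = Task(INIT_NSPROXY, uid=0, gid=0, level=0)
print(should_fetch_cache_address(root_task, require_level_zero=True))
```

Filtering before the address comparison keeps the more expensive lookup off the path of processes that cannot satisfy the trigger condition.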
In a second aspect, an embodiment of the present application provides a container escape detection method, the method comprising:
when the host machine determines that the mount point of a process is the root directory, obtaining the address of the cache area to which the process belongs;
when the host machine determines that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, raising an alarm, the alarm indicating that a container escape has occurred for the process.
In this embodiment, a process whose mount point is the root directory can access all files under the root directory, so the process needs to be checked to keep those files secure. The host machine therefore checks the process using slab_cache, the address of the cache area to which the process belongs. Because this address is highly trustworthy, it is compared with pid_cache, the address in the namespace where the process is located. If the two differ, the process is considered not to have originally had permission to access all files under the root directory; the process has been maliciously tampered with and a container escape has occurred.
It should be noted that this embodiment adds a mount-point-based detection mechanism to the kernel code. The mechanism detects escapes to the root directory and effectively monitors container escapes that exploit kernel vulnerabilities, thereby improving system security.
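The mount-point trigger of the second aspect can be sketched in the same spirit. This is a simplified Python model, not the kernel mechanism itself: the `Task` record, the constant `ROOT_MOUNT_ID`, and the function `check_mount_escape` are hypothetical, and the mount point identifier is represented as a plain string.

```python
from dataclasses import dataclass

ROOT_MOUNT_ID = "/"  # hypothetical identifier of the root directory


@dataclass
class Task:
    mount_point_id: str  # mount point identifier from the process data structure
    slab_cache: int      # address of the cache area to which the process belongs
    pid_cache: int       # address in the namespace where the process is located


def check_mount_escape(task: Task) -> bool:
    """Return True when the alarm condition of the second aspect holds."""
    # Trigger: only processes whose mount point is the root directory are inspected.
    if task.mount_point_id != ROOT_MOUNT_ID:
        return False
    # A mismatch between the two addresses is treated as a container escape.
    return task.slab_cache != task.pid_cache


suspect = Task(mount_point_id="/", slab_cache=0x2000, pid_cache=0x1000)
print(check_mount_escape(suspect))
```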
With reference to the second aspect, in a possible implementation, the method comprises:
the host machine obtaining, based on the address space of the process, the address of the cache area corresponding to the process.
With reference to the second aspect, in a possible implementation, obtaining the address of the cache area to which the process belongs when the host machine determines that the namespace where the process is located is the initial namespace comprises:
when the host machine determines that the namespace where the process is located is the initial namespace, obtaining the user identifier of the process;
when the host machine determines that the user identifier of the process is the identifier of the root user, obtaining the address of the cache area to which the process belongs.
With reference to the second aspect, in a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
With reference to the second aspect, in a possible implementation, obtaining the address of the cache area to which the process belongs when the host machine determines that the namespace where the process is located is the initial namespace comprises:
when the host machine determines that the namespace where the process is located is the initial namespace, obtaining the level of the namespace where the process is located;
when the host machine determines that the level is zero, obtaining the address of the cache area to which the process belongs.
With reference to the second aspect, in a possible implementation, before the address of the cache area to which the process belongs is obtained, the method further comprises:
the host machine obtaining the data structure of the process, the data structure including the mount point identifier of the process;
when the mount point identifier is the root directory identifier, the host machine determining that the mount point of the process is the root directory.
In a third aspect, an embodiment of the present application provides a container escape detection apparatus, the apparatus comprising:
an acquisition unit, configured to obtain the address of the cache area to which a process belongs when it is determined that the namespace where the process is located is the initial namespace;
a prompt unit, configured to raise an alarm when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, the alarm indicating that a container escape has occurred for the process.
In this embodiment, a process whose namespace is the initial namespace can access the namespace indicated by init_nsproxy, so the process needs to be checked to keep that namespace secure. The host machine therefore checks the process using slab_cache, the address of the cache area to which the process belongs. Because slab_cache is highly trustworthy, it is compared with pid_cache, the address in the namespace where the process is located. If the two differ, it can be determined that the process has no permission to access the namespace indicated by init_nsproxy; that is, the process has been maliciously tampered with and a container escape has occurred. The method is highly accurate and detects container escape effectively.
It should be noted that this embodiment adds a mount-point-based detection mechanism to the kernel code. The mechanism detects escapes to the root directory and effectively monitors container escapes that exploit kernel vulnerabilities, thereby improving system security.
With reference to the third aspect, in a possible implementation, the acquisition unit is configured to:
obtain, based on the address space of the process, the address of the cache area corresponding to the process.
In this embodiment, the address space pid_cachep of the process is not easily tampered with and is therefore highly trustworthy, so the cache-area address slab_cache determined from pid_cachep is also highly trustworthy.
With reference to the third aspect, in a possible implementation:
the acquisition unit is configured to obtain the user identifier of the process when it is determined that the namespace where the process is located is the initial namespace;
the acquisition unit is configured to obtain the address of the cache area to which the process belongs when it is determined that the user identifier is the identifier of the root user.
With reference to the third aspect, in a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
With reference to the third aspect, in a possible implementation, the acquisition unit is configured to:
obtain the level of the namespace where the process is located when it is determined that the namespace where the process is located is the initial namespace;
obtain the address of the cache area to which the process belongs when it is determined that the level is zero.
With reference to the third aspect, in a possible implementation, the apparatus further comprises a determination unit:
the acquisition unit is configured to obtain the data structure of the process, the data structure including the identifier of the namespace where the process is located;
the determination unit is configured to determine that the namespace where the process is located is the initial namespace when the identifier nsproxy of the namespace where the process is located is equal to the identifier init_nsproxy of the initial namespace.
In a fourth aspect, an embodiment of the present application provides a container escape detection apparatus, the apparatus comprising:
an acquisition unit, configured to obtain the address of the cache area to which a process belongs when it is determined that the mount point of the process is the root directory;
a prompt unit, configured to raise an alarm when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, the alarm indicating that a container escape has occurred for the process.
In this embodiment, when the namespace nsproxy where the process is located is the initial namespace init_nsproxy, the process can access the namespace indicated by init_nsproxy, so the process needs to be checked to keep that namespace secure. The host machine therefore checks the process using slab_cache, the address of the cache area to which the process belongs. Because slab_cache is highly trustworthy, it is compared with pid_cache, the address in the namespace where the process is located. If the two differ, it can be determined that the process has no permission to access the namespace indicated by init_nsproxy; that is, the process has been maliciously tampered with and a container escape has occurred. The method is highly accurate and detects container escape effectively.
With reference to the fourth aspect, in a possible implementation, the acquisition unit is configured to:
obtain, based on the address space of the process, the address of the cache area corresponding to the process.
With reference to the fourth aspect, in a possible implementation, the acquisition unit is configured to:
obtain the user identifier of the process when it is determined that the namespace where the process is located is the initial namespace;
obtain the address of the cache area to which the process belongs when it is determined that the user identifier of the process is the identifier of the root user.
With reference to the fourth aspect, in a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
With reference to the fourth aspect, in a possible implementation, the acquisition unit is configured to:
obtain the level of the namespace where the process is located when it is determined that the namespace where the process is located is the initial namespace;
obtain the address of the cache area to which the process belongs when it is determined that the level is zero.
With reference to the fourth aspect, in a possible implementation, the apparatus further comprises a determination unit:
the acquisition unit is configured to obtain the data structure of the process, the data structure including the mount point identifier of the process;
the determination unit is configured to determine that the mount point of the process is the root directory when the mount point identifier is the root directory identifier.
In a fifth aspect, an embodiment of the present application provides a container escape detection method, the method comprising:
when the host machine determines that the mount point of a process is the root directory, obtaining target data from the data structure of the process;
when the target data meets a preset condition, the host machine raising an alarm, the alarm indicating that a container escape has occurred for the process.
In this embodiment, a mount-point-based detection mechanism is added to the kernel code. The mechanism detects escapes to the root directory and effectively monitors container escapes that exploit kernel vulnerabilities, thereby improving system security.
With reference to the fifth aspect, in a possible implementation, the target data is the level of the namespace where the process is located, and the host machine raising an alarm when the target data meets the preset condition comprises:
the host machine raising the alarm when it determines that the level of the namespace where the process is located is zero.
With reference to the fifth aspect, in a possible implementation, the target data is the user identifier of the process, and the host machine raising an alarm when the target data meets the preset condition comprises:
the host machine raising the alarm when it determines that the user identifier of the process is the identifier of the root user.
With reference to the fifth aspect, in a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
With reference to the fifth aspect, in a possible implementation, the method further comprises:
the host machine obtaining the data structure of the process, the data structure including the mount point identifier of the process;
when the mount point identifier is the root directory identifier, the host machine determining that the mount point of the process is the root directory.
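The fifth aspect separates a trigger (mount point is the root directory) from a preset condition on the target data. One way to picture this is a check that takes the condition as a parameter. The sketch below is a simplified Python model following the wording of the text literally; the `Task` record, `ROOT_MOUNT_ID`, and the function names are hypothetical, and the question of which processes the check is applied to is left outside the sketch.

```python
from dataclasses import dataclass
from typing import Callable

ROOT_MOUNT_ID = "/"  # hypothetical identifier of the root directory
ROOT_ID = 0          # the identifier of the root user is zero


@dataclass
class Task:
    mount_point_id: str  # mount point identifier from the process data structure
    uid: int             # user identifier of the process
    level: int           # level of the namespace where the process is located


def level_is_zero(task: Task) -> bool:
    """Preset condition when the target data is the namespace level."""
    return task.level == 0


def uid_is_root(task: Task) -> bool:
    """Preset condition when the target data is the user identifier."""
    return task.uid == ROOT_ID


def alarm_needed(task: Task, preset_condition: Callable[[Task], bool]) -> bool:
    # Trigger: the target data is read only when the mount point is the root directory.
    if task.mount_point_id != ROOT_MOUNT_ID:
        return False
    return preset_condition(task)


task = Task(mount_point_id="/", uid=0, level=1)
print(alarm_needed(task, level_is_zero), alarm_needed(task, uid_is_root))
```

Passing the condition as a function keeps the trigger logic identical across the two implementations described above.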
In a sixth aspect, an embodiment of the present application provides a container escape detection apparatus, the apparatus comprising:
an acquisition unit, configured to obtain target data from the data structure of a process when it is determined that the mount point of the process is the root directory;
a prompt unit, configured to raise an alarm when the target data meets a preset condition, the alarm indicating that a container escape has occurred for the process.
In this embodiment, a mount-point-based detection mechanism is added to the kernel code. The mechanism detects escapes to the root directory and effectively monitors container escapes that exploit kernel vulnerabilities, thereby improving system security.
With reference to the sixth aspect, in a possible implementation, the target data is the level of the namespace where the process is located, and the prompt unit is configured to:
raise the alarm when it is determined that the level of the namespace where the process is located is zero.
With reference to the sixth aspect, in a possible implementation, the target data is the user identifier of the process, and the prompt unit is configured to:
raise the alarm when it is determined that the user identifier of the process is the identifier of the root user.
With reference to the sixth aspect, in a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
With reference to the sixth aspect, in a possible implementation, the apparatus further comprises a determination unit:
the acquisition unit is configured to obtain the data structure of the process, the data structure including the mount point identifier of the process;
the determination unit is configured to determine that the mount point of the process is the root directory when the mount point identifier is the root directory identifier.
In a seventh aspect, an embodiment of the present application provides an electronic device comprising one or more functional modules, the one or more functional modules being usable to execute the container escape detection method in any possible implementation of the first aspect.
In an eighth aspect, the present application provides a computer storage medium comprising computer instructions which, when run on an electronic device, cause the communication device to execute the container escape detection method in any possible implementation of any of the above aspects.
In a ninth aspect, the present application provides a computer program product which, when run on a computer, causes the computer to execute the container escape detection method in any possible implementation of any of the above aspects.
In a tenth aspect, the present application provides a chip comprising a processor and an interface, the processor and the interface cooperating so that the chip executes the container escape detection method in any possible implementation of any of the above aspects.
It can be understood that the electronic device of the seventh aspect, the computer-readable storage medium of the eighth aspect, the computer program product of the ninth aspect, and the chip of the tenth aspect are all used to execute the methods provided in the embodiments of the present application. For their beneficial effects, refer to the beneficial effects of the corresponding methods; they are not repeated here.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a hierarchical relationship diagram of namespaces provided in an embodiment of the present application;
FIG. 2 is a schematic diagram of a detection process provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a detection system provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a container escape scenario provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of another container escape scenario provided in an embodiment of the present application;
FIG. 6 is a flow chart of a container escape detection method provided in an embodiment of the present application;
FIG. 7 is a flow chart of another container escape detection method provided in an embodiment of the present application;
FIG. 8A is a flow chart of a container escape detection method provided in an embodiment of the present application;
FIG. 8B is a flow chart of another container escape detection method provided in an embodiment of the present application;
FIG. 9A is a flow chart of a container escape detection method provided in an embodiment of the present application;
FIG. 9B is a flow chart of another container escape detection method provided in an embodiment of the present application;
FIG. 10 is a schematic structural diagram of a container escape detection apparatus 100 provided in an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a container escape detection apparatus 110 provided in an embodiment of the present application;
FIG. 12 is a schematic structural diagram of another container escape detection apparatus 120 provided in an embodiment of the present application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present application are described clearly and in detail below with reference to the accompanying drawings. In the description of the embodiments, unless otherwise specified, "/" means "or"; for example, A/B can mean A or B. "And/or" merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean that A exists alone, that A and B both exist, or that B exists alone. In addition, in the description of the embodiments, "multiple" means two or more.
In the following, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or as implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of the embodiments, unless otherwise specified, "multiple" means two or more.
The technical terms used in the present application are introduced first.
(1) Container technology
1. Container: an instance created from an image. The instance is an object that includes the user configuration and runtime configuration required to implement the functions of the image. A container is a running instance created from an image and can be launched, started, stopped, and deleted; each container is an isolated, secure platform. A container provides an isolated environment for running programs and strictly controls the resources those programs can access.
Containers give applications an isolated runtime space: each container contains a complete user environment of its own, and changes inside one container do not affect the runtime environment of other containers. To achieve this, container technology relies on a set of system-level mechanisms: Linux namespaces provide isolation, file system mount points determine which files a container can access, and cgroups determine how many resources each container can use. In addition, containers share the same system kernel, so memory is used more efficiently when the same library is used by multiple containers.
2. Container escape: the following process and its result. First, the attacker has already obtained command execution capability under some permission inside a container, either by hijacking the containerized business logic or through direct control (scenarios such as CaaS in which control of the container is obtained legitimately). The attacker then uses this capability, together with other means, to obtain command execution capability under some permission on the direct host of the container. (The scenario "a physical machine runs a virtual machine, and the virtual machine runs containers" is common; in that scenario the direct host is the virtual machine outside the container.)
(2) Namespace
1. A namespace is mainly used for resource isolation; resources in different namespaces are invisible to each other. The resources that namespaces mainly cover are IPC, Network, Mount, PID, User, UTS, and Cgroup.
Six kinds of namespace are currently implemented: the mount namespace, the UTS namespace, the inter-process communication (IPC) namespace, the user namespace, the process identifier (PID) namespace, and the network namespace. Put simply, a namespace provides an abstraction of global resources: resources are placed in different namespaces, the resources in each namespace are isolated from those in the others, and the resources in each namespace are used by different processes. Placing resources in different containers (different namespaces) isolates the containers from each other.
2. Namespace
(1) The PID is the unique identifier by which the kernel distinguishes each process.
A PID is the number that Linux assigns to a process to identify it uniquely within its namespace; it is called the process ID, or PID for short. Every process created by the fork or clone system call is assigned a new, unique PID by the kernel.
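As a small illustration on a Unix-like system, the following sketch forks a child and returns the PIDs involved, so that the two can be compared. It uses Python's standard `os.getpid`, `os.fork`, `os._exit`, and `os.waitpid`; the helper name `spawn_child` is hypothetical.

```python
import os


def spawn_child():
    """Fork once and return (parent_pid, child_pid) as seen by the parent."""
    parent_pid = os.getpid()
    child_pid = os.fork()  # returns 0 in the child, the child's PID in the parent
    if child_pid == 0:
        os._exit(0)        # the child exits immediately without running anything else
    os.waitpid(child_pid, 0)  # reap the child so that no zombie process remains
    return parent_pid, child_pid


parent_pid, child_pid = spawn_child()
print(f"parent PID {parent_pid}, child PID {child_pid}")
```

The PID printed for the child is the value seen in the namespace of the calling process.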
(2)、命名空间还有层级关系。(2) Namespaces also have a hierarchical relationship.
进程所在的namespace的层级(level):代表当前命名空间的等级,初始命名空间的level为0,它的子命名空间level为1,依次递增,而且子命名空间对父命名空间是可见的。从给定的level设置,内核即可推断进程会关联到多少个ID。The level of the namespace where the process is located: represents the level of the current namespace. The initial namespace has a level of 0, and its sub-namespace has a level of 1, and so on. The sub-namespace is visible to the parent namespace. From a given level setting, the kernel can infer how many IDs the process will be associated with.
namespace还有层级的概念,level表示不同的namespace处于的层级。level表示该namespace处于哪一层。当通过clone函数或者fork函数创建子进程时,可以指定是否创建一个新的namespce,如果不指定,默认集成父进程的namespce;否则会创建新的namespce,同时task_struct中的level加1。Namespace also has the concept of hierarchy. Level indicates the hierarchy of different namespaces. Level indicates which layer the namespace is in. When creating a child process through the clone function or fork function, you can specify whether to create a new namespce. If not specified, the namespce of the parent process is integrated by default; otherwise, a new namespce is created and the level in task_struct is increased by 1.
A higher-level namespace is visible to the lower-level namespaces above it, so a process in a higher-level namespace has multiple PIDs. For example, suppose the default namespace of the system is level 0, a new namespace, level 1, is created under level 0, and a process runs in level 1. In level 1 the PID of this process is 1. Because a higher-level PID namespace must be visible to the lower-level PID namespace, the same process also has a PID xxx in level 0, where xxx is assigned from the PID sequence of level 0.
FIG. 1 shows an example of the hierarchical relationship between namespaces.
As shown in FIG. 1, four namespaces are illustrated, in which one parent namespace derives two child namespaces. The level of the parent namespace is 0 (level 0), and the level of the child namespaces is 1 (level 1). Taking PID namespaces as an example, because the namespaces are isolated from one another, each namespace can have a process whose PID is 1. Because namespaces are hierarchical, however, the parent namespace is aware of its child namespaces, and the processes of a child namespace are mapped into the parent namespace. The six processes of the two child namespaces at level 1 in FIG. 1 are therefore mapped to PIDs 5 to 10 in their parent namespace.
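The mapping described for FIG. 1 can be illustrated with a small user-space model. This is only a sketch: the class name PidNamespace, the method spawn, and the choice of four pre-existing host processes are assumptions made for illustration and are not kernel code. It shows that a process created in a level-1 namespace receives one PID per level that can see it.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterator, Optional


@dataclass
class PidNamespace:
    """Toy PID namespace (hypothetical model): a parent link plus its own PID counter."""
    name: str
    parent: Optional["PidNamespace"] = None
    _next_pid: Iterator[int] = field(default_factory=lambda: count(1), repr=False)

    @property
    def level(self) -> int:
        # The initial namespace is level 0; each child is one level deeper.
        return 0 if self.parent is None else self.parent.level + 1

    def spawn(self) -> Dict[str, int]:
        """Create a process here; return its PID in every namespace that sees it."""
        pids: Dict[str, int] = {}
        ns: Optional[PidNamespace] = self
        while ns is not None:  # walk from this namespace up to the initial one
            pids[ns.name] = next(ns._next_pid)
            ns = ns.parent
        return pids


root = PidNamespace("level0")
for _ in range(4):           # assumed: four host processes already hold PIDs 1..4
    root.spawn()

child_a = PidNamespace("child_a", parent=root)
child_b = PidNamespace("child_b", parent=root)

first_a = child_a.spawn()    # PID 1 inside child_a, PID 5 in level 0
first_b = child_b.spawn()    # PID 1 inside child_b, PID 6 in level 0
```

Both child namespaces hold a process with PID 1, while the parent sees the same two processes under distinct PIDs, and the number of PIDs a process carries equals the level of its namespace plus one.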
(III) Process
1. A process is a program or command that is being executed. Each process is a running entity with its own address space, and it occupies certain system resources. Once a program is running, it is a process.
A process can be regarded as an instance of program execution. It is the independent unit to which system resources are allocated, and each process has its own address space. One process cannot access the variables and data structures of another; for one process to access the resources of another, inter-process communication is required, for example pipes, files, or sockets.
A process is a dynamic entity composed of a text segment, a user data segment, and a system data segment. The system data segment stores the control information of the process, including the process control block (PCB).
2. The PCB is a data structure used to describe and control the running of a process. It is part of the process entity and is the most important record-type data structure in the operating system. In general, a PCB contains the following:
(1) Process identifiers (internal and external): used to uniquely identify a process;
(2) Processor information (general-purpose registers, instruction counter, program status word (PSW), user stack pointer);
(3) Process scheduling information (process state, process priority, other information required for scheduling, events);
(4) Process control information (addresses of the program and its data, resource list, process synchronization and communication mechanisms, link pointer).
The content of this data structure exists to support later management, so each operating system adjusts the content of the PCB to suit its own characteristics, and PCB structures differ between operating systems.
The Linux process control block is a data structure defined by the structure task_struct, which holds the various information needed to manage a process. All process control blocks in a Linux system are organized as an array of structures.
3. Linux process control block (task_struct)
task_struct is the sole indication that a process exists and is the core of the Linux process entity.
The Linux kernel uses the task_struct data structure to associate all data and structures related to a process. All kernel algorithms that involve processes and programs are built around this data structure, which makes it one of the most important data structures in the kernel.
When a new process is created, the system requests an empty task_struct area in memory, that is, a free PCB block, and fills in the required information. It also stores a pointer to the structure in the task[] array. The PCB of the currently running process is indicated by the pointer array current_set[]. Because Linux supports multiprocessor systems, several processes may be running at the same time, which is why current_set is defined as an array of pointers.
(IV) Mounting
The mount namespace isolates the mount points of file systems, so that each mount namespace has its own independent mount point information and different namespaces do not affect one another.
1. Mounting: usually means attaching a storage device to an existing directory, so that accessing the directory accesses the contents of the storage device.
In Linux, everything is a file. All files are placed in a tree-shaped directory structure that starts at the root directory, and every hardware device is also represented as a file. For example, for Linux to use a USB flash drive, the directory tree of Linux itself and the file directory of the hardware device must be joined into one; this process is called mounting.
2. Mount point: a mount operation hides the files that were originally in the Linux directory, so the Linux directory chosen for mounting should preferably be a newly created, empty directory. After mounting, this directory is called the mount point.
3. Root directory
The root directory of a Linux system (/): the file system of Linux and UNIX is a hierarchical tree structure rooted at "/", which is therefore called the root directory. Every file and directory starts from here.
Only the root user has write permission in this directory. This directory is different from the /root directory, which is the home directory of the root user.
In Linux, the root directory "/" is the topmost directory of the file system directory structure, and all files and directories are placed under it. The root directory "/" also contains subdirectories such as "/bin", "/home", and "/usr".
(V) Linux users and user groups
The system administrator authorizes each process to run under a given user identifier (user identification, UID). Each started process carries the UID of the user who started it, and a child process has the same UID as its parent process. A user can be a member of a group, and each group also has a group identifier (group identification, GID).
1. User identifier (UID): an integer used by the system to distinguish between users.
Linux uses a 32-bit integer to record and distinguish users. This number is called the user ID, or UID for short. Users in a Linux system fall into three categories: ordinary users, the root user, and system users.
The root user, also called the administrator account, generally has UID 0 and can manage ordinary users and the entire system. Ordinary users are all the real users of the Linux system. The UID of an ordinary user can be specified by the administrator when the user is created; if it is not specified, the UID of an ordinary user may be greater than 500, or user UIDs may by default be numbered sequentially from 1000 to 60000. System users are users that the system needs in order to run but that are not real users; that is, system users are created automatically during installation and cannot log in to the system.
2. User group identifier (GID): an integer used by the system to distinguish between user groups.
A Linux system has user groups as well as users. User groups are also distinguished by numbers, and the ID used to distinguish between user groups is called the group ID, or GID.
UIDs and GIDs are managed by the Linux kernel, which uses them in kernel-level system calls to decide whether a request should be granted privileges. For example, when a process tries to write to a file, the kernel checks the UID and GID that the process was created with to determine whether it has sufficient permission to modify the file.
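The write check described above can be sketched as a simplified model. The function name may_write and its parameters are hypothetical, and the model deliberately ignores capabilities, ACLs, and security modules; it only shows why a process whose UID becomes 0 is no longer constrained by the file mode bits.

```python
import stat


def may_write(proc_uid, proc_gids, file_uid, file_gid, mode):
    """Simplified model (an assumption, not kernel code) of the classic
    UNIX write-permission decision based on UID, GID, and mode bits."""
    if proc_uid == 0:                  # root is not limited by the mode bits
        return True
    if proc_uid == file_uid:           # owner: only the owner bits apply
        return bool(mode & stat.S_IWUSR)
    if file_gid in proc_gids:          # group member: only the group bits apply
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)   # everyone else


# A file owned by UID 1000 / GID 1000 with mode rw-r--r--.
owner_ok = may_write(1000, {1000}, 1000, 1000, 0o644)
other_ok = may_write(1001, {1001}, 1000, 1000, 0o644)
root_ok = may_write(0, {0}, 1000, 1000, 0o444)
```

In this model the owner may write, an unrelated user may not, and a process with UID 0 may write even to a read-only file, which is why tampering with the UID of a process matters in the escape scenarios discussed later.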
All containers running on the same host share one kernel, the kernel of the host. Much of the value of containerization comes from the fact that all of these independent containers (which are in fact processes) can share a single kernel. This means that even if hundreds or thousands of containers are running on a Docker host machine, there is still only one set of UIDs and GIDs controlled by the kernel. The same UID therefore represents the same user on the host machine and in a container, even if different user names are displayed in different places.
(VI) Slab is a memory allocation mechanism of the Linux operating system. The slab allocation algorithm uses caches to store kernel objects.
The slab allocator targets objects that are allocated and released frequently, such as process descriptors. These objects are generally small; allocating and releasing them directly through the buddy system would not only cause a large amount of memory fragmentation but would also be too slow. The slab allocator instead manages memory by object: objects of the same type are grouped into one class (process descriptors, for example, form one class). Whenever such an object is requested, the slab allocator hands out a unit of that size from a slab list; when the object is released, it is put back on the list rather than returned directly to the buddy system, which avoids internal fragmentation. The slab allocator does not discard objects that have been allocated; it releases them and keeps them in memory, so that a later request for a new object can be served directly from the slab without repeating initialization.
As for the organization of the object cache, the memory area of a cache is divided into multiple slabs. Each slab consists of one or more contiguous page frames, which contain both allocated objects and free objects.
When a cache is created, it initially contains a number of objects, all of which are marked as free. The number of objects depends on the size of the slab. When an object of a kernel data structure is needed, it can be taken directly from the cache and marked as used.
Consider how the kernel assigns a slab to objects that represent process descriptors. In Linux, the type of a process descriptor is struct task_struct, whose size is about 1.7 KB. When the Linux kernel creates a new task, it obtains the memory needed for the struct task_struct object from the cache, which holds struct task_struct objects that have already been allocated and marked as free to satisfy the request.
A Linux slab can be in one of three states:
Full: all objects in the slab are marked as used.
Empty: all objects in the slab are marked as free.
Partial: some objects in the slab are marked as used and some are marked as free.
The slab allocator first tries to allocate from a partial slab. If there is none, it allocates from an empty slab. If there is no empty slab either, it allocates a new slab from physically contiguous pages, assigns it to a cache, and then allocates space from the new slab.
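The allocation order described above (partial slab first, then empty slab, then a newly created slab) can be sketched with a toy model. The class names Slab and Cache and the methods alloc and release are illustrative assumptions; a real slab allocator tracks individual objects and page frames, which this sketch reduces to a counter.

```python
class Slab:
    """Toy slab: only counts how many of its objects are free."""

    def __init__(self, n_objects):
        self.total = n_objects
        self.free = n_objects

    @property
    def state(self):
        if self.free == 0:
            return "full"
        if self.free == self.total:
            return "empty"
        return "partial"


class Cache:
    """Toy cache for one object type (hypothetical model)."""

    def __init__(self, objects_per_slab):
        self.objects_per_slab = objects_per_slab
        self.slabs = []

    def alloc(self):
        """Serve one object and return the slab it came from."""
        for wanted in ("partial", "empty"):      # preferred order
            for slab in self.slabs:
                if slab.state == wanted:
                    slab.free -= 1
                    return slab
        # No partial or empty slab: stand-in for taking contiguous pages.
        slab = Slab(self.objects_per_slab)
        self.slabs.append(slab)
        slab.free -= 1
        return slab

    def release(self, slab):
        """Return one object to its slab instead of to the page allocator."""
        slab.free += 1


cache = Cache(objects_per_slab=2)
a = cache.alloc()      # creates the first slab, now partial
b = cache.alloc()      # served from the same partial slab, now full
c = cache.alloc()      # first slab is full, so a second slab is created
cache.release(a)       # first slab becomes partial again
d = cache.alloc()      # served from the first slab again
```

Releasing an object makes its slab partial again, so the next request is served from memory the cache already holds rather than from new pages.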
The following describes an example of a container escape.
First, the process data structure task_struct is introduced.
In a Linux system, every process has, in the kernel, a process control block (PCB) that contains all the information about the process, namely the data structure task_struct. The task_struct of a process includes the following information:
(1) Process identifiers: including the PID, UID, or GID
(2) Process state
(3) Process scheduling information
(4) Process priority
(5) Process communication information
(6) Mount point identifier: indicates the file system (fs) on which the process can operate, where operations include reading and writing files
(7) Identifier of the namespace in which the process is located (nsproxy): identifies the namespace to which the process belongs
Where:
1. nsproxy: a pointer to the namespace-related fields. Through nsproxy, it is possible to tell which PID namespace a task_struct belongs to.
Because the Linux kernel provides several namespaces, such as the PID namespace, a process may belong to multiple namespaces. To keep task_struct compact, the kernel introduces nsproxy to manage the namespaces of a process in a unified way.
nsproxy stores a set of pointers to the namespaces of each type and acts as a proxy through which a process accesses each namespace. Because several processes may be in exactly the same namespaces, an nsproxy can be shared between processes; the count field in nsproxy records the number of references to the structure.
2. Identifier of the initial namespace (init_nsproxy): the system predefines an init_nsproxy that serves as the default nsproxy. init_nsproxy defines the initial global namespace; it stores pointers to the initial namespace object of each subsystem and has relatively high privileges.
Based on the above, an attacker can exploit certain kernel vulnerabilities to overwrite the nsproxy in the task_struct of a process with init_nsproxy, thereby escalating privileges.
Next, an existing technique that can detect the container escape described above is introduced.
In 2019, a preliminary detection mechanism was open-sourced externally. It detects escape into the init namespace by checking whether the nsproxy of a process is init_nsproxy, whether the current process is a root process, and whether the level of the current process is 0. A root process is a process running with root privileges. The detection flow is shown in FIG. 2.
As shown in FIG. 2, the host machine first obtains the task_struct of the current process and obtains nsproxy from it. If nsproxy is not init_nsproxy, the detection ends. If nsproxy is init_nsproxy, the host machine checks whether the UID or GID of the current process is the UID or GID of the root user. If it is not, the detection ends. If it is, the host machine checks whether the level of the namespace of the current process is 0. If the level is 0, the detection ends. If the level of the namespace of the current process is not 0, the host machine determines that a container escape has occurred for the process and can present warning information (that is, raise an alarm).
The detection principle of FIG. 2 is as follows. Suppose that the nsproxy of the process is init_nsproxy, the process is a root process, and its level is 0; that is, it is a root process at level 0. Because its level is 0, it can operate on the namespaces of all child nodes. However, the namespace of the mount point is separate within the nsproxy data structure, so what is obtained here is every namespace other than fs.
Here, level indicates the layer at which a namespace sits. When a child process is created through clone or fork, the caller can specify whether to create a new namespace. If this is not specified, the child inherits the namespace of the parent process by default; otherwise, a new namespace is created and the level in task_struct is increased by 1.
However, the method shown in FIG. 2 cannot handle container escapes caused by certain other kernel vulnerabilities. For example, an attacker who changes the UID and GID in the task_struct structure to those of the root user and also changes the level of the current process to 0 can bypass the detection mechanism shown in FIG. 2. As another example, when an attacker exploits a kernel vulnerability to change the operation directory (that is, the mount point) of the attacking process to the root directory, the attacker can bypass the detection mechanism shown in FIG. 2 and obtain the namespace of fs.
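The FIG. 2 flow and the first bypass described above can be modelled in a few lines. This is a user-space sketch under stated assumptions: the Task class, the string stand-in for the address of init_nsproxy, and the function name figure2_check are hypothetical and are not taken from the 2019 open-source mechanism.

```python
from dataclasses import dataclass

INIT_NSPROXY = "init_nsproxy"   # stand-in for the address of init_nsproxy
ROOT_ID = 0


@dataclass
class Task:
    """Only the task fields that the FIG. 2 flow inspects (hypothetical model)."""
    uid: int
    gid: int
    nsproxy: str
    level: int        # level of the namespace the process is in


def figure2_check(task):
    """Return True when the FIG. 2 flow would raise an alarm."""
    if task.nsproxy != INIT_NSPROXY:
        return False                       # not the initial namespace: end
    if task.uid != ROOT_ID and task.gid != ROOT_ID:
        return False                       # not a root process: end
    return task.level != 0                 # level 0 ends; any other level alarms


# A container process whose nsproxy and UID/GID were tampered with is caught.
tampered = Task(uid=0, gid=0, nsproxy=INIT_NSPROXY, level=1)
caught = figure2_check(tampered)

# If the attacker also overwrites level with 0, the same flow stays silent.
bypass = Task(uid=0, gid=0, nsproxy=INIT_NSPROXY, level=0)
missed = not figure2_check(bypass)
```

In this model the fully tampered task is indistinguishable from a legitimate root process on the host, because every field the flow reads is under the attacker's control.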
The system architecture and service scenarios of the embodiments of this application are described below. It should be noted that the system architecture and service scenarios described in this application are intended to explain the technical solutions of this application more clearly and do not limit the technical solutions provided by this application. A person of ordinary skill in the art will appreciate that, as system architectures evolve and new service scenarios emerge, the technical solutions provided by this application are equally applicable to similar technical problems.
To describe the container escape detection method provided by the embodiments of this application more clearly and in more detail, the detection system provided by the embodiments of this application is introduced first.
Refer to FIG. 3, which is a schematic diagram of a detection system provided by an embodiment of this application.
As shown in FIG. 3, the system includes a physical machine and one or more virtual machines (VMs) running on the operating system of the physical machine (FIG. 3 shows only virtual machine 1, virtual machine 2, and virtual machine 3).
The physical machine is responsible for managing and allocating hardware resources and presents a virtual hardware platform to the virtual machines, for example by providing them with virtual CPUs, memory, virtual disks, and virtual network cards. One or more containers can be created in a virtual machine. FIG. 3 shows, as an example, two containers in virtual machine 1 (container 1 and container 2), two containers in virtual machine 2 (container 3 and container 4), and two containers in virtual machine 3 (container 5 and container 6). A virtual machine can use containers to provide processes with relatively independent and isolated running environments; for example, container 1 supports the running of process 1, and container 2 supports the running of process 2.
Where:
(1) Virtual machine: one or more virtual computers simulated on a single physical computer. These virtual machines can work like real physical computers.
(2) Process: an entity that executes instructions. A process makes a program run so that it can execute various instructions.
(3) Container: provides a relatively independent and isolated running environment for a process. For example, a container includes an independent file system, namespaces, and resource view. Container instance: once a process is running in the environment provided by a container, the container can be called a container instance.
In some embodiments, the physical machine may include hardware such as a central processing unit (CPU), memory, a hard disk, a motherboard, and a 3D graphics card. On top of this hardware, the physical machine may include a virtual machine manager (VMM) module and at least one virtual machine, where the VMM and the VMs are software modules in the physical machine:
The CPU executes the various logical calls. The VMM creates at least one virtual machine and virtualizes the physical resources of the physical machine into multiple virtual resources for use by the virtual machines. Each virtual machine has its own independent storage and computing units, and the virtual machines are similar in function and structure.
The container escape detection method provided by the embodiments of this application is executed by a host machine, which may be the physical machine or one of the virtual machines described above.
It can be understood that the detection system shown in FIG. 3 is only an exemplary implementation of the embodiments of this application; the system architecture in the embodiments of this application includes but is not limited to the above architecture.
Because the kernel of the host machine is shared, when an attacker exploits a kernel vulnerability to escape into a privileged container, the container escape detection method provided by the embodiments of this application can effectively detect the escape so that corresponding countermeasures can be taken. Here, a privileged container refers to a namespace with relatively high privileges, such as init_nsproxy.
Next, some scenarios to which the container escape detection method of the embodiments of this application applies are introduced by way of example. The method can detect that a container escape has occurred for the process in each of the following three scenarios.
In the first scenario, an attacker exploits a kernel vulnerability to change the UID or GID of a process to 0, change the nsproxy of the process to init_nsproxy, and change the level of the namespace of the process to 0; that is, UID or GID = 0, tsk->nsproxy = init_nsproxy, and PID->level = 0. In this case dmesg (display message) shows no detection log, and the detection shown in FIG. 2 is successfully bypassed. The attacker can then use the process to obtain every namespace other than the mount namespace.
In this scenario, the container escape can be detected by the container escape detection method shown in FIG. 6 or FIG. 7.
In the second scenario, an attacker exploits a kernel vulnerability to change the UID or GID of a process to 0 and change the mount point (tsk->fs) to the root directory (init_task->fs), successfully bypassing the detection shown in FIG. 2. The attacker can then use the process to obtain the namespace of the mounted file system. For example, as shown in FIG. 4, suppose that the process could originally access only the contents of the rop directory. After the attacker changes its UID or GID to 0 and changes its mount point to the root directory, the modified process can access the mounted file system fs; that is, it can obtain the namespace of fs.
In this scenario, the container escape can be detected by the container escape detection method shown in FIG. 8A to FIG. 9B.
In the third scenario, an attacker exploits a kernel vulnerability to change the UID or GID of a process to 0, change the nsproxy of the process to init_nsproxy, change the level of the namespace of the process to 0, and change tsk->fs to init_task->fs; that is, tsk->nsproxy = init_nsproxy, PID->level = 0, and tsk->fs = init_task->fs. This successfully bypasses the detection shown in FIG. 2, and the attacker can then use the process to obtain all namespaces. As shown in FIG. 5, the modified process can obtain the privileges of the root user and, in turn, obtain the host name of the host machine, the network card information of the host machine, and the file mounts of the host machine.
In this scenario, the container escape can be detected by the container escape detection method shown in FIG. 6 or FIG. 7.
The container escape detection method provided by the embodiments of this application is described in detail below.
请参见图6,图6是本申请实施例提供的一种容器逃逸的检测方法。该方法可以包括以下部分或全部步骤:Please refer to Figure 6, which is a method for detecting container escape provided by an embodiment of the present application. The method may include some or all of the following steps:
S101: The host machine checks whether the namespace in which the process resides is the initial namespace.
In one implementation, the host machine obtains the data structure (task_struct) of the process; task_struct includes the identifier nsproxy of the namespace in which the process resides. The host machine checks whether that namespace is the initial namespace (task->nsproxy == init_nsproxy). When the identifier nsproxy equals the identifier init_nsproxy of the initial namespace, the host machine determines that the process resides in the initial namespace and executes step S102; otherwise, it executes step S104.
In some embodiments, the host machine starts checking whether the current process has escaped, that is, starts executing step S101, when the system calls the current process, or when the current process undergoes process scheduling, process exit, or namespace copy (copy namespace) operations. Note that the present application can be applied to scenarios with high real-time requirements for escape detection.
Note that nsproxy stores a set of pointers to the namespaces of each type and acts as a proxy through which a process accesses those namespaces. Because several processes may reside in exactly the same set of namespaces, an nsproxy can be shared between processes. init_nsproxy stores pointers to the initial namespace object of each subsystem and therefore carries high privileges: a process whose nsproxy is init_nsproxy can access all namespaces except the mounted file system fs. To keep the namespaces secure, the embodiment of the present application executes step S102 to continue checking such a process.
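The comparison in step S101 can be sketched in user-space C as follows. This is a minimal illustration, not kernel code: the two structures are simplified stand-ins that model only the field the step reads, and the helper name in_initial_namespace is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for the kernel's struct nsproxy. */
struct nsproxy {
    int id; /* placeholder; the real structure holds per-type namespace pointers */
};

/* Only the field read by step S101 is modelled here. */
struct task_struct {
    struct nsproxy *nsproxy;
};

/* Stand-in for the global initial nsproxy. */
static struct nsproxy init_nsproxy;

/* S101: returns true when the task points at the initial nsproxy. */
bool in_initial_namespace(const struct task_struct *task)
{
    assert(task != NULL);
    return task->nsproxy == &init_nsproxy;
}
```

Because an nsproxy can be shared between processes, the check is a pointer comparison rather than a field-by-field comparison of the namespaces it refers to.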
S102: The host machine checks whether the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
In one implementation, the host machine obtains the address space pid_cachep of the process and, based on pid_cachep, determines the address slab_cache of the cache area corresponding to the process. When slab_cache equals the address pid_cachep in the namespace in which the process resides, the host machine determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
In some embodiments, the host machine executes step S104 when the address of the cache area to which the process belongs equals the address in the namespace in which the process resides, and executes step S103 when the two addresses are not equal.
Note that, for container escapes achieved by exploiting kernel vulnerabilities to modify namespace-related data structures, adding a slab_cache-based detection mechanism to the kernel code strengthens the detection of container escape behavior.
S103: The host machine issues alarm information, which indicates that a container escape has occurred for the process.
In some embodiments, the host machine displays warning information on the screen to indicate that a container escape has occurred for the current process. The warning information may include information about the current process.
Note that the host machine may also raise the alert in other ways, for example by recording the escape behavior of the current process in the kernel log (dmesg), or by restarting the host machine. The embodiments of the present application do not limit how the host machine issues the alarm information.
S104: The host machine ends the detection.
In one implementation, the host machine ends the detection when it determines that the namespace in which the process resides is not the initial namespace. Note that if the nsproxy of the process is init_nsproxy, the process can access all namespaces except the mounted file system fs, so to keep the namespaces secure the embodiment of the present application needs to check the process; if the nsproxy of the process is not init_nsproxy, the host machine can end the detection for the process.
In another implementation, the host machine ends the detection when it determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides. Note that the address of the cache area to which the process belongs is highly trustworthy: when it equals the address in the namespace in which the process resides, this shows that the process really does have permission to access that namespace, so the detection for the process can end.
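The flow of steps S101 to S104 can be summarized by the following user-space sketch. It is an illustration under stated assumptions rather than the kernel implementation: the structures are reduced to the fields the steps read, the task's owning cache is stored directly in a slab_cache field instead of being looked up from a page, and the names pid_ns, detect_escape, and the verdict enum are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum verdict { DETECTION_ENDED, ESCAPE_ALARM };

/* Simplified stand-ins; only the fields used by S101 and S102 are modelled. */
struct kmem_cache { const char *name; };
struct pid_namespace { struct kmem_cache *pid_cachep; };
struct nsproxy { struct pid_namespace *pid_ns; /* hypothetical field name */ };

struct task_struct {
    struct nsproxy *nsproxy;
    struct kmem_cache *slab_cache; /* cache that actually backs the task's pid object */
};

/* Stand-ins for the initial namespace objects. */
static struct kmem_cache init_pid_cache = { "pid" };
static struct pid_namespace init_pid_ns = { &init_pid_cache };
static struct nsproxy init_nsproxy = { &init_pid_ns };

enum verdict detect_escape(const struct task_struct *task)
{
    assert(task != NULL && task->nsproxy != NULL);

    /* S101: tasks outside the initial namespace are not checked further. */
    if (task->nsproxy != &init_nsproxy)
        return DETECTION_ENDED; /* S104 */

    /* S102: the owning cache must match the cache recorded in the namespace. */
    if (task->slab_cache == task->nsproxy->pid_ns->pid_cachep)
        return DETECTION_ENDED; /* S104 */

    return ESCAPE_ALARM; /* S103 */
}
```

In this model, a task whose nsproxy was overwritten to point at init_nsproxy still carries the cache address of the namespace it was created in, which is what makes the mismatch in S102 observable.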
The method embodiment shown in Figure 6 covers many possible implementations. Some of them are illustrated below with reference to Figure 7. For related concepts, operations, or logical relationships not explained for Figure 7, refer to the corresponding description of the embodiment shown in Figure 6.
Refer to Figure 7, which is a flowchart of another container escape detection method provided by an embodiment of the present application. The method may include some or all of the following steps:
S201: The host machine obtains the data structure of the process.
The data structure (task_struct) may include the identifier nsproxy of the namespace in which the process resides, the user identifier (UID or GID), and the level of the namespace in which the process resides.
In some embodiments, the host machine starts checking whether the current process has escaped, that is, starts executing step S201, when the system calls the current process, or when the current process undergoes process scheduling, process exit, or namespace copy operations. Note that the present application can be applied to scenarios with high real-time requirements for escape detection.
task_struct is the process control block (PCB) under Linux; the PCB contains all the information about a process, including its UID.
Note that process control blocks (PCBs) differ between operating systems. A PCB is the data structure used to describe and control the running of a process; it is part of the process entity and is the most important record-type data structure in the operating system. The Linux PCB is a data structure defined by the structure task_struct, which holds the various information needed to manage the process, such as the nsproxy mentioned above.
S202: The host machine checks whether the namespace in which the process resides is the initial namespace.
In one implementation, the host machine checks whether the namespace in which the process resides is the initial namespace (task->nsproxy == init_nsproxy), that is, whether the task->nsproxy of the process equals init_nsproxy. When the identifier nsproxy of the namespace in which the process resides equals the identifier init_nsproxy of the initial namespace, the host machine determines that the process resides in the initial namespace and executes step S203; otherwise, it executes step S207.
S203: The host machine checks whether the user identifier is the identifier of the root user.
The user identifier may be at least one of the user identifier UID and the user group identifier GID; the identifier of the root user may be zero.
In some embodiments, the host machine executes step S204 when at least one of the user identifier UID and the user group identifier GID is zero, and executes step S207 when it determines that the user identifier is not the identifier of the root user.
S204: The host machine checks whether the level of the namespace in which the process resides is zero.
In some embodiments, the host machine checks whether the level of the process is zero. When the level of the namespace in which the process resides is zero, it executes step S205; when the level is not zero, it executes step S206.
The level of the namespace in which the process resides represents the depth of the current namespace: the initial namespace has level 0, its child namespaces have level 1, and the level increases by one with each further generation. A child namespace is visible to its parent namespace. From a given level, the kernel can infer how many IDs a process is associated with.
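The relationship between namespace nesting and level can be sketched as follows. This is a simplified user-space model, assuming that a process has one ID in its own namespace and one in each ancestor namespace (so level + 1 IDs in total); the helper names make_child and ids_for_process are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in: only the nesting information is modelled. */
struct pid_namespace {
    unsigned int level;           /* 0 for the initial namespace */
    struct pid_namespace *parent; /* NULL for the initial namespace */
};

/* A child namespace sits one level below its parent. */
struct pid_namespace make_child(struct pid_namespace *parent)
{
    assert(parent != NULL);
    struct pid_namespace child = { parent->level + 1, parent };
    return child;
}

/* One ID per namespace from the process's own namespace up to the initial one. */
unsigned int ids_for_process(const struct pid_namespace *ns)
{
    assert(ns != NULL);
    return ns->level + 1;
}
```

Under this model, a process that claims to be in the initial namespace but whose level is nonzero is inconsistent, which is the condition step S204 treats as an escape.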
S205: The host machine checks whether the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
In one implementation, the host machine obtains the address space pid_cachep of the process and, based on pid_cachep, determines the address slab_cache of the cache area corresponding to the process. For example, it finds the head address of the namespace's slab_cache from the pid_cachep of the process (virt_to_head_page), that is, it looks up the head address of the page from the pid of the process. It then checks whether this is the same as the pid_cachep of the current namespace, that is, whether the slab_cache of the process == ns->pid_cachep, in other words whether it matches the slab_cache of the namespace in which the process resides. When slab_cache equals the address pid_cachep in the namespace in which the process resides, the host machine determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
Here, page is the virtual address space of the object; slab_cache points to the slab manager that the current page refers to, and subsequent memory allocations are made from this cache; pid_cachep points to the address of the slab from which the pid is allocated.
In some embodiments, the host machine executes step S207 when the address of the cache area to which the process belongs equals the address in the namespace in which the process resides, and executes step S206 when the two addresses are not equal.
Note that S205 is a check on slab_cache. The principle is as follows: (1) the slab_cache of a namespace is created with the no-merge attribute, which prevents caches of similar size from being merged; (2) when the pid_cachep of a process is allocated, a new slab cache is created directly, so the pid_cachep of each process is different. In other words, because of the no-merge attribute, new memory is created directly when the process is created, no reuse occurs, and the slab address of each process is necessarily different. Merging is a feature of the slab allocator: when the merge attribute is specified and cache memory is being allocated, a reusable cache block of similar size, if one is found, is referenced through an alias instead of a new cache being created.
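The effect of the no-merge attribute can be illustrated with a toy cache registry in user-space C. This is only a model of the principle, not the slab allocator: cache_create and the fixed-size pool are hypothetical, and merging is simplified to "same object size" instead of "similar size".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CACHES 8

/* Toy stand-in for a slab cache descriptor. */
struct kmem_cache {
    size_t object_size;
    bool no_merge;
};

static struct kmem_cache cache_pool[MAX_CACHES];
static size_t cache_count;

/*
 * Returns an existing mergeable cache of the same object size as an alias,
 * or creates a new cache. A no-merge request always creates a new cache,
 * and a no-merge cache is never handed out as an alias.
 */
struct kmem_cache *cache_create(size_t object_size, bool no_merge)
{
    assert(object_size > 0);

    if (!no_merge) {
        for (size_t i = 0; i < cache_count; i++) {
            if (!cache_pool[i].no_merge &&
                cache_pool[i].object_size == object_size)
                return &cache_pool[i]; /* reuse by aliasing */
        }
    }
    if (cache_count == MAX_CACHES)
        return NULL; /* toy pool exhausted */

    cache_pool[cache_count].object_size = object_size;
    cache_pool[cache_count].no_merge = no_merge;
    return &cache_pool[cache_count++];
}
```

With merging allowed, two requests of the same size return the same address, so the address says nothing about who asked for the cache. With no-merge, every request yields a distinct address, which is what lets the address serve as an identity in the S205 comparison.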
S206: The host machine issues alarm information, which indicates that a container escape has occurred for the process.
In some embodiments, the host machine displays warning information on the screen to indicate that a container escape has occurred for the current process. The warning information may include information about the current process.
In one implementation, when the host machine determines that the address of the cache area to which the process belongs does not equal the address in the namespace in which the process resides, it determines that escape behavior has been detected for the process and then issues the alarm information.
Note that the host machine may also raise the alert in other ways, for example by recording the escape behavior of the current process in the kernel log (dmesg), or by restarting the host machine. The embodiments of the present application do not limit how the host machine issues the alarm information.
S207: The host machine ends the detection.
In one implementation, the host machine ends the detection when it determines that the namespace in which the process resides is not the initial namespace. Note that if the nsproxy of the process is init_nsproxy, the process can access all namespaces except the mounted file system fs, so to keep the namespaces secure the embodiment of the present application needs to check the process; if the nsproxy of the process is not init_nsproxy, the host machine can end the detection for the process.
In another implementation, the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. Note that a process whose user identifier is not the root user's identifier has low privileges and is unlikely to escape the container, so the detection ends.
In yet another implementation, the host machine ends the detection when it determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides. Note that the address of the cache area to which the process belongs is highly trustworthy: when it equals the address in the namespace in which the process resides, this shows that the process really does have permission to access that namespace, so the detection for the process can end.
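The decision chain of Figure 7 (S202 to S207) can be sketched in user-space C as follows. This is a minimal model under stated assumptions: the structures keep only the fields the steps read, level and slab_cache are stored directly on the task for simplicity, and the names pid_ns, detect_escape, and the verdict enum are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum verdict { DETECTION_ENDED, ESCAPE_ALARM };

/* Simplified stand-ins for the kernel structures. */
struct kmem_cache { int id; };
struct pid_namespace { struct kmem_cache *pid_cachep; };
struct nsproxy { struct pid_namespace *pid_ns; /* hypothetical field name */ };

struct task_struct {
    struct nsproxy *nsproxy;
    unsigned int uid;
    unsigned int gid;
    unsigned int level;            /* level of the namespace the task was created in */
    struct kmem_cache *slab_cache; /* cache that actually backs the task's pid object */
};

static struct kmem_cache init_pid_cache = { 0 };
static struct pid_namespace init_pid_ns = { &init_pid_cache };
static struct nsproxy init_nsproxy = { &init_pid_ns };

enum verdict detect_escape(const struct task_struct *t)
{
    assert(t != NULL && t->nsproxy != NULL);

    if (t->nsproxy != &init_nsproxy)   /* S202: not in the initial namespace */
        return DETECTION_ENDED;        /* S207 */
    if (t->uid != 0 && t->gid != 0)    /* S203: neither UID nor GID is root */
        return DETECTION_ENDED;        /* S207 */
    if (t->level != 0)                 /* S204: level contradicts nsproxy */
        return ESCAPE_ALARM;           /* S206 */
    if (t->slab_cache != t->nsproxy->pid_ns->pid_cachep) /* S205 */
        return ESCAPE_ALARM;           /* S206 */
    return DETECTION_ENDED;            /* S207 */
}
```

The cheap pointer and integer comparisons come first, so most processes leave the chain before the cache address comparison is reached.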
Refer to Figure 8A, which is a flowchart of a container escape detection method provided by an embodiment of the present application. The method may include some or all of the following steps:
S301: The host machine checks whether the mount point of the process is the root directory.
In one implementation, the host machine obtains the data structure (task_struct) of the process; task_struct includes the mount point identifier fs of the process. The host machine checks whether the mount point identifier of the process is the root directory identifier (task->fs == init_task.fs), that is, whether the task->fs of the process equals init_task.fs. When the mount point identifier of the process equals the root directory identifier, the host machine determines that the mount point of the process is the root directory and executes step S302; otherwise, it executes step S304.
In some embodiments, the host machine starts checking whether the current process has escaped, that is, starts executing step S301, when the system calls the current process, or when the current process undergoes process scheduling, process exit, or namespace copy operations. Note that the present application can be applied to scenarios with high real-time requirements for escape detection.
A mount point is an operation directory. Note that a mount operation hides the files that were originally in the Linux directory, so it is best to create a new, empty operation directory for mounting instead of using one of Linux's own directories; after mounting, this operation directory is called the mount point.
The mount point identifier indicates the file system fs on which the process can operate; such operations include reading and writing files.
Note that S301 is a check against bypasses through init_fs: an fs-based detection mechanism is added to the kernel code so that escapes to the root directory can be detected.
S302: The host machine checks whether the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
In one implementation, the host machine obtains the address space pid_cachep of the process and, based on pid_cachep, determines the address slab_cache of the cache area corresponding to the process. When slab_cache equals the address pid_cachep in the namespace in which the process resides, the host machine determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
In some embodiments, the host machine executes step S304 when the address of the cache area to which the process belongs equals the address in the namespace in which the process resides, and executes step S303 when the two addresses are not equal.
S303: The host machine issues alarm information, which indicates that a container escape has occurred for the process.
In some embodiments, the host machine displays warning information on the screen to indicate that a container escape has occurred for the current process. The warning information may include information about the current process.
Note that the host machine may also raise the alert in other ways, for example by recording the escape behavior of the current process in the kernel log (dmesg), or by restarting the host machine. The embodiments of the present application do not limit how the host machine issues the alarm information.
S304: The host machine ends the detection.
In one implementation, the host machine ends the detection when it determines that the mount point of the process is not the root directory. Note that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs, so to keep those files secure the embodiment of the present application needs to check the process; if the mount point of the process is not the root directory, the host machine can end the detection for the process.
In another implementation, the host machine ends the detection when it determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides. Note that the address of the cache area to which the process belongs is highly trustworthy: when it equals the address in the namespace in which the process resides, this shows that the process really does have permission to access that namespace, so the detection for the process can end.
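Steps S301 to S304 follow the same pattern as Figure 6, with the fs pointer as the entry condition. The sketch below is a user-space illustration only: the structures are reduced stand-ins, the namespace whose pid_cachep is compared is stored on the task as a hypothetical pid_ns field, and detect_fs_escape is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

enum verdict { DETECTION_ENDED, ESCAPE_ALARM };

/* Simplified stand-ins for the kernel structures. */
struct fs_struct { int users; };
struct kmem_cache { int id; };
struct pid_namespace { struct kmem_cache *pid_cachep; };

struct task_struct {
    struct fs_struct *fs;          /* mount point identifier */
    struct pid_namespace *pid_ns;  /* namespace the task claims to be in (hypothetical field) */
    struct kmem_cache *slab_cache; /* cache that actually backs the task's pid object */
};

/* Stand-ins for the initial task and its objects. */
static struct fs_struct init_fs;
static struct kmem_cache init_pid_cache;
static struct pid_namespace init_pid_ns = { &init_pid_cache };
static struct task_struct init_task = { &init_fs, &init_pid_ns, &init_pid_cache };

enum verdict detect_fs_escape(const struct task_struct *t)
{
    assert(t != NULL && t->pid_ns != NULL);

    if (t->fs != init_task.fs)                    /* S301: mount point is not root */
        return DETECTION_ENDED;                   /* S304 */
    if (t->slab_cache == t->pid_ns->pid_cachep)   /* S302: cache addresses match */
        return DETECTION_ENDED;                   /* S304 */
    return ESCAPE_ALARM;                          /* S303 */
}
```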
The method embodiment shown in Figure 8A covers many possible implementations. Some of them are illustrated below with reference to Figure 8B. For related concepts, operations, or logical relationships not explained for Figure 8B, refer to the corresponding description of the embodiment shown in Figure 8A.
Refer to Figure 8B, which is a flowchart of another container escape detection method provided by an embodiment of the present application. The method may include some or all of the following steps:
S401: The host machine obtains the data structure of the process.
The data structure (task_struct) may include the mount point identifier of the process, the user identifier (UID or GID), and the level of the namespace in which the process resides.
In some embodiments, the host machine starts checking whether the current process has escaped, that is, starts executing step S401, when the system calls the current process, or when the current process undergoes process scheduling, process exit, or namespace copy operations. Note that the present application can be applied to scenarios with high real-time requirements for escape detection.
S402: The host machine checks whether the mount point of the process is the root directory.
In one implementation, the host machine obtains the data structure (task_struct) of the process; task_struct includes the mount point identifier fs of the process. The host machine checks whether the mount point identifier of the process is the root directory identifier (task->fs == init_task.fs). When the mount point identifier of the process equals the root directory identifier, the host machine determines that the mount point of the process is the root directory and executes step S403; otherwise, it executes step S407.
S403: The host machine checks whether the user identifier is the identifier of the root user.
In one implementation, the host machine checks whether the user identifier is the identifier of the root user, that is, it checks the UID/GID of the process to determine whether the current process is a root process.
The user identifier may be at least one of the user identifier UID and the user group identifier GID; the identifier of the root user may be zero.
In some embodiments, the host machine executes step S404 when at least one of the user identifier UID and the user group identifier GID is zero, and executes step S407 when it determines that the user identifier is not the identifier of the root user.
S404: The host machine checks whether the level of the namespace in which the process resides is zero.
In some embodiments, the host machine checks whether the level of the process is zero, that is, whether the level of the process's nsproxy is the level of init_nsproxy (0). When the level of the namespace in which the process resides is zero, it executes step S405; when the level is not zero, it executes step S406.
S405: The host machine checks whether the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
In one implementation, the host machine obtains the address space pid_cachep of the process and, based on pid_cachep, determines the address slab_cache of the cache area corresponding to the process. For example, it finds the head address of the namespace's slab_cache from the pid_cachep of the process (virt_to_head_page). It then checks whether this is the same as the pid_cachep of the current namespace, that is, whether the slab_cache of the process == ns->pid_cachep. When slab_cache equals the address pid_cachep in the namespace in which the process resides, the host machine determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides.
Here, page is the virtual address space of the object; slab_cache points to the slab manager that the current page refers to, and subsequent memory allocations are made from this cache; pid_cachep points to the address of the slab from which the pid is allocated.
In some embodiments, the host machine executes step S407 when the address of the cache area to which the process belongs equals the address in the namespace in which the process resides, and executes step S406 when the two addresses are not equal.
Note that S405 is a check on slab_cache. The principle is as follows: (1) the slab_cache of a namespace is created with the no-merge attribute, which prevents caches of similar size from being merged; (2) when the pid_cachep of a process is allocated, a new slab cache is created directly, so the pid_cachep of each process is different. In other words, because of the no-merge attribute, new memory is created directly when the process is created, no reuse occurs, and the slab address of each process is necessarily different. Merging is a feature of the slab allocator: when the merge attribute is specified and cache memory is being allocated, a reusable cache block of similar size, if one is found, is referenced through an alias instead of a new cache being created.
S406: The host machine issues alarm information, which indicates that a container escape has occurred for the process.
In some embodiments, the host machine displays warning information on the screen to indicate that a container escape has occurred for the current process. The warning information may include information about the current process.
Note that the host machine may also raise the alert in other ways, for example by recording the escape behavior of the current process in the kernel log (dmesg), or by restarting the host machine. The embodiments of the present application do not limit how the host machine issues the alarm information.
S407: The host machine ends the detection.
In one implementation, the host machine ends the detection when it determines that the mount point of the process is not the root directory. Note that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs, so to keep those files secure the embodiment of the present application needs to check the process; if the mount point of the process is not the root directory, the host machine can end the detection for the process.
In another implementation, the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. Note that a process whose user identifier is not the root user's identifier has low privileges and is unlikely to escape the container, so the detection ends.
In yet another implementation, the host machine ends the detection when it determines that the address of the cache area to which the process belongs equals the address in the namespace in which the process resides. Note that the address of the cache area to which the process belongs is highly trustworthy: when it equals the address in the namespace in which the process resides, this shows that the process really does have permission to access that namespace, so the detection for the process can end.
In other embodiments, the host machine may execute the methods of FIG. 7 and FIG. 8A at the same time. That is, after obtaining the data structure of the process, the host machine may execute steps S202 and S402, and when the process satisfies either of two conditions, namely that the namespace where the process is located is the initial namespace or that the mount point of the process is the root directory, execute steps S203 to S207 or S403 to S407. This method combines the detection of escapes to fs and to the namespace with the slab_cache-based detection mechanism, and can detect escapes in which a process modifies its namespace to gain access to all namespaces.
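The combined gating described above can be sketched as follows. This is a minimal user-space model in Python, not kernel code: the function names and the boolean and integer parameters are hypothetical, and the final comparison assumes that the later steps reduce to comparing the cache-area address with the address recorded in the namespace, as the device embodiments describe.

```python
def should_run_cache_check(in_initial_namespace: bool,
                           mount_point_is_root: bool) -> bool:
    """Gate modeled on S202/S402: continue when either condition holds."""
    return in_initial_namespace or mount_point_is_root


def combined_escape_check(in_initial_namespace: bool,
                          mount_point_is_root: bool,
                          slab_cache_addr: int,
                          pid_cache_addr: int) -> bool:
    """Return True when a container escape should be reported."""
    if not should_run_cache_check(in_initial_namespace, mount_point_is_root):
        return False  # neither condition holds: end detection
    # A mismatch between the cache-area address and the address recorded
    # in the namespace is treated as evidence of tampering.
    return slab_cache_addr != pid_cache_addr
```

A process that passes neither gate is never compared, so the address comparison only runs for processes that appear to have host-level visibility.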
Please refer to FIG. 9A, which is a flowchart of a container escape detection method provided by an embodiment of the present application. The method may include some or all of the following steps:
S501: The host machine detects whether the mount point of the process is the root directory.
In one implementation, the host machine may obtain the data structure (task_struct) of the process, where task_struct includes the mount point identifier fs of the process. The host machine may detect whether the mount point identifier of the process is the root directory identifier (task->fs == init_task.fs). When the mount point identifier of the process equals the root directory identifier, the host machine determines that the mount point of the process is the root directory. When the mount point of the process is the root directory, step S502 is executed; otherwise, step S504 is executed.
In some embodiments, the host machine may start detecting whether the current process has escaped, that is, start executing step S501, when the system invokes the current process, or when the current process performs operations related to process scheduling, process exit, or namespace copying. It should be noted that the present application can be applied to scenarios that require highly real-time escape detection.
The mount point is an operation directory. It should be noted that a mount operation hides the files originally in the Linux directory used as its target, so it is best to create a new, empty operation directory for mounting rather than use one of Linux's own directories. After mounting, this operation directory is called the mount point.
The mount point identifier indicates the file system fs that the process can operate on, where the operations include reading and writing files.
It should be noted that S501 is a detection of init_fs bypass; that is, an fs-based detection mechanism is added to the kernel code to detect escapes to the root directory.
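The S501 check can be modeled outside the kernel as an identity comparison between the process's fs reference and that of the init task. The following is a minimal Python sketch rather than kernel code; `FsStruct`, `Task`, `INIT_FS`, and `mount_point_is_root` are hypothetical names standing in for the fs structure, task_struct, and init_task.fs.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps identity semantics, like comparing pointers
class FsStruct:
    """Hypothetical stand-in for the fs structure referenced by task_struct."""
    root: str


@dataclass
class Task:
    """Hypothetical stand-in for task_struct; only the field S501 reads."""
    pid: int
    fs: FsStruct


# Stand-in for init_task.fs, the file system context of the host's init task.
INIT_FS = FsStruct(root="/")


def mount_point_is_root(task: Task) -> bool:
    """Model of S501: task->fs == init_task.fs, compared by identity."""
    return task.fs is INIT_FS
```

Identity rather than value comparison is used deliberately: a container whose own root is also named "/" holds a different fs object and so does not match the host's.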
S502: The host machine detects whether target data in the data structure of the process meets a preset condition.
In some embodiments, the host machine may obtain the target data from the data structure of the process and then detect whether the target data meets the preset condition.
In one implementation, the target data is the level of the namespace where the process is located. The host machine determines whether that level is zero; when the level is zero, the host machine determines that the target data in the data structure of the process meets the preset condition and executes step S503. It should be noted that a namespace level of zero shows that the process has high privileges, so an alarm is raised.
In another implementation, the target data is the user identifier of the process. The host machine may detect whether the user identifier of the process is the identifier of the root user; when it is, the host machine determines that the target data in the data structure of the process meets the preset condition and executes step S503. It should be noted that a user identifier equal to the identifier of the root user shows that the process has high privileges, so an alarm is raised.
The user identifier may be at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user may be zero.
In some embodiments, when the host machine determines that the target data in the data structure of the process meets the preset condition, step S503 is executed; otherwise, step S504 is executed.
S503: The host machine presents alarm information, where the alarm information indicates that a container escape has occurred in the process.
In some embodiments, the host machine may display warning information on the screen, indicating that a container escape has occurred in the current process. The warning information may include information about the current process.
It should be noted that the host machine may also present the warning in other ways, for example by recording the escape behavior of the current process in the kernel log (dmesg), or by restarting the host machine. The embodiments of the present application do not limit the manner in which the host machine presents the warning information.
S504: The host machine ends the detection.
In one implementation, the host machine ends the detection when it determines that the mount point of the process is not the root directory. It should be noted that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs, so to protect those files the embodiments of the present application need to check the process; if the mount point of the process is not the root directory, the host machine can end the detection of the process.
In another implementation, the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. It should be noted that when the user identifier of the process is not the identifier of the root user, the process has lower privileges and a container escape is less likely, so the detection is ended.
The method embodiment shown in FIG. 9A above includes many possible implementations. Some of them are illustrated below with reference to FIG. 9B. For related concepts, operations, or logical relationships not explained for FIG. 9B, refer to the corresponding descriptions of the embodiment shown in FIG. 9A.
Please refer to FIG. 9B, which is a flowchart of another container escape detection method provided by an embodiment of the present application. The method may include some or all of the following steps:
S601: The host machine obtains the data structure of the process.
The data structure (task_struct) may include the mount point identifier of the process, the user identifier (UID or GID), and the level of the namespace where the process is located.
In some embodiments, the host machine may start detecting whether the current process has escaped, that is, start executing step S601, when the system invokes the current process, or when the current process performs operations related to process scheduling, process exit, or namespace copying. It should be noted that the present application can be applied to scenarios that require highly real-time escape detection.
S602: The host machine detects whether the mount point of the process is the root directory.
In one implementation, the host machine may obtain the data structure (task_struct) of the process, where task_struct includes the mount point identifier fs of the process. The host machine may detect whether the mount point identifier of the process is the root directory identifier (task->fs == init_task.fs). When the mount point identifier of the process equals the root directory identifier, the host machine determines that the mount point of the process is the root directory. When the mount point of the process is the root directory, step S603 is executed; otherwise, step S606 is executed.
S603: The host machine detects whether the user identifier is the identifier of the root user.
The user identifier may be at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user may be zero.
In some embodiments, when at least one of the UID and the GID is zero, the host machine executes step S604; when it determines that the user identifier is not the identifier of the root user, it executes step S606.
S604: The host machine detects whether the level of the namespace where the process is located is zero.
In some embodiments, the host machine detects whether the level of the process is zero. When the level of the namespace where the process is located is not zero, step S605 is executed; when the level is zero, step S606 is executed.
S605: The host machine presents alarm information, where the alarm information indicates that a container escape has occurred in the process.
In some embodiments, the host machine may display warning information on the screen, indicating that a container escape has occurred in the current process. The warning information may include information about the current process.
It should be noted that the host machine may also present the warning in other ways, for example by recording the escape behavior of the current process in the kernel log (dmesg), or by restarting the host machine. The embodiments of the present application do not limit the manner in which the host machine presents the warning information.
S606: The host machine ends the detection.
In one implementation, the host machine ends the detection when it determines that the mount point of the process is not the root directory. It should be noted that if the mount point of the process is the root directory, the process can access all files of the mounted file system fs, so to protect those files the embodiments of the present application need to check the process; if the mount point of the process is not the root directory, the host machine can end the detection of the process.
In another implementation, the host machine ends the detection when it determines that the user identifier of the process is not the identifier of the root user. It should be noted that when the user identifier of the process is not the identifier of the root user, the process has lower privileges and a container escape is less likely, so the detection is ended.
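The decision chain of steps S602 to S606 can be summarized in a short sketch. This is an illustrative Python model under stated assumptions, not the kernel implementation: `Verdict`, `TaskInfo`, and `detect_escape` are hypothetical names, and the mount point comparison is assumed to have been reduced to a boolean beforehand.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    ALERT = "alert"  # S605: report a container escape
    END = "end"      # S606: end detection


@dataclass
class TaskInfo:
    """Hypothetical snapshot of the task_struct fields read in S601."""
    fs_is_init_fs: bool  # result of task->fs == init_task.fs
    uid: int
    gid: int
    ns_level: int        # level of the namespace the process is in


def detect_escape(task: TaskInfo) -> Verdict:
    # S602: only processes whose mount point is the root directory go on.
    if not task.fs_is_init_fs:
        return Verdict.END
    # S603: continue when at least one of UID and GID is zero.
    if task.uid != 0 and task.gid != 0:
        return Verdict.END
    # S604: a non-zero namespace level leads to the alert (S605);
    # level zero ends detection (S606).
    if task.ns_level != 0:
        return Verdict.ALERT
    return Verdict.END
```

For example, `detect_escape(TaskInfo(True, 0, 0, 1))` yields `Verdict.ALERT`, while the same process at namespace level zero yields `Verdict.END`.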
The methods of the embodiments of the present application are described in detail above; the devices of the embodiments are described below.
Please refer to FIG. 10, which is a schematic structural diagram of a container escape detection device 100 provided by an embodiment of the present application. The device 100 may include an acquisition unit 1001 and a prompt unit 1002, and may further include a determination unit 1003. The container escape detection device 100 is used to implement the foregoing container escape detection method, for example the method of either of the embodiments shown in FIG. 6 or FIG. 7.
The container escape detection device 100 includes:
an acquisition unit 1001, configured to acquire the address of the cache area to which the process belongs when it is determined that the namespace where the process is located is the initial namespace; and
a prompt unit 1002, configured to present alarm information when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, where the alarm information indicates that a container escape has occurred in the process.
In the embodiments of the present application, the namespace where the process is located being the initial namespace means that the process can access the namespace indicated by init_nsproxy; to keep that namespace secure, the process needs to be checked. The host machine can therefore check the process based on slab_cache, the address of the cache area to which the process belongs. Because slab_cache is highly trustworthy, it is compared with pid_cache, the address in the namespace where the process is located. When the two are not equal, it can be determined that the process does not have permission to access the namespace indicated by init_nsproxy; that is, the process has been maliciously tampered with and a container escape has occurred. This method is accurate and can detect container escapes effectively.
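The comparison performed by the acquisition unit and the prompt unit can be illustrated with a small model. The sketch below is Python written for illustration only: `ProcessView`, `INIT_NSPROXY`, `check_namespace_escape`, and the alert text are hypothetical, and the integer fields merely stand in for the kernel identifiers nsproxy, slab_cache, and pid_cache named in the text.

```python
from dataclasses import dataclass
from typing import Optional

INIT_NSPROXY = 0xA000  # hypothetical value standing in for init_nsproxy


@dataclass
class ProcessView:
    pid: int
    nsproxy: int     # identifier of the namespace the process is in
    slab_cache: int  # address of the cache area the process belongs to
    pid_cache: int   # address recorded in the process's namespace


def check_namespace_escape(proc: ProcessView) -> Optional[str]:
    """Return an alert message on a suspected escape, otherwise None."""
    if proc.nsproxy != INIT_NSPROXY:
        return None  # not in the initial namespace: this check does not apply
    if proc.slab_cache == proc.pid_cache:
        return None  # addresses agree: end detection
    return f"container escape suspected: pid={proc.pid}"
```

Returning the message instead of printing it leaves the choice of reporting channel (screen, kernel log, or another action) to the caller, in line with the text's statement that the manner of presenting the warning is not limited.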
It should be noted that the embodiments of the present application add a mount-point-based detection mechanism to the kernel code, which can detect escapes to the root directory and effectively monitor container escapes that exploit kernel vulnerabilities, thereby improving system security.
In a possible implementation, the acquisition unit 1001 is configured to:
acquire the address of the cache area corresponding to the process based on the address space of the process.
In the embodiments of the present application, the address space pid_cachep of the process is not easily tampered with and is highly trustworthy. Therefore, the address slab_cache of the cache area corresponding to the process, which is determined based on pid_cachep, is also highly trustworthy.
In a possible implementation,
the acquisition unit 1001 is configured to acquire the user identifier of the process when it is determined that the namespace where the process is located is the initial namespace; and
the acquisition unit 1001 is configured to acquire the address of the cache area to which the process belongs when it is determined that the user identifier is the identifier of the root user.
In a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
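The staged behavior of the acquisition unit, in which the cache-area address is read only after the namespace and root-user conditions hold, might look like the sketch below. It is a Python illustration under assumptions: `acquire_cache_address` and the `read_cache_address` callback are hypothetical, and "root user" is interpreted here as either the UID or the GID being zero.

```python
from typing import Callable, Optional


def acquire_cache_address(in_initial_namespace: bool,
                          uid: int,
                          gid: int,
                          read_cache_address: Callable[[], int]) -> Optional[int]:
    """Read the cache-area address only when both gating conditions hold."""
    if not in_initial_namespace:
        return None  # namespace condition not met: nothing is read
    if uid != 0 and gid != 0:
        return None  # neither UID nor GID is zero: not treated as root
    return read_cache_address()
```

Passing the read as a callback makes the ordering explicit: the address lookup is skipped entirely for processes that fail either condition.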
In a possible implementation, the acquisition unit 1001 is configured to:
acquire the level of the namespace where the process is located when it is determined that the namespace where the process is located is the initial namespace; and
acquire the address of the cache area to which the process belongs when it is determined that the level is zero.
In a possible implementation, the device further includes a determination unit 1003;
the acquisition unit 1001 is configured to acquire the data structure of the process, where the data structure includes the identifier of the namespace where the process is located; and
the determination unit 1003 is configured to determine that the namespace where the process is located is the initial namespace when the identifier nsproxy of the namespace where the process is located is equal to the identifier init_nsproxy of the initial namespace.
It should be noted that, for the implementation of each unit, reference may also be made to the corresponding descriptions of the embodiments shown in FIG. 6 or FIG. 7.
It can be understood that, in the device embodiments of the present application, the division into multiple units or modules is only a logical division by function and does not limit the specific structure of the device. In a specific implementation, some functional modules may be subdivided into smaller functional modules, and some functional modules may be combined into one functional module; whether these functional modules are subdivided or combined, the general process executed by the device 100 during container escape detection is the same. Usually, each unit corresponds to its own program code (or program instructions), and when the program code corresponding to a unit runs on a processor, the unit executes the corresponding process and thereby implements the corresponding function.
Please refer to FIG. 11, which is a schematic structural diagram of a container escape detection device 110 provided by an embodiment of the present application. The device 110 may include an acquisition unit 1101 and a prompt unit 1102, and may further include a determination unit 1103. The container escape detection device 110 is used to implement the foregoing container escape detection method, for example the method of either of the embodiments shown in FIG. 8A or FIG. 8B.
The container escape detection device 110 includes:
an acquisition unit 1101, configured to acquire the address of the cache area to which the process belongs when it is determined that the mount point of the process is the root directory; and
a prompt unit 1102, configured to present alarm information when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, where the alarm information indicates that a container escape has occurred in the process.
In the embodiments of the present application, the mount point of the process being the root directory means that the process can access all files under the root directory; to protect those files, the process needs to be checked. The host machine can therefore check the process based on slab_cache, the address of the cache area to which the process belongs. Because this address is highly trustworthy, it is compared with pid_cache, the address in the namespace where the process is located. When the two are not equal, the process is considered not to have originally had permission to access all files under the root directory; the process has been maliciously tampered with and a container escape has occurred.
It should be noted that the embodiments of the present application add a mount-point-based detection mechanism to the kernel code, which can detect escapes to the root directory and effectively monitor container escapes that exploit kernel vulnerabilities, thereby improving system security.
In a possible implementation, the acquisition unit 1101 is configured to:
acquire the address of the cache area corresponding to the process based on the address space of the process.
In a possible implementation, the acquisition unit 1101 is configured to:
acquire the user identifier of the process when it is determined that the namespace where the process is located is the initial namespace; and
acquire the address of the cache area to which the process belongs when it is determined that the user identifier of the process is the identifier of the root user.
In a possible implementation, the user identifier is at least one of a user identifier (UID) and a user group identifier (GID); the identifier of the root user is zero.
In a possible implementation, the acquisition unit 1101 is configured to:
acquire the level of the namespace where the process is located when it is determined that the namespace where the process is located is the initial namespace; and
acquire the address of the cache area to which the process belongs when it is determined that the level is zero.
In a possible implementation, the device further includes a determination unit 1103;
the acquisition unit 1101 is configured to acquire the data structure of the process, where the data structure includes the mount point identifier of the process; and
the determination unit 1103 is configured to determine that the mount point of the process is the root directory when the mount point identifier is the root directory identifier.
It should be noted that, for the implementation of each unit, reference may also be made to the corresponding descriptions of the embodiments shown in FIG. 8A or FIG. 8B.
It can be understood that, in the device embodiments of the present application, the division into multiple units or modules is only a logical division by function and does not limit the specific structure of the device. In a specific implementation, some functional modules may be subdivided into smaller functional modules, and some functional modules may be combined into one functional module; whether these functional modules are subdivided or combined, the general process executed by the device 110 during container escape detection is the same. Usually, each unit corresponds to its own program code (or program instructions), and when the program code corresponding to a unit runs on a processor, the unit executes the corresponding process and thereby implements the corresponding function.
Please refer to FIG. 12, which is a schematic structural diagram of another container escape detection device 120 provided by an embodiment of the present application. The device 120 includes at least one processor 1201, at least one memory 1202, and at least one communication interface 1203. The processor 1201, the memory 1202, and the communication interface 1203 are connected through a communication bus and communicate with one another.
The processor 1201 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the above solutions.
The communication interface 1203 is used to communicate with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The memory 1202 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, or an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory may exist independently and be connected to the processor through a bus. The memory may also be integrated with the processor.
The memory 1202 is used to store the application program code for executing the above solutions, and execution is controlled by the processor 1201. The processor 1201 is used to execute the application program code stored in the memory 1202.
The code stored in the memory 1202 can execute any of the container escape detection methods provided above, for example:
when it is determined that the namespace where the process is located is the initial namespace, acquiring the address of the cache area to which the process belongs; and when it is determined that the address of the cache area to which the process belongs is not equal to the address in the namespace where the process is located, presenting alarm information, where the alarm information indicates that a container escape has occurred in the process.
An embodiment of the present application further provides an electronic device that includes one or more processors and one or more memories. The one or more memories are coupled to the one or more processors and are used to store computer program code; the computer program code includes computer instructions, and when the one or more processors execute the computer instructions, the electronic device executes the methods described in the above embodiments.
An embodiment of the present application further provides a computer program product containing instructions; when the computer program product runs on an electronic device, the electronic device executes the methods described in the above embodiments.
An embodiment of the present application further provides a computer storage medium, where the computer storage medium may store a program, and when the program is executed, it performs some or all of the steps of any of the container escape detection methods described in the above method embodiments.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions, but those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are only illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
本领域普通技术人员可以理解上述实施例的各种方法中的全部或部分步骤是可以通过程序来指令相关的硬件来完成,该程序可以存储于一计算机可读存储器中,存储器可以包括:闪存盘、只读存储器(英文:Read-Only Memory,简称:ROM)、随机存取器(英文:Random Access Memory,简称:RAM)、磁盘或光盘等。A person skilled in the art may understand that all or part of the steps in the various methods of the above embodiments may be completed by instructing related hardware through a program, and the program may be stored in a computer-readable memory, and the memory may include: a flash drive, a read-only memory (ROM), a random access memory (RAM), a disk or an optical disk, etc.
以上对本申请实施例进行了详细介绍,本文中应用了具体个例对本申请的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本申请的方法及其核心思想;同时,对于本领域的一般技术人员,依据本申请的思想,在具体实施方式及应用范围上均会有改变之处,综上上述,本说明书内容不应理解为对本申请的限制。 The embodiments of the present application are introduced in detail above. Specific examples are used in this article to illustrate the principles and implementation methods of the present application. The description of the above embodiments is only used to help understand the method and core idea of the present application. At the same time, for general technical personnel in this field, according to the idea of the present application, there will be changes in the specific implementation method and application scope. In summary, the content of this specification should not be understood as a limitation on the present application.

Claims (15)

  1. A method for detecting container escape, characterized in that the method comprises:
    when a host machine determines that the namespace in which a process is located is the initial namespace, obtaining, by the host machine, the address of the cache area to which the process belongs; and
    when the host machine determines that the address of the cache area to which the process belongs is not equal to the address in the namespace in which the process is located, prompting, by the host machine, alarm information, wherein the alarm information is used to indicate that a container escape has occurred for the process.
  2. The method according to claim 1, characterized in that obtaining the address of the cache area to which the process belongs comprises:
    obtaining, by the host machine, the address of the cache area corresponding to the process based on the address space of the process.
  3. The method according to claim 1 or 2, characterized in that obtaining, by the host machine, the address of the cache area to which the process belongs when determining that the namespace in which the process is located is the initial namespace comprises:
    when the host machine determines that the namespace in which the process is located is the initial namespace, obtaining, by the host machine, the user identifier of the process; and
    when the host machine determines that the user identifier is the identifier of the root user, obtaining, by the host machine, the address of the cache area to which the process belongs.
  4. The method according to claim 3, characterized in that the user identifier is at least one of a user identifier (UID) and a user group identifier (GID), and the identifier of the root user is zero.
  5. The method according to any one of claims 1 to 4, characterized in that obtaining, by the host machine, the address of the cache area to which the process belongs when determining that the namespace in which the process is located is the initial namespace comprises:
    when the host machine determines that the namespace in which the process is located is the initial namespace, obtaining, by the host machine, the level of the namespace in which the process is located; and
    when the host machine determines that the level is zero, obtaining, by the host machine, the address of the cache area to which the process belongs.
  6. The method according to any one of claims 1 to 5, characterized in that, before obtaining the address of the cache area to which the process belongs, the method further comprises:
    obtaining, by the host machine, a data structure of the process, wherein the data structure includes the identifier of the namespace in which the process is located; and
    when the identifier of the namespace in which the process is located is equal to the identifier of the initial namespace, determining, by the host machine, that the namespace in which the process is located is the initial namespace.
  7. A method for detecting container escape, characterized in that the method comprises:
    when the host machine determines that the mount point of the process is the root directory, obtaining, by the host machine, the address of the cache area to which the process belongs; and
    when the host machine determines that the address of the cache area to which the process belongs is not equal to the address in the namespace in which the process is located, prompting, by the host machine, alarm information, wherein the alarm information is used to indicate that a container escape has occurred for the process.
  8. The method according to claim 7, characterized in that the method comprises:
    obtaining, by the host machine, the address of the cache area corresponding to the process based on the address space of the process.
  9. The method according to claim 7 or 8, characterized in that obtaining, by the host machine, the address of the cache area to which the process belongs when determining that the namespace in which the process is located is the initial namespace comprises:
    when the host machine determines that the namespace in which the process is located is the initial namespace, obtaining, by the host machine, the user identifier of the process; and
    when the host machine determines that the user identifier of the process is the identifier of the root user, obtaining, by the host machine, the address of the cache area to which the process belongs.
  10. The method according to claim 9, characterized in that the user identifier is at least one of a user identifier (UID) and a user group identifier (GID), and the identifier of the root user is zero.
  11. The method according to any one of claims 7 to 10, characterized in that obtaining, by the host machine, the address of the cache area to which the process belongs when determining that the namespace in which the process is located is the initial namespace comprises:
    when the host machine determines that the namespace in which the process is located is the initial namespace, obtaining, by the host machine, the level of the namespace in which the process is located; and
    when the host machine determines that the level is zero, obtaining, by the host machine, the address of the cache area to which the process belongs.
  12. The method according to any one of claims 7 to 11, characterized in that, before obtaining the address of the cache area to which the process belongs, the method further comprises:
    obtaining, by the host machine, a data structure of the process, wherein the data structure includes a mount point identifier of the process; and
    when the mount point identifier is the root directory identifier, determining, by the host machine, that the mount point of the process is the root directory.
  13. A method for detecting container escape, characterized in that the method comprises:
    when the host machine determines that the mount point of the process is the root directory, obtaining, by the host machine, target data from a data structure of the process; and
    when the target data meets a preset condition, prompting, by the host machine, alarm information, wherein the alarm information is used to indicate that a container escape has occurred for the process.
  14. An electronic device, characterized in that the electronic device comprises one or more processors and one or more memories, wherein the one or more memories are coupled to the one or more processors and are configured to store computer program code, the computer program code comprises computer instructions, and when the one or more processors execute the computer instructions, the electronic device is caused to perform the method according to any one of claims 1 to 13.
  15. A computer-readable storage medium comprising instructions, characterized in that, when the instructions are run on an electronic device, the electronic device is caused to perform the method according to any one of claims 1 to 14.
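
For illustration only, the decision flow recited in claims 1 to 12 can be sketched in Python as follows. This is a minimal sketch under stated assumptions and is not the applicant's implementation: the `ProcInfo` record, its field names, the constants `INIT_NS_ID` and `ROOT_MOUNT`, and the alarm string are all hypothetical stand-ins for the process data structure the host machine reads. The sketch also assumes that the optional checks of claims 3 to 5 are applied together, and that "at least one of UID and GID" means either identifier being zero is sufficient.

```python
from dataclasses import dataclass
from typing import Optional

INIT_NS_ID = 1    # hypothetical identifier of the initial namespace
ROOT_ID = 0       # claims 4 and 10: the root user's identifier is zero
ROOT_MOUNT = "/"  # hypothetical identifier of the root-directory mount point


@dataclass
class ProcInfo:
    """Hypothetical snapshot of the per-process data structure."""
    pid: int
    ns_id: int          # identifier of the namespace the process is in
    ns_level: int       # level of that namespace (0 = top level)
    uid: int
    gid: int
    mount_point: str
    cache_addr: int     # address of the cache area the process belongs to
    ns_cache_addr: int  # address associated with the process's namespace


def passes_gate(p: ProcInfo, use_mount_point: bool = False) -> bool:
    """Return True when the process appears to run at host level as root."""
    if use_mount_point:
        host_level = p.mount_point == ROOT_MOUNT    # claims 7 and 12
    else:
        host_level = p.ns_id == INIT_NS_ID          # claims 1 and 6
    is_root = p.uid == ROOT_ID or p.gid == ROOT_ID  # claims 3-4 (assumed "or")
    return host_level and is_root and p.ns_level == 0  # claim 5


def detect_escape(p: ProcInfo, use_mount_point: bool = False) -> Optional[str]:
    """Return alarm text when the two addresses differ, otherwise None."""
    if not passes_gate(p, use_mount_point):
        return None
    if p.cache_addr != p.ns_cache_addr:
        return f"ALERT: possible container escape, pid={p.pid}"
    return None
```

In this sketch, a process that passes the gate and whose two addresses match produces no alarm, while the same process with mismatched addresses produces the alarm string. Setting `use_mount_point=True` switches the gate from the namespace comparison of claim 1 to the mount-point comparison of claim 7.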
PCT/CN2023/121088 2022-09-29 2023-09-25 Container escape detection method, electronic device, and system WO2024067479A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211200843.7 2022-09-29
CN202211200843.7A CN117827362A (en) 2022-09-29 2022-09-29 Container escape detection method, electronic equipment and system

Publications (1)

Publication Number Publication Date
WO2024067479A1 (en) 2024-04-04

Family

ID=90476266

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/121088 WO2024067479A1 (en) 2022-09-29 2023-09-25 Container escape detection method, electronic device, and system

Country Status (2)

Country Link
CN (1) CN117827362A (en)
WO (1) WO2024067479A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200334362A1 (en) * 2019-04-22 2020-10-22 Cyberark Software Ltd. Securing privileged virtualized execution instances
CN111881453A (en) * 2020-07-20 2020-11-03 北京百度网讯科技有限公司 Container escape detection method and device and electronic equipment
CN114547594A (en) * 2022-01-24 2022-05-27 华北电力大学 Penetration attack detection method for intelligent Internet of things terminal container
CN114676424A (en) * 2022-05-25 2022-06-28 杭州默安科技有限公司 Container escape detection and blocking method, device, equipment and storage medium
CN114968494A (en) * 2022-06-23 2022-08-30 杭州默安科技有限公司 Container escape detection method and system

Also Published As

Publication number Publication date
CN117827362A (en) 2024-04-05

Similar Documents

Publication Publication Date Title
CN110612512B (en) Protecting virtual execution environments
KR102255767B1 (en) Systems and methods for virtual machine auditing
US8176294B2 (en) Reducing storage expansion of a virtual machine operating system
EP3070604B1 (en) Method and apparatus for accessing physical resources
WO2020244369A1 (en) Inter-process communication method and apparatus, and computer device
EP3637288B1 (en) Method, apparatus and systems for accessing secure world
EP3365794B1 (en) Techniques for protecting memory pages of a virtual computing instance
US10489185B2 (en) Hypervisor-assisted approach for locating operating system data structures based on attribute matching
US11709931B2 (en) Shadow stack violation enforcement at module granularity
US20210311740A1 (en) Circular shadow stack in audit mode
CN112219202A (en) Memory allocation for guest operating systems
CN115362433A (en) Shadow stack enforcement range for dynamic code
WO2023165308A1 (en) Memory reclaim method and apparatus, and control device
WO2024067479A1 (en) Container escape detection method, electronic device, and system
CN108241801B (en) Method and device for processing system call
US11586727B2 (en) Systems and methods for preventing kernel stalling attacks
TW201201102A (en) Resource adjustment methods and systems for virtual machines, and computer program products thereof
US11301282B2 (en) Information protection method and apparatus
US11886900B1 (en) Unikernel hypervisor for managing multi-process applications using unikernel virtual machines
CN115248718A (en) Memory data acquisition method and device and storage medium
US20230401081A1 (en) Software isolation of virtual machine resources
CN113157299B (en) Resource allocation method and system
US12099862B2 (en) Hypervisor-assisted secured memory sharing among host and guest operating system
US11934857B2 (en) Supporting execution of a computer program by using a memory page of another computer program
US20220300314A1 (en) Hypervisor-assisted secured memory sharing among host and guest operating system

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23870710

Country of ref document: EP

Kind code of ref document: A1