CN111597089A - Linux system call event acquisition and caching device and method - Google Patents

Publication number
CN111597089A
Authority
CN
China
Prior art keywords
sysdig
system call
event information
probe
call event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010420465.8A
Other languages
Chinese (zh)
Other versions
CN111597089B (en)
Inventor
吴建亮
胡鹏
王建荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Jeeseen Network Technologies Co Ltd
Original Assignee
Guangzhou Jeeseen Network Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jeeseen Network Technologies Co Ltd
Priority to CN202010420465.8A
Publication of CN111597089A
Application granted
Publication of CN111597089B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/302 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a software system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G06F11/1451 Management of the data involved in backup or backup restore by selection of backup contents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3072 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3065 Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • G06F11/3086 Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves the use of self describing data formats, i.e. metadata, markup languages, human readable formats
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management

Abstract

The invention provides a device and a method for acquiring and caching Linux system call events, and belongs to the field of computer security. The invention collects and stores system call events in real time and enables online deployment of Linux system call event monitoring.

Description

Linux system call event acquisition and caching device and method
Technical Field
The invention relates to the field of computer security, and in particular to a device and a method for acquiring and caching Linux system call events.
Background
Linux is a Unix-like operating system that is free to use and distribute, and supports multiple users, multitasking, multithreading and multiple CPUs. Linux is open-source software with stable performance, and its kernel firewall component is efficient and simple to configure, so Linux is increasingly widely used in enterprise networks. Network operations staff can use Linux as a server and also as a network firewall.
Linux kernel development and debugging, as well as routine operations and maintenance analysis, require monitoring the behavior of the Linux system, which is a matter of particular concern to developers and operations staff. At present, Linux system behavior monitoring is mainly based on Linux system call (syscall) tracepoint technology, and monitors most system calls, network connections, disk file operations, network reads and writes, process behavior, shell command operations and so on. Commonly used tools include strace, ftrace, tcpdump, lsof, htop, iftop and SystemTap.
Existing Linux system behavior monitoring mainly uses the following methods:
1. Linux kprobes debugging technology
kprobes is a lightweight kernel debugging technology designed by kernel developers to make it easier to trace the execution state of kernel functions. With kprobes, kernel developers can dynamically insert probe points into most functions of the kernel to collect the required debugging state information, so that they know which system calls were invoked, when they were invoked, whether they executed correctly, and what the function arguments and return values were, and can print this information to the screen or dump it to a log file.
2. Tracepoints for syscalls in the Linux kernel
A hook is registered on the syscall tracepoint in the kernel so that a user-supplied probe function is called; the probe function records the relevant information and prints it to the screen or dumps it to a log file, thereby achieving monitoring.
Chinese patent document CN104008337B discloses using hooks to monitor the system calls of the Linux kernel. When a hooked system call is invoked by a user-mode process, it is determined whether the user-mode process is on a whitelist; if it is, the user-mode process is allowed to invoke the system call; if it is not, the user-mode process is forbidden from invoking the system call. The whitelist contains one or more user-mode processes that are allowed to perform system calls.
The prior art has at least the following disadvantages:
1. The corresponding build options must be enabled when the kernel is compiled; if they are not enabled, the feature cannot be used and the kernel must be recompiled.
2. The tools are suitable only for debugging by kernel developers or for on-site use by operations staff, and each tool has different characteristics, so they cannot meet the need for comprehensive monitoring of the system.
3. They do not provide good storage for behavior data, offering only simple output or log storage. Without a data caching function, behavior data is easily lost, and later playback or analysis of the data is poorly supported.
4. They cannot be deployed online in real time and can only be started during or after an incident, so they cannot meet the automation requirements of operations or security monitoring. On high-throughput, high-concurrency servers they also increase the server's load.
Disclosure of Invention
To solve the technical problems of the prior art, namely that online real-time deployment is not possible, performance is affected and there is no data caching function, the invention provides a device and a method for acquiring and caching Linux system call events. Building on existing open-source technology, the invention develops a Linux system call event acquisition method and device based on the open-source sysdig project, and a Linux system call event caching method and device based on the open-source SQLite embedded database. It converts the full range of Linux system calls into serialized event information, stores the system call event information in the cache database in real time, and through sysdig provides comprehensive acquisition of system behavior event information together with real-time caching.
The invention provides a Linux system call event collection device, comprising:
a system call event acquisition module, which acquires and processes system call event information and comprises a sysdig-probe kernel driver module;
the sysdig-probe kernel driver module, which collects and processes system call event information and comprises a sysdig-probe filtering module and a sysdig-probe event serialization module; it uses the Linux character device driver technique, registers probe functions on the tracepoints of all system call events in the form of a driver, and, when a system call event occurs, collects the system call event information through the probe functions mounted in the kernel;
the sysdig-probe filtering module, which analyzes and filters system call event information, filters out the information related to specified system call events according to a filtering rule, and classifies the filtered system call event information according to the file descriptor type in the system call;
the sysdig-probe event serialization module, which serializes the system call event information into binary form according to its type and its input and output parameters, and writes the serialized system call event information into the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
Preferably, the filtering rule of the sysdig-probe filtering module is: filter system-call-related information according to the specified system call identifier (syscall ID), the process name and the Linux process ID.
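The filtering rule can be illustrated with a minimal user-space sketch in Python. The actual filter runs inside the kernel driver; the names `SyscallEvent` and `FilterRule` are hypothetical, and the sketch assumes that a rule selects the events to keep and that an empty criterion means "no restriction", neither of which is stated in the text.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class SyscallEvent:
    syscall_id: int      # numeric system call identifier (syscall ID)
    process_name: str    # name of the calling process
    pid: int             # Linux process ID


@dataclass
class FilterRule:
    # Assumption: an empty set means "do not restrict on this field".
    syscall_ids: Set[int] = field(default_factory=set)
    process_names: Set[str] = field(default_factory=set)
    pids: Set[int] = field(default_factory=set)

    def matches(self, ev: SyscallEvent) -> bool:
        """Return True when the event satisfies every configured criterion."""
        if self.syscall_ids and ev.syscall_id not in self.syscall_ids:
            return False
        if self.process_names and ev.process_name not in self.process_names:
            return False
        if self.pids and ev.pid not in self.pids:
            return False
        return True


# 43 is the accept system call number on x86-64.
rule = FilterRule(syscall_ids={43}, process_names={"nginx"})
kept = rule.matches(SyscallEvent(syscall_id=43, process_name="nginx", pid=1200))
```

Applying the rule before serialization means that unwanted events never reach the shared memory, which is how filtering reduces the volume of generated event information.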
Preferably, the file descriptor types include process, file and socket, and the input and output parameters include a source IP address, a source port, a destination IP address and a destination port.
Preferably, the sysdig-probe kernel driver module in the device is configured for:
1) Real-time collection of system call information:
The module uses the tracepoints mechanism of the Linux kernel. Kernel tracepoints are a lightweight hook mechanism suited to efficient system call tracing; their impact on system performance is very small, and according to the official Linux documentation they incur only a small time and space overhead. By registering custom probe functions such as syscall_enter_probe, the kernel finds the probe function when a system call occurs and passes it the relevant input and output parameters, so that system call information is captured and collected in real time.
2) Filtering of system call information:
System call information is filtered by three filtering rules, namely the specified system call identifier (syscall ID), the process name and the Linux process ID, thereby reducing the amount of system call event information generated.
3) Binary serialization of system call information:
Based on the file descriptor and the input and output parameters of the system call, a set of memory block structures corresponding to the Linux system call interface is defined, and Linux memory operation functions write the input and output parameters of the system call directly into the shared memory of the sysdig-probe kernel driver module, thereby serializing the system call information in binary form.
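The binary serialization in item 3) can be sketched as a fixed-layout record. The field layout below is hypothetical (the text does not specify the memory block structures), and Python's `struct` module stands in for the kernel-side memory operations.

```python
import socket
import struct

# Hypothetical little-endian layout of one socket event record:
# syscall_id (u16), fd_type (u8), pid (u32), timestamp_ns (u64),
# src_ip (4 bytes), src_port (u16), dst_ip (4 bytes), dst_port (u16).
SOCKET_EVENT = struct.Struct("<HBIQ4sH4sH")

# File descriptor types named in the text; the numeric codes are assumptions.
FD_PROCESS, FD_FILE, FD_SOCKET = 0, 1, 2


def pack_socket_event(syscall_id, pid, timestamp_ns,
                      src_ip, src_port, dst_ip, dst_port):
    """Serialize one socket-type system call event into a binary record."""
    return SOCKET_EVENT.pack(
        syscall_id, FD_SOCKET, pid, timestamp_ns,
        socket.inet_aton(src_ip), src_port,
        socket.inet_aton(dst_ip), dst_port,
    )


def unpack_socket_event(record):
    """Decode a binary record back into a dictionary of named fields."""
    (syscall_id, fd_type, pid, timestamp_ns,
     src, src_port, dst, dst_port) = SOCKET_EVENT.unpack(record)
    return {
        "syscall_id": syscall_id,
        "fd_type": fd_type,
        "pid": pid,
        "timestamp_ns": timestamp_ns,
        "src_ip": socket.inet_ntoa(src),
        "src_port": src_port,
        "dst_ip": socket.inet_ntoa(dst),
        "dst_port": dst_port,
    }


record = pack_socket_event(43, 1200, 1_000_000, "10.0.0.9", 51234, "10.0.0.5", 80)
```

A fixed layout like this lets the reader on the other side of the shared memory decode a record without any parsing step beyond a single unpack.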
The invention provides a Linux system call event caching device, comprising:
a system call event caching module, which comprises a sysdig-userspace layer, the sysdig-userspace layer comprising an SQLite file database;
the system call event caching module caches system call event information. The sysdig-userspace layer reads the system call event information collected by the acquisition device, processes it and stores it in the SQLite file database: it reads the binary system call event information from the memory shared with the sysdig-probe kernel driver module, classifies it according to the file descriptor type in the system call, serializes it according to its type, and dispatches the serialized system call event information to the Lua service processing script. The Lua service processing script classifies and filters the system call event information according to specific service requirements, and finally stores the filtered system call event information in the SQLite file database.
The SQLite file database is the cache module of the sysdig-userspace layer. The cache module uses the open-source embedded SQLite database system; through SQLite-specific tuning, SQLite is packaged as a real-time, in-memory cache module of the sysdig-userspace layer that requires no installation or deployment.
Preferably, the SQLite file database outputs system call events through an HTTP REST API.
Preferably, the system call event caching module in the device is configured for:
1) Serialization of system call information:
Using the standard JSON format, system call event information is classified by process, file, socket and other types, and the input and output information of the system call is serialized into JSON strings by the JSON library built into the sysdig-userspace layer, producing system call event information that the Lua service processing script can conveniently filter or process.
2) Backup of cache database files:
The open-source SQLite database does not directly provide management of database file backups. To support querying, retrieval and analysis, and storage of system call event information, the cache database file backup function combines a limit on the number of database files with a limit on the size of each database file, which meets the needs of fast retrieval and storage capacity management.
Specifically:
Step 1: when the sysdig-userspace layer starts, it checks whether the size of the current database file exceeds the limit and, if so, creates a new database file; when a new database file is created, it checks whether the limit on the number of database files is exceeded and, if so, overwrites the oldest database file. This process repeats.
Step 2: when writing to the database file, it checks whether the size of the current database file exceeds the limit and, if so, proceeds as in step 1.
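The combined file-count and file-size limit described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the class name, the `events_NNNNNN.db` naming scheme, the table layout and the default limits are all hypothetical, and the sketch deletes the oldest file instead of overwriting it in place.

```python
import os
import re
import sqlite3


class RotatingEventCache:
    """Keeps at most max_files SQLite files, each limited to about max_bytes."""

    _NAME = re.compile(r"^events_(\d{6})\.db$")

    def __init__(self, directory, max_files=3, max_bytes=1024 * 1024):
        self.directory = directory
        self.max_files = max_files
        self.max_bytes = max_bytes
        os.makedirs(directory, exist_ok=True)
        seqs = self._sequences()
        self.seq = seqs[-1] if seqs else 1      # resume with the newest file
        self.conn = self._open(self.seq)
        self._rotate_if_needed()                # size check at startup

    def _sequences(self):
        found = []
        for name in os.listdir(self.directory):
            match = self._NAME.match(name)
            if match:
                found.append(int(match.group(1)))
        return sorted(found)

    def _path(self, seq):
        return os.path.join(self.directory, f"events_{seq:06d}.db")

    def _open(self, seq):
        conn = sqlite3.connect(self._path(seq))
        conn.execute("CREATE TABLE IF NOT EXISTS events (payload TEXT)")
        conn.commit()
        return conn

    def _rotate_if_needed(self):
        if os.path.getsize(self._path(self.seq)) <= self.max_bytes:
            return
        self.conn.close()
        self.seq += 1                           # start a new database file
        self.conn = self._open(self.seq)
        seqs = self._sequences()
        while len(seqs) > self.max_files:       # enforce the file-count limit
            os.remove(self._path(seqs.pop(0)))  # drop the oldest file

    def write(self, payload):
        self.conn.execute("INSERT INTO events (payload) VALUES (?)", (payload,))
        self.conn.commit()
        self._rotate_if_needed()                # size check on every write
```

Bounding both the number and the size of the files keeps each database small enough to search quickly while capping total disk usage at roughly `max_files * max_bytes`.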
The invention provides a Linux system call event collection method using the above Linux system call event collection device, comprising the following steps:
S51: automatically loading the sysdig-probe kernel driver module through a shell script;
S52: starting the sysdig-userspace layer and completing the mapping of the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer;
S53: the Linux kernel layer receives a system call event and, through the registered tracepoint, calls the sysdig-probe filtering module in the sysdig-probe kernel driver module, which filters the system call event and classifies the filtered system call event information according to the file descriptor type in the system call;
S54: calling the sysdig-probe event serialization module in the sysdig-probe kernel driver module to generate binary serialized system call event information according to the type of the system call event information and its input and output parameters;
S55: writing the binary serialized system call event information of step S54 into the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
Preferably, in step S53, the probe functions for the syscall_enter and syscall_exit tracepoints are registered through the system API function tracepoint_probe_register provided by the Linux kernel.
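The hand-off through shared memory in steps S52 and S55 can be pictured as a ring buffer of length-prefixed records. The sketch below is an assumption-laden illustration: the text does not describe the buffer layout, the class `SharedRing` is hypothetical, and an anonymous `mmap` within one Python process stands in for memory mapped between the kernel driver and the user-space layer.

```python
import mmap
import struct

HEADER = struct.Struct("<II")   # head = next write offset, tail = next read offset
LENGTH = struct.Struct("<I")    # length prefix stored in front of every record


class SharedRing:
    def __init__(self, capacity=4096):
        self.capacity = capacity
        self.buf = mmap.mmap(-1, HEADER.size + capacity)  # anonymous mapping
        HEADER.pack_into(self.buf, 0, 0, 0)

    def _write(self, pos, data):
        base = HEADER.size
        first = min(len(data), self.capacity - pos)
        self.buf[base + pos:base + pos + first] = data[:first]
        if first < len(data):                   # wrap around to the start
            self.buf[base:base + len(data) - first] = data[first:]
        return (pos + len(data)) % self.capacity

    def _read(self, pos, size):
        base = HEADER.size
        first = min(size, self.capacity - pos)
        data = self.buf[base + pos:base + pos + first]
        if first < size:                        # the record wrapped around
            data += self.buf[base:base + size - first]
        return data, (pos + size) % self.capacity

    def push(self, payload):
        """Producer side (the driver in the text). Returns False when full."""
        head, tail = HEADER.unpack_from(self.buf, 0)
        record = LENGTH.pack(len(payload)) + payload
        free = self.capacity - 1 - (head - tail) % self.capacity
        if len(record) > free:
            return False                        # no room: the event is dropped
        head = self._write(head, record)
        HEADER.pack_into(self.buf, 0, head, tail)
        return True

    def pop(self):
        """Consumer side (the user-space layer). Returns None when empty."""
        head, tail = HEADER.unpack_from(self.buf, 0)
        if head == tail:
            return None
        raw, tail = self._read(tail, LENGTH.size)
        payload, tail = self._read(tail, LENGTH.unpack(raw)[0])
        HEADER.pack_into(self.buf, 0, head, tail)
        return payload
```

Because both sides work on the same mapped region, a record is written once by the producer and read in place by the consumer, which is the copy reduction that memory mapping is meant to provide. A real kernel-to-user buffer would additionally need memory barriers or atomic updates of the head and tail offsets, which this single-process sketch omits.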
The invention provides a Linux system call event caching method using the above Linux system call event caching device, comprising the following steps:
S81: the sysdig-userspace layer reads the binary system call event information from the memory shared with sysdig-probe and loads the Lua service processing script;
S82: the sysdig-userspace layer classifies the system call event information according to the file descriptor type in the system call and serializes it; the serialized format includes the standard JSON format;
S83: the service processing script of the sysdig-userspace layer filters the serialized system call event information according to related information, which includes a process name, a file name, a source IP address and a destination IP address;
S84: the sysdig-userspace layer dispatches the system call event information to the service processing script, and the service processing script stores the filtered system call event information in the SQLite file database.
Preferably, in step S83, the related information includes a process name, a file name, a source IP address and a destination IP address, and in step S82, the serialized format includes the standard JSON format.
Preferably, in step S84, after the system call event information is dispatched to the service processing script, the on_event function of the service processing script is called to perform filtering, or the Lua binding interface of the SQLite C API is called, and the system call event information is then stored in the SQLite file database.
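Steps S83 and S84 can be sketched as a handler that filters a JSON event and stores it. In the text this logic lives in a Lua service processing script; Python is used here only for illustration. The table schema, the JSON field names and the "keep an event if any watched field matches" policy are assumptions.

```python
import json
import sqlite3

# Fields named in step S83; the JSON key spellings are hypothetical.
FILTER_KEYS = ("process_name", "file_name", "src_ip", "dst_ip")


def open_cache(path=":memory:"):
    """Open the cache database and create the (hypothetical) events table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY, type TEXT, payload TEXT)"
    )
    return conn


def wanted(event, watch):
    """Keep an event when any filter field matches one of the watched values."""
    return any(event.get(key) in watch.get(key, ()) for key in FILTER_KEYS)


def on_event(conn, event_json, watch):
    """Filter one serialized event and store it; returns True if stored."""
    event = json.loads(event_json)
    if not wanted(event, watch):
        return False
    conn.execute(
        "INSERT INTO events (type, payload) VALUES (?, ?)",
        (event.get("type"), event_json),
    )
    conn.commit()
    return True
```

A caller would pass each serialized event to `on_event` together with a watch list such as `{"process_name": {"sshd"}, "dst_ip": {"10.0.0.5"}}`; events that match nothing are discarded before they reach the database.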
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention extends the open-source sysdig by adding the sysdig-probe filtering module and the sysdig-probe event serialization module. Using sysdig-probe's non-intrusive loading, it has little impact on the Linux kernel and, without recompiling the kernel, can monitor system call events including all kinds of system calls (syscall), process creation, network connections, network I/O, file creation, file I/O, shell operations, database operations (MySQL, Oracle), telnet operations, HTTP access and so on, achieving online deployment and real-time, comprehensive monitoring of Linux system call events.
(2) The invention uses Linux memory mapping between the sysdig-probe kernel driver module and the sysdig-userspace layer, which reduces memory copy operations and improves operating efficiency.
(3) By making full use of SQLite's in-memory database mode, the invention reduces the file I/O involved in storing system call event information in real time. By tuning SQLite with the options PRAGMA cache_size = 8000, PRAGMA synchronous = OFF, PRAGMA journal_mode = MEMORY and PRAGMA temp_store = MEMORY, event information is stored in real time, and the system file I/O overhead of post-event retrieval and analysis is reduced.
(4) When deployed online on a heavily loaded server, the invention uses SQLite database caching to address the loss of system call event information under high load, thereby reducing lost event information and supporting real-time monitoring of system behavior.
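The SQLite tuning named in effect (3) can be applied as shown in this small sketch, which uses Python's built-in `sqlite3` module; the helper name `open_tuned` is hypothetical, while the four PRAGMA statements are standard SQLite.

```python
import sqlite3


def open_tuned(path=":memory:"):
    """Open an SQLite database and apply the tuning options named in the text."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA cache_size = 8000")      # page cache of 8000 pages
    conn.execute("PRAGMA synchronous = OFF")      # do not wait for disk sync
    conn.execute("PRAGMA journal_mode = MEMORY")  # keep the rollback journal in RAM
    conn.execute("PRAGMA temp_store = MEMORY")    # keep temporary tables in RAM
    return conn


conn = open_tuned()
settings = {
    name: conn.execute(f"PRAGMA {name}").fetchone()[0]
    for name in ("cache_size", "synchronous", "journal_mode", "temp_store")
}
```

These settings trade durability for write throughput: with `synchronous = OFF` and an in-memory journal, a crash or power loss can corrupt or lose recent data, which may be acceptable for a cache of monitoring events but would not be for primary storage.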
Drawings
FIG. 1 is a schematic diagram of a system event collection flow of the present invention;
FIG. 2 is a schematic diagram of an event caching process of the system of the present invention;
FIG. 3 is a schematic diagram of the system event collection and buffering device of the present invention.
Detailed Description
The following detailed description of the embodiments of the present invention is provided in conjunction with the accompanying drawings of fig. 1-3.
The invention provides a Linux system call event collection device, comprising:
a system call event acquisition module, which acquires and processes system call event information and comprises a sysdig-probe kernel driver module;
the sysdig-probe kernel driver module, which collects and processes system call event information and comprises a sysdig-probe filtering module and a sysdig-probe event serialization module; it is developed as an extension of the open-source sysdig, uses the Linux character device driver technique, registers probe functions on the tracepoints of all system call events in the form of a driver, and, when a system call event occurs, collects the system call event information through the probe functions mounted in the kernel;
the sysdig-probe filtering module, which analyzes and filters system call event information, filters out the information related to specified system call events according to a filtering rule, and classifies the filtered system call event information according to the file descriptor type in the system call;
the sysdig-probe event serialization module, which serializes the system call event information into binary form according to its type and its input and output parameters, and writes the serialized system call event information into the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
As a preferred embodiment, the filtering rule of the sysdig-probe filtering module is: filter system-call-related information according to the specified system call identifier (syscall ID), the process name and the Linux process ID.
As a preferred embodiment, the file descriptor types include process, file and socket, and the input and output parameters include a source IP address, a source port, a destination IP address and a destination port.
As a preferred embodiment, the sysdig-probe kernel driver module in the device is configured for:
1) Real-time collection of system call information:
The module uses the tracepoints mechanism of the Linux kernel. Kernel tracepoints are a lightweight hook mechanism suited to efficient system call tracing; their impact on system performance is very small, and according to the official Linux documentation they incur only a small time and space overhead. By registering custom probe functions such as syscall_enter_probe, the kernel finds the probe function when a system call occurs and passes it the relevant input and output parameters, so that system call information is captured and collected in real time.
2) Filtering of system call information:
System call information is filtered by three filtering rules, namely the specified system call identifier (syscall ID), the process name and the Linux process ID, thereby reducing the amount of system call event information generated.
3) Binary serialization of system call information:
Based on the file descriptor and the input and output parameters of the system call, a set of memory block structures corresponding to the Linux system call interface is defined, and Linux memory operation functions write the input and output parameters of the system call directly into the shared memory of the sysdig-probe kernel driver module, thereby serializing the system call information in binary form.
The invention provides a Linux system call event caching device, comprising:
a system call event caching module, which comprises a sysdig-userspace layer, the sysdig-userspace layer comprising an SQLite file database;
the system call event caching module caches system call event information. The sysdig-userspace layer reads the system call event information collected by the acquisition device, processes it and stores it in the SQLite file database: it reads the binary system call event information from the memory shared with the sysdig-probe kernel driver module, classifies it according to the file descriptor type in the system call, serializes it according to its type, and dispatches the serialized system call event information to the Lua service processing script. The Lua service processing script classifies and filters the system call event information according to specific service requirements, and finally stores the filtered system call event information in the SQLite file database.
The SQLite file database is the cache module of the sysdig-userspace layer. The cache module uses the open-source embedded SQLite database system; through SQLite-specific tuning, SQLite is packaged as a real-time, in-memory cache module of the sysdig-userspace layer that requires no installation or deployment.
As a preferred embodiment, the SQLite file database outputs system call events through an HTTP REST API.
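How cached events might be exposed over HTTP can be sketched with Python's standard library. The text does not specify the API, so the `/events` path, the `events` table layout, the response shape and the function names below are all hypothetical.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def recent_events(conn, limit=100):
    """Return the newest cached events as a list of JSON-ready dictionaries."""
    rows = conn.execute(
        "SELECT id, payload FROM events ORDER BY id DESC LIMIT ?", (limit,)
    ).fetchall()
    return [{"id": row[0], "event": json.loads(row[1])} for row in rows]


def make_handler(db_path):
    """Build a request handler class bound to one cache database file."""

    class EventHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/events":          # hypothetical resource path
                self.send_error(404)
                return
            conn = sqlite3.connect(db_path)
            try:
                body = json.dumps(recent_events(conn)).encode("utf-8")
            finally:
                conn.close()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return EventHandler


def serve(db_path, port=8080):
    """Serve cached events on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), make_handler(db_path)).serve_forever()
```

Opening a fresh connection per request keeps the sketch simple and avoids sharing one SQLite connection across threads; a production service would also need authentication and pagination, which are outside the scope of the text.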
As a preferred embodiment, the system call event caching module in the device is configured for:
1) Serialization of system call information:
Using the standard JSON format, system call event information is classified by process, file, socket and other types, and the input and output information of the system call is serialized into JSON strings by the JSON library built into the sysdig-userspace layer, producing system call event information that the Lua service processing script can conveniently filter or process.
2) Backup of cache database files:
The open-source SQLite database does not directly provide management of database file backups. To support querying, retrieval and analysis, and storage of system call event information, the cache database file backup function combines a limit on the number of database files with a limit on the size of each database file, which meets the needs of fast retrieval and storage capacity management.
Specifically:
Step 1: when the sysdig-userspace layer starts, it checks whether the size of the current database file exceeds the limit and, if so, creates a new database file; when a new database file is created, it checks whether the limit on the number of database files is exceeded and, if so, overwrites the oldest database file. This process repeats.
Step 2: when writing to the database file, it checks whether the size of the current database file exceeds the limit and, if so, proceeds as in step 1.
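The JSON serialization in item 1) can be illustrated with a short sketch. The mapping from system call name to file descriptor type and the JSON field names are hypothetical; the text only states that events are classified as process, file, socket and similar types before being serialized.

```python
import json

# Hypothetical classification table: system call name -> file descriptor type.
FD_TYPE_BY_SYSCALL = {
    "accept": "socket",
    "connect": "socket",
    "open": "file",
    "unlink": "file",
    "execve": "process",
    "clone": "process",
}


def to_json(syscall, pid, process_name, args):
    """Classify one system call event and serialize it to a JSON string."""
    event = {
        "type": FD_TYPE_BY_SYSCALL.get(syscall, "unknown"),
        "syscall": syscall,
        "pid": pid,
        "process_name": process_name,
        "args": args,       # input and output parameters of the call
    }
    return json.dumps(event, sort_keys=True)


accept_json = to_json(
    "accept", 1200, "nginx",
    {"src_ip": "10.0.0.9", "src_port": 51234,
     "dst_ip": "10.0.0.5", "dst_port": 80},
)
```

Carrying the type as an explicit field lets a downstream script branch on `type` without having to know every system call name.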
The invention provides a Linux system call event collection method using the above Linux system call event collection device, comprising the following steps:
S51: automatically loading the sysdig-probe kernel driver module through a shell script;
S52: starting the sysdig-userspace layer and completing the mapping of the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer;
S53: the Linux kernel layer receives a system call event and, through the registered tracepoint, calls the sysdig-probe filtering module in the sysdig-probe kernel driver module, which filters the system call event and classifies the filtered system call event information according to the file descriptor type in the system call;
S54: calling the sysdig-probe event serialization module in the sysdig-probe kernel driver module to generate binary serialized system call event information according to the type of the system call event information and its input and output parameters;
S55: writing the binary serialized system call event information of step S54 into the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
As a preferred embodiment, in step S53, the probe functions for the syscall_enter and syscall_exit tracepoints are registered through the system API function tracepoint_probe_register provided by the Linux kernel.
The invention provides a Linux system call event caching method using the above Linux system call event caching device, comprising the following steps:
S81: the sysdig-userspace layer reads the binary system call event information from the memory shared with sysdig-probe and loads the Lua service processing script;
S82: the sysdig-userspace layer classifies the system call event information according to the file descriptor type in the system call and serializes it; the serialized format includes the standard JSON format;
S83: the service processing script of the sysdig-userspace layer filters the serialized system call event information according to related information, which includes a process name, a file name, a source IP address and a destination IP address;
S84: the sysdig-userspace layer dispatches the system call event information to the service processing script, and the service processing script stores the filtered system call event information in the SQLite file database.
As a preferred embodiment, in step S83, the related information includes a process name, a file name, a source IP address and a destination IP address, and in step S82, the serialized format includes the standard JSON format.
As a preferred embodiment, in step S84, after the system call event information is dispatched to the service processing script, the on_event function of the service processing script is called to perform filtering, or the Lua binding interface of the SQLite C API is called, and the system call event information is then stored in the SQLite file database.
Example 1
The invention provides a Linux system call event acquisition device, described here using the collection of an accept system call event as an example. (In practice, a Linux system call event is sometimes called a system event, a system call, or simply a call or an event; all of these refer to a system call event.) The device comprises:
a system call event acquisition module, which is used for acquiring and processing system call event information and comprises a sysdig-probe kernel driver module.
The sysdig-probe kernel driver module is used for collecting system call event information and comprises a sysdig-probe filtering module and a sysdig-probe event serialization module. It is a secondary development of the open-source sysdig, adopts the Linux character device driver technique, and registers probe functions, in the form of a driver, on the tracepoints of all system call events; when an accept system call event occurs, the kernel invokes the attached probe function, which collects the accept system call event information;
the sysdig-probe filtering module is used for analyzing and filtering the system call event information, and selects the information related to the specified system call event according to a filtering rule. The filtering rule is: filter system call related information according to the syscall ID of the accept system call, the process name, and the Linux process ID. The file descriptor in the accept system call is a socket, so the sysdig-probe filtering module classifies the filtered system call event information under the socket file descriptor type; the file descriptor types comprise processes, files, and sockets;
the sysdig-probe event serialization module is used for serializing system call event information. It performs binary serialization of the system call event information according to the type of the system call event information and the input and output parameters, which comprise the source IP address, source port, destination IP address, and destination port, and fills the serialized system call event information into the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
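The filtering and classification behaviour described above can be illustrated with a short user-space sketch. It is not the kernel module's code: Python stands in for the in-kernel C, the names RawEvent, FilterRule, and filter_and_classify are hypothetical, and the rule semantics (an empty set means "no constraint") are an assumption made for illustration.

```python
from dataclasses import dataclass, field

# accept(2) is syscall number 43 on x86-64 Linux; other architectures differ.
SYSCALL_ACCEPT = 43


@dataclass(frozen=True)
class RawEvent:
    """Hypothetical in-memory view of one captured system call event."""
    syscall_id: int
    pid: int
    process_name: str
    fd_type: str  # one of "process", "file", "socket"


@dataclass
class FilterRule:
    """Hypothetical rule: each empty set places no constraint on that field."""
    syscall_ids: frozenset = field(default_factory=frozenset)
    process_names: frozenset = field(default_factory=frozenset)
    pids: frozenset = field(default_factory=frozenset)

    def matches(self, ev: RawEvent) -> bool:
        return ((not self.syscall_ids or ev.syscall_id in self.syscall_ids)
                and (not self.process_names or ev.process_name in self.process_names)
                and (not self.pids or ev.pid in self.pids))


def filter_and_classify(events, rule):
    """Keep events that match the rule and group them by file descriptor type."""
    buckets = {"process": [], "file": [], "socket": []}
    for ev in events:
        if rule.matches(ev):
            buckets[ev.fd_type].append(ev)
    return buckets
```

With a rule that names only the accept syscall ID and the process name sshd, an accept event from sshd lands in the socket bucket, while events from other processes or other system calls are discarded.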
The sysdig-probe kernel driver module in the acquisition device is used for:
1) Collecting system call information in real time:
The tracepoints mechanism of the Linux kernel is adopted. Kernel tracepoints are a lightweight hook mechanism that can be used for efficient tracing of system call behavior and, according to the official Linux documentation, incur only a small time and space overhead, so the impact on system performance is very small. Custom probe functions such as syscall_enter_probe are registered; after a system call occurs, the kernel finds the probe function and passes it the relevant input and output parameter information, which comprises the source IP address, source port, destination IP address, and destination port, thereby achieving real-time capture and collection of system call information.
2) Filtering system call information:
Related system call information is filtered by three filtering rules, namely a specified syscall ID, a specified process name, and a specified Linux process ID, thereby reducing the amount of system call event information generated.
3) Binary serialization of system call information:
Based on the socket file descriptor of the accept system call and the input and output parameters (source IP address, source port, destination IP address, and destination port), a set of memory block structures corresponding to the Linux system call interfaces is redefined, and Linux memory operation functions are used to write the input and output parameter information of the system call directly into the shared memory of the sysdig-probe kernel driver module, thereby achieving binary serialization of the system call information.
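As an illustration of what a fixed-layout binary record for an accept event might look like, the following sketch packs the socket four-tuple with Python's struct module. The record layout (field order, field widths, the FD_TYPE_SOCKET code) is hypothetical; the patent does not disclose the actual memory block structures, and the real module writes them in kernel C rather than Python.

```python
import socket
import struct

# Hypothetical little-endian, unpadded layout:
#   syscall id (u16), pid (u32), fd type (u8),
#   source IPv4 (4 bytes), source port (u16),
#   destination IPv4 (4 bytes), destination port (u16)
EVENT_FORMAT = "<HIB4sH4sH"
FD_TYPE_SOCKET = 3  # illustrative code for the "socket" file descriptor type


def pack_accept_event(syscall_id, pid, src_ip, src_port, dst_ip, dst_port):
    """Serialize one accept event into a fixed-size binary record."""
    return struct.pack(
        EVENT_FORMAT, syscall_id, pid, FD_TYPE_SOCKET,
        socket.inet_aton(src_ip), src_port,
        socket.inet_aton(dst_ip), dst_port,
    )


def unpack_accept_event(buf):
    """Decode a record produced by pack_accept_event back into a dict."""
    syscall_id, pid, fd_type, src, sport, dst, dport = struct.unpack(EVENT_FORMAT, buf)
    return {
        "syscall_id": syscall_id,
        "pid": pid,
        "fd_type": fd_type,
        "src_ip": socket.inet_ntoa(src),
        "src_port": sport,
        "dst_ip": socket.inet_ntoa(dst),
        "dst_port": dport,
    }
```

A fixed layout like this lets the writer copy the fields straight into shared memory and lets the reader decode them without any parsing step, which is the property the description attributes to binary serialization.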
The invention provides a method for collecting Linux system call events using the acquisition device, described here using the accept system call as an example, comprising the following steps:
S51: The sysdig-probe kernel driver module is loaded automatically through the system's /etc/init.d/ service script;
In the acquisition device, the sysdig-probe kernel driver module comprises a sysdig-probe filtering module and a sysdig-probe event serialization module; it is a secondary development of the original sysdig-probe kernel driver module that adds the filtering and event serialization functions.
By adopting the open-source sysdig system behavior monitoring technology, the invention achieves comprehensive monitoring of Linux system call events and provides a real-time Linux system call event collection scheme that can be deployed online. Because sysdig-probe is loaded non-invasively, the impact on the Linux kernel is small and the kernel does not need to be recompiled.
S52: The sysdig-userspace layer is started to complete the mapping of the memory shared by the sysdig-probe kernel driver module in the acquisition device and the sysdig-userspace layer. The sysdig-probe kernel driver module and the sysdig-userspace layer use the Linux memory mapping technique, which reduces memory copy operations and improves operating efficiency;
S53: The user connects to the host through an ssh terminal, and the host's Linux system generates the corresponding system calls, such as the accept system call. The Linux kernel layer receives the accept system call event through bash and, through the registered tracepoints (that is, the tracepoints of the syscall_enter and syscall_exit system call probe functions, registered with the tracepoint_probe_register API function that the Linux 2.6 kernel provides under its tracepoint mechanism), finds and calls the sysdig-probe filtering module in the sysdig-probe kernel driver module of the acquisition device. The probe function is inserted into the kernel, and the information related to the accept system call event, such as the socket four-tuple (source IP address, source port, destination IP address, and destination port), is passed to the probe function. The system call event is filtered, information about system call events that do not need processing is discarded, and the system call event information that passes the sysdig-probe filtering module is classified according to the file descriptor type in the system call;
S54: The sysdig-probe event serialization module in the sysdig-probe kernel driver module of the acquisition device is called to generate binary serialized system call event information according to the type of the system call event information and the input and output parameters;
The current system call is an accept system call, the file descriptor is of the socket type, and the input and output parameters include the source IP address, source port, destination IP address, and destination port; binary serialized system call event information is generated from this information.
S55: The binary serialized accept system call event information from step S54, such as the socket four-tuple (source IP address, source port, destination IP address, and destination port), is written into the ring buffer memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
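Steps S52 and S55 rely on a memory-mapped ring buffer shared by a producer and a consumer. The sketch below simulates that idea inside a single process using an anonymous mapping; it does not model the kernel side. The RingBuffer class, its header layout (two 32-bit offsets), and the 2-byte length prefix per record are assumptions for illustration, not the driver's actual format.

```python
import mmap
import struct

HEADER = struct.Struct("<II")  # head (next write offset), tail (next read offset)
LENGTH = struct.Struct("<H")   # per-record length prefix


class RingBuffer:
    """Single-producer, single-consumer byte ring over a memory-mapped region."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Anonymous mapping; a real driver would expose its buffer via a device file.
        self.mem = mmap.mmap(-1, HEADER.size + capacity)
        HEADER.pack_into(self.mem, 0, 0, 0)

    def _write_bytes(self, pos, data):
        for i, byte in enumerate(data):
            self.mem[HEADER.size + (pos + i) % self.capacity] = byte

    def _read_bytes(self, pos, n):
        return bytes(self.mem[HEADER.size + (pos + i) % self.capacity] for i in range(n))

    def put(self, record):
        """Append one record; return False (event dropped) when the ring is full."""
        head, tail = HEADER.unpack_from(self.mem, 0)
        used = (head - tail) % self.capacity
        needed = LENGTH.size + len(record)
        # One byte is kept free so that head == tail always means "empty".
        if needed > self.capacity - 1 - used:
            return False
        self._write_bytes(head, LENGTH.pack(len(record)) + record)
        HEADER.pack_into(self.mem, 0, (head + needed) % self.capacity, tail)
        return True

    def get(self):
        """Remove and return the oldest record, or None when the ring is empty."""
        head, tail = HEADER.unpack_from(self.mem, 0)
        if head == tail:
            return None
        (length,) = LENGTH.unpack(self._read_bytes(tail, LENGTH.size))
        record = self._read_bytes(tail + LENGTH.size, length)
        HEADER.pack_into(self.mem, 0, head, (tail + LENGTH.size + length) % self.capacity)
        return record
```

Because both sides address the same mapped bytes, a record written by the producer is visible to the consumer without an intermediate copy, which is the efficiency argument made in step S52.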
Example 2
The invention provides a Linux system call event caching device, described here using the caching of an accept system call event as an example. The device comprises:
a system call event caching module, which comprises a sysdig-userspace layer, the sysdig-userspace layer comprising a SQLite file database;
The system call event caching module is used for caching system call event information. The sysdig-userspace layer is used for reading the accept system call event information collected by the acquisition device, processing it, and storing the processed system call event information into the SQLite file database. Specifically, the sysdig-userspace layer reads the binary accept system call event information collected by the acquisition device from the memory shared with the sysdig-probe kernel driver module, classifies it under the socket file descriptor type of the accept system call, then serializes the binary accept system call event information according to that type and distributes it to the Lua service processing script; the Lua service processing script classifies and filters the system call event information according to specific service requirements and finally stores the filtered system call event information into the SQLite file database.
The SQLite file database is the cache module in the sysdig-userspace layer. The cache module uses the open-source embedded SQLite database system, and through SQLite-specific tuning, SQLite is packaged into a real-time, in-memory sysdig-userspace cache module that requires no installation or deployment.
The data cached in the SQLite file database, namely the accept system call event information, is output through an HTTP REST API.
The system call event caching module in the caching device is used for:
1) Serialization of system call information:
Using the standard JSON specification, system call event information such as accept is classified by type (process, file, socket, and so on); the accept system call is of the socket type. The input and output information of the accept system call, such as the source IP address, source port, destination IP address, and destination port, is serialized into a JSON string by the JSON library built into the sysdig-userspace layer, forming system call event information that the Lua service processing script can conveniently filter or process.
2) Backup of files in the cache database:
The open-source SQLite database does not directly provide database file backup management functions. The invention provides functions such as querying, retrieval and analysis, and storage of system call event information, and the cache database file backup function limits both the number of database files and the size of each database file to meet the requirements of fast retrieval and storage capacity management.
Specifically:
Step 1: When the sysdig-userspace layer starts, it checks whether the size of the current database file exceeds the limit; if so, a new database file is generated. When a new database file is generated, it checks whether the limit on the number of database files is exceeded; if so, the earliest-created database file is overwritten, and the cycle repeats.
Step 2: When a database file is written, it checks whether the size of the current database file exceeds the limit; if so, processing proceeds as in step 1.
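The size and file-count limits described above can be sketched as a small rotation helper. This is one possible reading, not the patented implementation: the function name next_database_path, the events-N.db naming scheme, and the choice to delete the oldest file (rather than reuse its name) are all assumptions.

```python
import os


def next_database_path(directory, max_bytes, max_files, prefix="events", suffix=".db"):
    """Return the database file to write to, rotating when the size limit is exceeded.

    Files are named <prefix>-<n><suffix>; a higher n means a newer file.
    """
    os.makedirs(directory, exist_ok=True)

    def index_of(name):
        return int(name[len(prefix) + 1:-len(suffix)])

    names = sorted(
        (n for n in os.listdir(directory)
         if n.startswith(prefix + "-") and n.endswith(suffix)),
        key=index_of,
    )
    if not names:
        return os.path.join(directory, f"{prefix}-0{suffix}")

    current = os.path.join(directory, names[-1])
    if os.path.getsize(current) <= max_bytes:
        return current  # still within the size limit: keep writing here

    # Size limit exceeded: move on to a new database file.
    new_path = os.path.join(directory, f"{prefix}-{index_of(names[-1]) + 1}{suffix}")
    # File-count limit exceeded: discard the earliest-created file (assumption:
    # deleting it stands in for the "overwrite the earliest file" behaviour).
    if len(names) + 1 > max_files:
        os.remove(os.path.join(directory, names[0]))
    return new_path
```

Calling the helper once at startup and again before each write corresponds to the two checks in the text; the caller is expected to create the returned file if it does not exist yet.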
The invention provides a Linux system call event caching method using the caching device, comprising the following steps:
S81: The sysdig-userspace layer in the caching device reads, in real time, the binary serialized accept system call event information, such as the source IP address, source port, destination IP address, and destination port, from the ring buffer memory shared with the sysdig-probe kernel driver module, and loads the Lua service processing script;
S82: The sysdig-userspace layer in the caching device parses the accept system call event information it has read, identifies the call as an accept system call from its syscall ID, and, based on the accept system call information such as the socket four-tuple, serializes the system call event information through a built-in JSON string serialization function into a standard JSON string that the Lua service processing script can process;
S83: The Lua service processing script of the sysdig-userspace layer in the caching device filters the serialized accept system call event information according to related information such as the process name, file name, source IP address, and destination IP address;
S84: The sysdig-userspace layer in the caching device pushes the system call event information, as a JSON string, through the built-in LuaJIT engine using the calling mechanism between C/C++ and Lua scripts, and distributes it to the Lua service processing script. The Lua service processing script classifies and filters the information according to specific service requirements and stores the filtered accept system call event information into the SQLite file database through the Lua binding script of the SQLite API.
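Steps S82 to S84 can be sketched end to end as follows. In the patent the handler is a Lua script driven through LuaJIT; here Python stands in for both the userspace layer and the script, so the code shows only the data flow. The table schema, the JSON field names, and the allow-list filter are hypothetical; only the on_event name is taken from the text.

```python
import json
import sqlite3


def open_cache(path=":memory:"):
    """Open the cache database and create a hypothetical events table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id INTEGER PRIMARY KEY, syscall TEXT, process_name TEXT, payload TEXT)"
    )
    return conn


def to_json(event):
    """S82: serialize a decoded event (a dict) into a JSON string."""
    return json.dumps(event, sort_keys=True)


def on_event(event_json, conn, allowed_processes):
    """S83/S84: filter by process name, then store the event. Returns True if stored."""
    event = json.loads(event_json)
    if event["process_name"] not in allowed_processes:
        return False
    conn.execute(
        "INSERT INTO events (syscall, process_name, payload) VALUES (?, ?, ?)",
        (event["syscall"], event["process_name"], event_json),
    )
    conn.commit()
    return True
```

For example, an accept event from sshd passes an allow-list containing sshd and is stored with its full JSON payload, while an event from any other process is dropped before it reaches the database.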
In the caching device, the SQLite file database is packaged into a real-time, in-memory sysdig-userspace cache that requires no installation or deployment, using the SQLite-specific tuning settings PRAGMA cache_size, PRAGMA synchronous, PRAGMA journal_mode, and PRAGMA temp_store.
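SQLite exposes pragmas named cache_size, synchronous, journal_mode, and temp_store. The sketch below shows one way they might be set to favour speed over durability; the specific values are assumptions, since the patent does not state which settings it uses, and open_tuned is a hypothetical helper name.

```python
import sqlite3


def open_tuned(path=":memory:"):
    """Open a SQLite database with speed-oriented pragma settings (illustrative values)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA cache_size = -8192")     # negative value: cache size in KiB
    conn.execute("PRAGMA synchronous = OFF")      # do not wait for writes to reach disk
    conn.execute("PRAGMA journal_mode = MEMORY")  # keep the rollback journal in RAM
    conn.execute("PRAGMA temp_store = MEMORY")    # keep temporary tables and indices in RAM
    return conn
```

Settings like these trade crash safety for throughput: with synchronous off and the journal held in memory, a power loss during a write can corrupt a file-backed database, which may be acceptable for a cache but not for a system of record.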
Example 3
The invention provides a Linux system call event acquisition device and a Linux system call event caching device. The acquisition device comprises a sysdig-probe kernel driver module, and the caching device comprises a sysdig-userspace layer; the two share memory using the Linux memory mapping technique. The sysdig-userspace layer in the caching device reads system call event information, such as accept, collected by the acquisition device from the shared memory and stores the processed information in the SQLite file database.
The invention is based on in-depth secondary development of the open-source sysdig and the SQLite embedded database. Because sysdig-probe is loaded non-invasively, the impact on the Linux kernel is small and the kernel does not need to be recompiled. Because the sysdig-probe kernel driver module and the sysdig-userspace layer use the Linux memory mapping technique, memory copy operations are reduced and operating efficiency is improved. This changes the current situation in which existing tools are suitable only for kernel development and debugging or for operation and maintenance monitoring: the invention can be deployed online and can monitor and analyze system call events in real time.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A Linux system call event collection device, the device comprising:
a system call event acquisition module, which is used for acquiring and processing system call event information and comprises a sysdig-probe kernel driver module;
the sysdig-probe kernel driver module is used for collecting and processing system call event information and comprises a sysdig-probe filtering module and a sysdig-probe event serialization module;
the sysdig-probe filtering module is used for analyzing and filtering system call event information, selecting the information related to a specified system call event according to a filtering rule, and classifying the filtered system call event information according to the file descriptor type in the system call;
the sysdig-probe event serialization module is used for serializing the system call event information, performing binary serialization of the system call event information according to the type of the system call event information and the input and output parameters, and filling the serialized system call event information into a memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
2. The apparatus of claim 1, wherein the filtering rule of the sysdig-probe filtering module is: filter the system call related information according to the specified syscall ID, the process name, and the Linux process ID.
3. The apparatus of claim 1, wherein the file descriptor types comprise a process, a file, and a socket, and the input and output parameters comprise a source IP address, a source port, a destination IP address, and a destination port.
4. A Linux system call event caching apparatus, comprising: a system call event caching module, which comprises a sysdig-userspace layer, the sysdig-userspace layer comprising a SQLite file database;
the sysdig-userspace layer is used for reading the system call event information collected by the acquisition device according to any one of claims 1 to 3, processing the system call event information, and storing the processed system call event information into the SQLite file database; the sysdig-userspace layer reads the binary system call event information from the memory shared with the sysdig-probe kernel driver module, classifies it according to the file descriptor type in the system call, then serializes the binary system call event information according to its type and distributes the serialized system call event information to the Lua service processing script; the Lua service processing script classifies and filters the system call event information according to specific service requirements and finally stores the filtered system call event information into the SQLite file database.
5. The apparatus of claim 4, wherein the SQLite file database outputs the system call event through an HTTP REST API.
6. A Linux system call event collection method using a collection device according to any one of claims 1-3, comprising the steps of:
S51: automatically loading the sysdig-probe kernel driver module through a shell script;
S52: starting the sysdig-userspace layer to complete the mapping of the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer;
S53: the Linux kernel layer receives a system call event, calls the sysdig-probe filtering module in the sysdig-probe kernel driver module through the registered tracepoints, filters the system call event, and classifies the filtered system call event information according to the file descriptor type in the system call;
S54: calling the sysdig-probe event serialization module in the sysdig-probe kernel driver module, and generating binary serialized system call event information according to the type of the system call event information and the input and output parameters;
S55: writing the binary serialized system call event information from step S54 into the memory shared by the sysdig-probe kernel driver module and the sysdig-userspace layer.
7. The method according to claim 6, wherein in step S53 the tracepoints of the syscall_enter and syscall_exit system call probe functions are registered through the tracepoint_probe_register API function provided by the Linux kernel.
8. A Linux system call event caching method using the caching apparatus of claim 4, comprising the steps of:
S81: the sysdig-userspace layer reads binary serialized system call event information from the memory shared with the sysdig-probe and loads a Lua service processing script;
S82: the sysdig-userspace layer classifies the system call event information according to the file descriptor type in the system call and serializes the system call event information;
S83: the service processing script of the sysdig-userspace layer filters the serialized system call event information according to related information;
S84: the sysdig-userspace layer distributes the system call event information to the service processing script, and the service processing script stores the filtered system call event information into the SQLite file database.
9. The method according to claim 8, wherein in step S83 the related information includes a process name, a file name, a source IP address, and a destination IP address, and in step S82 the serialized format includes the standard JSON format.
10. The method according to claim 8, wherein in step S84, after the system call event information is distributed to the service processing script, the on_event function of the service processing script is called to perform filtering, or the Lua script binding interface of the SQLite C API is called, and the system call event information is then stored in the SQLite file database.
CN202010420465.8A 2020-05-18 2020-05-18 Linux system call event acquisition and caching device and method Active CN111597089B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010420465.8A CN111597089B (en) 2020-05-18 2020-05-18 Linux system call event acquisition and caching device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010420465.8A CN111597089B (en) 2020-05-18 2020-05-18 Linux system call event acquisition and caching device and method

Publications (2)

Publication Number Publication Date
CN111597089A true CN111597089A (en) 2020-08-28
CN111597089B CN111597089B (en) 2020-12-18

Family

ID=72187288

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010420465.8A Active CN111597089B (en) 2020-05-18 2020-05-18 Linux system call event acquisition and caching device and method

Country Status (1)

Country Link
CN (1) CN111597089B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113051166A (en) * 2021-03-30 2021-06-29 上海商汤科技开发有限公司 System tracking method, device, equipment and storage medium
CN113821439A (en) * 2021-09-23 2021-12-21 成都欧珀通信科技有限公司 Method, device, storage medium and terminal for registering function to probe point
CN115840938A (en) * 2023-02-21 2023-03-24 山东捷讯通信技术有限公司 File monitoring method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103631572A (en) * 2012-08-24 2014-03-12 曙光信息产业(北京)有限公司 Centralized event processing system and processing method
CN107491355A (en) * 2017-08-17 2017-12-19 山东浪潮商用系统有限公司 Funcall method and device between a kind of process based on shared drive
CN108171050A (en) * 2017-12-29 2018-06-15 浙江大学 The fine granularity sandbox strategy method for digging of linux container
WO2018224243A1 (en) * 2017-06-08 2018-12-13 British Telecommunications Public Limited Company Containerised programming
CN110321130A (en) * 2019-06-24 2019-10-11 大连理工大学 The not reproducible compiling localization method of log is called based on system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
WEIXIN_33955681: "Monitoring Servers and Docker Containers with Sysdig", HTTPS://BLOG.CSDN.NET/WEIXIN_33955681/ARTICLE/DETAILS/89825610 *
阿基米德来了: "How Sysdig Works", HTTPS://BLOG.CSDN.NET/M0_37140813/ARTICLE/DETAILS/105113942 *

Also Published As

Publication number Publication date
CN111597089B (en) 2020-12-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant