CN114896588B - Method and device for detecting abnormal behavior of host user, storage medium and electronic equipment - Google Patents

Method and device for detecting abnormal behavior of host user, storage medium and electronic equipment

Info

Publication number
CN114896588B
Authority
CN
China
Prior art keywords
user
file
command
type
abnormal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210357958.0A
Other languages
Chinese (zh)
Other versions
CN114896588A (en)
Inventor
邓博仁
汪来富
邵壮丰
孙福兴
刘东鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Corp Ltd filed Critical China Telecom Corp Ltd
Priority to CN202210357958.0A priority Critical patent/CN114896588B/en
Publication of CN114896588A publication Critical patent/CN114896588A/en
Application granted granted Critical
Publication of CN114896588B publication Critical patent/CN114896588B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50: Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55: Detecting local intrusion or implementing counter-measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the disclosure provide a method, a device, a storage medium and electronic equipment for detecting abnormal behavior of a host user. The disclosure relates to the technical field of network and information security and aims to at least partially solve the technical problem in the related art that detection of unknown abnormal behavior of a host lags. The method for detecting abnormal behavior of a host user comprises: acquiring process data of a host to be detected; extracting user operation features based on the process data to obtain a prediction data set, wherein the user operation features comprise user features, command features and file features; and inputting the prediction data set into a pre-trained user abnormal behavior detection model to obtain a detection result output by the model. According to the embodiments of the disclosure, unknown abnormal behavior on the host can be detected from the user operation features, which improves the timeliness of host anomaly detection.

Description

Method and device for detecting abnormal behavior of host user, storage medium and electronic equipment
Technical Field
The disclosure relates to the technical field of network and information security, in particular to a method and a device for detecting abnormal behavior of a host user, a storage medium and electronic equipment.
Background
At present, conventional host anomaly detection falls into two categories: network-based intrusion detection and host-based intrusion detection. The related art generally relies on statistics or rule matching to detect known abnormal behaviors, so the detection of unknown abnormal behaviors lags.
Disclosure of Invention
Embodiments of the disclosure provide a method, a device, a storage medium and electronic equipment for detecting abnormal behavior of a host user, so as to at least partially solve the technical problem in the related art that detection of unknown abnormal behavior of a host lags.
According to a first aspect of the present disclosure, there is provided a method for detecting abnormal behavior of a host user, including: acquiring process data of a host to be detected; extracting user operation characteristics based on the process data to obtain a prediction data set, wherein the user operation characteristics comprise: user features, command features, and file features; and inputting the prediction data set into a pre-trained user abnormal behavior detection model to obtain a detection result output by the user abnormal behavior detection model.
Optionally, the process data includes: at least one of process execution program information, process execution user information and file information operated by a process of a host to be detected.
Optionally, the user characteristics include a user group to which the user belongs.
Optionally, the command features include at least one of: the command type, the character length of the command, the number of parameters in the command, and the proportion of special characters in the command.
Optionally, the file features include at least one of: the file type, the user group to which the file belongs, the file path level, the character length of the file path information, and the proportion of special characters in the file name.
Optionally, extracting the user operation features based on the process data to obtain a prediction data set includes: extracting user information from the process data, and determining user group information of the process data according to a user group mapping table, wherein the user group mapping table comprises a correspondence between user information and user groups; and/or extracting the command name of the command executed by the process from the process data, determining the command type of the process data according to a command type mapping table, and calculating the character length of the command, the number of parameters in the command and the proportion of special characters in the command, wherein the command type mapping table comprises a correspondence between command names and command types; and/or extracting information of an associated file from the process data, determining the file type of the associated file according to a file type mapping table, extracting user information of the associated file, determining the user group of the associated file according to the user group mapping table, and calculating the file path level of the associated file, the character length of the file path information and the proportion of special characters in the file name, wherein the file type mapping table comprises a correspondence between information of associated files and file types.
Optionally, the command type mapping table includes a correspondence between a command type and a first regular expression rule, and a command name matched with the first regular expression rule corresponds to the command type; the file type mapping table comprises a corresponding relation between a file type and a second regular expression rule, and information of an associated file matched with the second regular expression rule corresponds to the file type.
Optionally, the method further comprises: after inputting the prediction data set into the pre-trained user abnormal behavior detection model to obtain the detection result output by the model, determining an abnormal file and/or an abnormal command operated by the user according to the detection result.
Optionally, the detection result indicates either that the host user has abnormal behavior or that the host user does not have abnormal behavior, and determining the abnormal file and/or the abnormal command operated by the user according to the detection result comprises: grouping the detection results by user group and command type to obtain a first detection result group, wherein the first process data corresponding to the first detection result group all share the same first user group and the same first command type, and if the number of abnormal detection results in the first detection result group is greater than the number of normal detection results, determining that users of the first user group are abnormal when executing commands of the first command type; and/or grouping the detection results by user group and file type to obtain a second detection result group, wherein the second process data corresponding to the second detection result group all share the same second user group and the same second file type, and if the number of abnormal detection results in the second detection result group is greater than the number of normal detection results, determining that users of the second user group are abnormal when operating files of the second file type; and/or grouping the detection results by user group, command type and file type to obtain a third detection result group, wherein the third process data corresponding to the third detection result group all share the same third user group, the same third command type and the same third file type, and if the number of abnormal detection results in the third detection result group is greater than the number of normal detection results, determining that users of the third user group are abnormal when executing commands of the third command type to operate files of the third file type.
Optionally, the method further comprises: after determining the abnormal file and/or the abnormal command operated by the user according to the detection result, determining the target risk level of the abnormal behavior of the user according to the abnormal file and/or the abnormal command operated by the user; and outputting the alarm event of the target risk level.
According to a second aspect of the present disclosure, there is also provided a host user abnormal behavior detection apparatus, including: the acquisition module is used for acquiring the process data of the host to be detected; the extraction module is used for extracting user operation characteristics based on the process data to obtain a prediction data set, wherein the user operation characteristics comprise: user features, command features, and file features; and the prediction module is used for inputting the prediction data set into a pre-trained user abnormal behavior detection model to obtain a detection result output by the user abnormal behavior detection model.
According to a third aspect of the present disclosure, there is also provided an electronic device, comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the host user abnormal behavior detection methods provided by the embodiments of the present disclosure via execution of the executable instructions.
According to a fourth aspect of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any one of the host user abnormal behavior detection methods provided by the embodiments of the present disclosure.
According to the method, the device, the storage medium and the electronic equipment for detecting abnormal behavior of a host user provided by the embodiments of the disclosure, process data of the host to be detected is acquired and user operation features are extracted from it. Because the user operation features are composed of user features, command features and file features, they associate the three kinds of features with one another. A trained user abnormal behavior detection model can then detect abnormal behavior of the host user from the user operation features and output a detection result, so that unknown abnormal behavior on the host is detected from the user operation features and the timeliness of host anomaly detection is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
FIG. 1 is a flowchart illustrating a method of detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure;
FIG. 2 is a flowchart illustrating a method of detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure;
FIG. 3 is a flowchart illustrating a method of detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure;
FIG. 4 is a flowchart illustrating a method of detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a device for detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure;
FIG. 6 is a schematic structural view of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
FIG. 1 is a flowchart illustrating a method for detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure. As shown in FIG. 1, the method comprises:
In step S102, process data of a host to be detected is acquired.
In an exemplary embodiment, the process data being executed by the host to be detected at a given moment may be collected. The collected process data may include: process execution program information, such as the name of the program executed by the process; process execution user information, such as the host user name under which the process runs; and information on the files operated by the process. The files operated by the process include, but are not limited to, files opened, created, modified, deleted or copied by the process. The process data at a given moment may be one piece of process data or a plurality of pieces.
In step S104, user operation features are extracted based on the process data to obtain a prediction data set, where the user operation features include user features, command features and file features.
The user operation features can be used to represent the user's file operations. Because files in the host's operating system are the carriers that store system and application information, an attacker cannot avoid operating on files when performing malicious actions. Describing the behavior of the host user in terms of file operations therefore gives a more comprehensive picture, from which unknown abnormal behavior of the host can be detected.
In an exemplary embodiment, the user characteristics may include user attribute characteristics, the command characteristics may include attribute characteristics possessed by a command operated by the user, and the file characteristics may include attribute characteristics of a file operated by the process.
In step S106, the prediction data set is input into a pre-trained user abnormal behavior detection model to obtain a detection result output by the model.
In an exemplary embodiment, the user abnormal behavior detection model may be trained in advance on a training data set, which may include feature data extracted from process data collected from a host in a normal state and in an abnormal state, respectively. The collected process data may include the process execution program name, the process execution user, and information on the files operated by the process. The feature data in the training data set may carry one of two labels, abnormal or normal. The training process of the user abnormal behavior detection model is described in detail later.
According to the method for detecting abnormal behavior of a host user provided by the embodiments of the disclosure, process data of the host to be detected is acquired and user operation features are extracted from it. Because the user operation features are composed of user features, command features and file features, they associate the three kinds of features with one another. A trained user abnormal behavior detection model can then detect abnormal behavior of the host user from the user operation features and output a detection result, so that unknown abnormal behavior on the host is detected from the user operation features and the timeliness of host anomaly detection is improved.
In embodiments of the present disclosure, the user characteristics may include a user group to which the user belongs.
The user may be the process execution user. User information of the process execution user, such as the host user name, may be acquired, and the user group to which the user belongs may be determined from the host user name.
Using the user group of the process execution user as the user feature in the user operation features allows an abnormality detected on the host to be traced accurately to the user group in which it occurs.
In embodiments of the present disclosure, the command features may include at least one of:
the command type, the character length of the command, the number of parameters in the command, and the proportion of special characters in the command.
The command type depends on the running environment of the host, and commands can be divided into different command types in different running environments. For example, in a web server host environment, the command types may include system processes, middleware application processes and database application processes.
The character length of a command refers to the total number of characters in the command, including space characters.
The number of parameters in a command refers to the number of parameters the command contains, which is taken to be the same as the number of space characters in the command.
The proportion of special characters in a command refers to the ratio of the number of special characters in the command to the total number of characters in the command. In embodiments of the present disclosure, special characters may be defined as characters other than "/", the space character " ", and upper- and lower-case letters.
Using at least one of these dimensions as the command features in the user operation features makes it possible to judge whether a user operation is abnormal from the characteristics of the command the user executes.
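The three numeric command features defined above can be sketched in a few lines of Python. This is a minimal illustration rather than the patented implementation: the function name `command_features` and the dictionary keys are hypothetical, and the sketch assumes that "letters" means ASCII letters, so digits and punctuation count as special characters.

```python
import string

# Assumption: a character is "special" unless it is "/", a space,
# or an ASCII upper- or lower-case letter (digits therefore count as special).
NON_SPECIAL = set(string.ascii_letters) | {"/", " "}


def command_features(command: str) -> dict:
    """Compute character length, parameter count and special-character proportion."""
    length = len(command)          # total characters, spaces included
    n_params = command.count(" ")  # parameter count taken as the number of spaces
    n_special = sum(1 for ch in command if ch not in NON_SPECIAL)
    ratio = n_special / length if length else 0.0
    return {"length": length, "n_params": n_params, "special_ratio": ratio}


features = command_features("ls -l /tmp")
print(features)
```

For the command `ls -l /tmp`, the only special character under this definition is the hyphen, so one character in ten is special.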
In embodiments of the present disclosure, the file features may include at least one of:
the file type, the user group information to which the file belongs, the file path level, the character length of the file path information, and the proportion of special characters in the file name.
The file type depends on the running environment of the host. For example, in a Linux environment, files can be classified according to the path where they are located, and the file types may include binary files, configuration files and temporary files.
The user group information to which the file belongs refers to the user group of the user who owns the file.
The file path level refers to the number of folder levels from the root directory to the folder in which the file is located.
The character length of the file path information refers to the total number of characters in the file path information.
The proportion of special characters in the file name refers to the ratio of the number of special characters in the complete path of the file to the number of all valid characters. In embodiments of the present disclosure, special characters may be defined as characters other than "/", the space character " ", and upper- and lower-case letters.
Using at least one of these dimensions as the file features in the user operation features makes it possible to judge whether a user operation is abnormal from the characteristics of the file the user operates on.
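A minimal Python sketch of the numeric file features follows. It is illustrative only and rests on several assumptions that the text does not settle: paths are absolute POSIX paths, the path level counts the folders below the root (excluding the root itself and the file name), and "all valid characters" is read as every character of the complete path. The name `file_features` and the dictionary keys are hypothetical.

```python
import string
from pathlib import PurePosixPath

# Assumption: a character is "special" unless it is "/", a space,
# or an ASCII upper- or lower-case letter.
NON_SPECIAL = set(string.ascii_letters) | {"/", " "}


def file_features(path: str) -> dict:
    """Compute path level, path length and special-character proportion."""
    parts = PurePosixPath(path).parts  # e.g. ('/', 'etc', 'nginx', 'nginx.conf')
    depth = max(len(parts) - 2, 0)     # drop the root and the file name itself
    length = len(path)                 # total characters in the full path
    n_special = sum(1 for ch in path if ch not in NON_SPECIAL)
    ratio = n_special / length if length else 0.0
    return {"depth": depth, "length": length, "special_ratio": ratio}


features = file_features("/etc/nginx/nginx.conf")
print(features)
```

Here the file sits two folders below the root, and the dot in `nginx.conf` is the only special character in the 21-character path.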
In embodiments of the present disclosure, the sample data in the prediction data set may include: the user group, the command type, the character length of the command, the number of parameters in the command, the proportion of special characters in the command, the file type, the user group to which the file belongs, the file path level, the character length of the file path information, and the proportion of special characters in the file name.
In an embodiment of the present disclosure, extracting user operation features based on the process data to obtain a predicted dataset may include:
extracting user information from the process data, and determining user group information of the process data according to a user group mapping table, wherein the user group mapping table comprises a correspondence between user information and user groups; and/or
extracting the command name of the command executed by the process from the process data, determining the command type of the process data according to a command type mapping table, and calculating the character length of the command, the number of parameters in the command and the proportion of special characters in the command, wherein the command type mapping table comprises a correspondence between command names and command types.
In an exemplary embodiment, the user group to which the user belongs may be determined according to the correspondence between user information and user groups in a pre-constructed user group mapping table; for example, the user group mapping table may include a correspondence between host user names and user groups.
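The user group lookup described above amounts to a table lookup keyed by host user name. The sketch below uses a plain dictionary; the table contents, the group names and the `"unknown"` fallback for unmapped users are all hypothetical and not taken from the disclosure.

```python
# Hypothetical user group mapping table: host user name -> user group.
USER_GROUP_TABLE = {
    "root": "admin",
    "www-data": "web",
    "mysql": "database",
}


def user_group(user_name: str, table=USER_GROUP_TABLE, default="unknown") -> str:
    """Return the user group for a host user name, or a default if unmapped."""
    return table.get(user_name, default)


groups = [user_group(name) for name in ("root", "mysql", "alice")]
print(groups)
```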
In an exemplary embodiment, the command name may be extracted from the COMMAND information in the input data set, and the corresponding command type may be determined according to the correspondence between command types and command names in a pre-constructed command type mapping table.
The character length of the command is calculated as the total number of characters (including spaces) in the COMMAND information.
The number of parameters in the command, COM_OPTS, is calculated; it may be equal to the number of space characters contained in the command.
The proportion of special characters in the command, COM_LETTER, is calculated.
The command feature values obtained for each piece of process data may include: the command type, the character length of the command, the number of parameters in the command, and the proportion of special characters in the command.
Extracting user operation features based on the process data to obtain a prediction data set may also include:
extracting information of an associated file from the process data, determining the file type of the associated file according to a file type mapping table, extracting user information of the associated file, determining the user group of the associated file according to the user group mapping table, and calculating the file path level of the associated file, the character length of the file path information and the proportion of special characters in the file name, wherein the file type mapping table comprises a correspondence between information of associated files and file types.
In an exemplary embodiment, the information of the associated file may be a file path, and the corresponding file type may be determined according to the correspondence between file paths and file types in a pre-constructed file type mapping table.
Determining the user group to which the associated file belongs may include: extracting the user information of the associated file in the process data, determining the user group information for each piece of process data according to the correspondence between user information and user groups in the pre-constructed user group mapping table, and adding the corresponding user group information to each piece of process data.
The file path level is calculated, where the file path level refers to the number of folder levels from the root directory to the folder in which the file is located.
The character length of the file path information (also called the file character length) is calculated as the total number of characters in the file path information.
The proportion of special characters in the file name is calculated as the ratio of the number of special characters in the complete path of the file to the number of all valid characters. In embodiments of the present disclosure, special characters may be defined as characters other than "/", the space character " ", and upper- and lower-case letters.
The file feature values obtained for each piece of process data may include: the file type, the user group information to which the file belongs, the file path level, the character length of the file path information, and the proportion of special characters in the file name.
Using the pre-constructed user group mapping table, command type mapping table and file type mapping table to add user information, command information and file information to the data in the prediction data set effectively improves the efficiency of constructing the prediction data set.
In an embodiment of the present disclosure, the command type mapping table may include a correspondence between a command type and a first regular expression rule, where a command name matched with the first regular expression rule corresponds to the command type;
For example, the entry (RULE1, command type 1) indicates that a command whose name matches the first regular expression rule RULE1 is classified as command type 1.
The file type mapping table may include a correspondence between file types and second regular expression rules, where the information of an associated file that matches a second regular expression rule corresponds to that file type.
Taking a file path as the information of the associated file, the entry (RULE1, file type 1) indicates that a file whose path matches the second regular expression rule RULE1 is classified as file type 1.
Expressing the correspondence between command names and command types in the command type mapping table as a correspondence between first regular expression rules and command types allows the commands of each command type to be matched quickly and efficiently. Similarly, expressing the correspondence between file names and file types in the file type mapping table as a correspondence between second regular expression rules and file types allows the files of each file type to be matched quickly and efficiently.
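The two regular-expression mapping tables can be sketched as ordered lists of (rule, type) pairs that are tried in turn. Everything concrete in the sketch is an assumption made for illustration: the rule patterns, the process and directory names, the first-match-wins policy, the `"unknown"` fallback, and the choice to take the command name as the first token of the command line with any directory prefix removed. The type names follow the web server and Linux examples given earlier in the description.

```python
import re

# Hypothetical command type mapping table: (first regular expression rule, command type).
COMMAND_TYPE_RULES = [
    (re.compile(r"^(systemd|sshd|cron)$"), "system process"),
    (re.compile(r"^(nginx|httpd|java)$"), "middleware application process"),
    (re.compile(r"^(mysqld|postgres)$"), "database application process"),
]

# Hypothetical file type mapping table: (second regular expression rule, file type).
FILE_TYPE_RULES = [
    (re.compile(r"^/(usr/)?s?bin/"), "binary file"),
    (re.compile(r"^/etc/"), "configuration file"),
    (re.compile(r"^/tmp/"), "temporary file"),
]


def lookup_type(value: str, rules, default: str = "unknown") -> str:
    """Return the type of the first rule that matches, or a default."""
    for pattern, type_name in rules:
        if pattern.search(value):
            return type_name
    return default


def command_name(command: str) -> str:
    """Assumption: the name is the first token with its directory prefix removed."""
    return command.split(" ", 1)[0].rsplit("/", 1)[-1]


cmd_type = lookup_type(command_name("/usr/sbin/nginx -g daemon"), COMMAND_TYPE_RULES)
file_type = lookup_type("/etc/passwd", FILE_TYPE_RULES)
print(cmd_type, "|", file_type)
```

Keeping the rules in a list rather than a dictionary makes the match order explicit, which matters when more than one rule could match the same name or path.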
FIG. 2 is a flowchart illustrating a method for detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure, as shown in FIG. 2, where the method may further include, based on the method shown in FIG. 1:
In step S202, after the prediction data set is input into a pre-trained abnormal user behavior detection model to obtain a detection result output by the abnormal user behavior detection model, an abnormal file and/or an abnormal command operated by the user are determined according to the detection result.
In an exemplary embodiment, the user abnormal behavior detection model may output a detection result of each piece of process data, and the detection result of each piece of process data may include: the piece of process data is normal, or the piece of process data is abnormal. In the embodiment of the disclosure, the data in the prediction data set may be added with user information, command information and file information, and on the basis of this, the detection result of each piece of process data is combined with the user information, command information and file information corresponding to the piece of process data, so that the user who has abnormal behavior, the file operated by the user who has abnormal behavior and the command executed by the user who has abnormal behavior can be determined.
In an embodiment of the present disclosure, the detection result may include: the host user has abnormal behavior, or the host user does not have abnormal behavior. FIG. 3 is a flowchart illustrating a method for detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure. As shown in FIG. 3, determining, according to the detection result, an abnormal file and/or an abnormal command operated by the user may include:
In step S2022, the detection results are grouped by user group and command type to obtain a first detection result group, where all first process data corresponding to the first detection result group belong to the same first user group and have the same first command type; if the number of abnormal detection results in the first detection result group is greater than the number of normal detection results, it is determined that the users of the first user group are abnormal when executing commands of the first command type; and/or
The first process data is input into the user abnormal behavior detection model and the detection result output by the model is the first detection result, so the first detection result is said to correspond to the first process data.
Optionally, among all the detection results output by the user abnormal behavior detection model, those with the same user group and the same command type may be placed in one group to obtain a first detection result group. For example, if the user group is denoted GID, the command type is denoted COM_TYPE and the detection result is denoted RISK_STATUS, a first detection result group may be denoted GID=GID1, COM_TYPE=COM_TYPE1. If count(GID=GID1, COM_TYPE=COM_TYPE1, RISK_STATUS=abnormal) > count(GID=GID1, COM_TYPE=COM_TYPE1, RISK_STATUS=normal), it may be determined that user group GID1 is in an abnormal state when executing commands of type COM_TYPE1.
In step S2024, the detection results are grouped by user group and file type to obtain a second detection result group, where all second process data corresponding to the second detection result group belong to the same second user group and have the same second file type; if the number of abnormal detection results in the second detection result group is greater than the number of normal detection results, it is determined that the users of the second user group are abnormal when operating files of the second file type; and/or
The second process data is input into the user abnormal behavior detection model and the detection result output by the model is the second detection result, so the second detection result is said to correspond to the second process data.
For example, among all the detection results output by the user abnormal behavior detection model, those with the same user group and the same file type may be placed in one group to obtain a second detection result group. For example, if the user group is denoted GID, the file type is denoted FD_TYPE and the detection result is denoted RISK_STATUS, a second detection result group may be denoted GID=GID1, FD_TYPE=FD_TYPE1. If count(GID=GID1, FD_TYPE=FD_TYPE1, RISK_STATUS=abnormal) > count(GID=GID1, FD_TYPE=FD_TYPE1, RISK_STATUS=normal), user group GID1 is in an abnormal state when operating files of type FD_TYPE1.
In step S2026, the detection results are grouped by user group, command type and file type to obtain a third detection result group, where all third process data corresponding to the third detection result group have the same third user group, the same third file type and the same third command type; if the number of abnormal detection results in the third detection result group is greater than the number of normal detection results, it is determined that the users of the third user group are abnormal when operating files of the third file type in the course of executing commands of the third command type.
The third process data is input into the user abnormal behavior detection model and the detection result output by the model is the third detection result, so the third detection result is said to correspond to the third process data.
For example, among all the detection results output by the user abnormal behavior detection model, those with the same user group, the same command type and the same file type may be placed in one group to obtain a third detection result group. For example, if the user group is denoted GID, the command type is denoted COM_TYPE, the file type is denoted FD_TYPE and the detection result is denoted RISK_STATUS, a third detection result group may be denoted GID=GID1, COM_TYPE=COM_TYPE1, FD_TYPE=FD_TYPE1. If count(GID=GID1, COM_TYPE=COM_TYPE1, FD_TYPE=FD_TYPE1, RISK_STATUS=abnormal) > count(GID=GID1, COM_TYPE=COM_TYPE1, FD_TYPE=FD_TYPE1, RISK_STATUS=normal), user group GID1 is in an abnormal state when operating files of type FD_TYPE1 in the course of executing commands of type COM_TYPE1.
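By way of non-limiting illustration only, the grouping and majority comparison of steps S2022 to S2026 may be sketched in Python as follows. The dictionary layout of each detection result, the sample values and the function name abnormal_groups are assumptions made for this sketch and are not part of the disclosure.

```python
from collections import Counter, defaultdict


def abnormal_groups(results, keys):
    """Group detection results by `keys` and return, in sorted order, the
    groups in which abnormal results outnumber normal ones."""
    counts = defaultdict(Counter)
    for row in results:
        group = tuple(row[k] for k in keys)
        counts[group][row["RISK_STATUS"]] += 1
    return sorted(g for g, c in counts.items() if c["abnormal"] > c["normal"])


# Hypothetical detection results, one dictionary per piece of process data.
results = [
    {"GID": 0, "COM_TYPE": "system", "FD_TYPE": "config", "RISK_STATUS": "abnormal"},
    {"GID": 0, "COM_TYPE": "system", "FD_TYPE": "config", "RISK_STATUS": "abnormal"},
    {"GID": 0, "COM_TYPE": "system", "FD_TYPE": "binary", "RISK_STATUS": "normal"},
    {"GID": 1000, "COM_TYPE": "database", "FD_TYPE": "temp", "RISK_STATUS": "normal"},
]

by_command = abnormal_groups(results, ("GID", "COM_TYPE"))          # step S2022
by_file = abnormal_groups(results, ("GID", "FD_TYPE"))              # step S2024
by_both = abnormal_groups(results, ("GID", "COM_TYPE", "FD_TYPE"))  # step S2026
```

The same helper serves all three steps because the steps differ only in the fields used as the grouping key.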
It should be noted that fig. 3 is only an example of the method for detecting abnormal behavior of the host user including steps S2022 to S2026.
FIG. 4 is a flowchart illustrating a method for detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure, as shown in FIG. 4, where the method may further include, based on the method shown in FIG. 1:
In step S402, after the abnormal file and/or the abnormal command operated by the user is determined according to the detection result, a target risk level of the abnormal behavior of the user is determined according to the abnormal file and/or the abnormal command operated by the user.
In an exemplary embodiment, the anomaly determined from the detection result may be one of the following: the user group is abnormal when executing commands of one or more types; the user group is abnormal when operating files of one or more types; or the user group is abnormal when operating files of one or more types in the course of executing commands of one or more types. These three anomalies may correspond to three different alarm levels, for example the first anomaly to alarm level A, the second anomaly to alarm level B and the third anomaly to alarm level C.
In step S404, an alarm event of the target risk level is output.
In an exemplary embodiment, after the target risk level of the abnormal behavior of the user is determined, the specific abnormal behavior of the user may be output; for example, information on the abnormal user group in the detection result, the command type of the commands executed by the abnormal user group, and/or the file type of the files operated by the abnormal user group may be output. Alternatively, the target risk level may be output at the same time; continuing the above example, alarm level A, alarm level B or alarm level C may be output directly, so that the alarm level gives a simple and clear indication of the abnormal situation that has occurred.
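By way of non-limiting illustration only, the mapping from a determined anomaly to an alarm event may be sketched in Python as follows. The A/B/C assignment follows the example given above; the names ALARM_LEVELS, AlarmEvent and make_alarm are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from the kind of anomaly to an alarm level (example levels).
ALARM_LEVELS = {"command": "A", "file": "B", "command_file": "C"}


@dataclass(frozen=True)
class AlarmEvent:
    level: str
    gid: int
    com_type: Optional[str] = None
    fd_type: Optional[str] = None


def make_alarm(gid, com_type=None, fd_type=None):
    """Build an alarm event whose level depends on which types are abnormal."""
    if com_type is not None and fd_type is not None:
        kind = "command_file"
    elif com_type is not None:
        kind = "command"
    elif fd_type is not None:
        kind = "file"
    else:
        raise ValueError("an anomaly needs a command type, a file type, or both")
    return AlarmEvent(ALARM_LEVELS[kind], gid, com_type, fd_type)
```

Keeping the level table separate from the decision logic allows the levels to be redefined for a given application scenario without touching the rest of the code.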
The following describes a training process of the user abnormal behavior detection model in the embodiment of the present disclosure.
Construction and processing of training data sets:
a history process record data set of the host is constructed, and the name of an execution program of the host process, the execution user of the process and the file information opened by the process are respectively extracted under the normal state and the abnormal state of the host;
for example, all executing process information in the system can be acquired, including process USER, PID, COMMAND information, where USER is a process execution USER, PID is a process number, and COMMAND is a complete COMMAND of process execution;
Acquiring file information opened by a process by utilizing all the acquired PIDs;
determining a user to which the file belongs;
repeating the steps at preset time intervals, such as 10 minutes, and obtaining the process information of the host in a certain time range;
adjusting the host to be in a risk state, such as by having the host execute malicious software, and recording information on the newly added processes, where this information may likewise include the process execution user, the process number and the complete command executed by the process.
Adding a normal or abnormal risk state label to the data in the history process record data set of the host according to the host state;
extracting the corresponding relation between the host user name and the user group, and storing the corresponding relation in a user group mapping table;
constructing a command type mapping table: inputting all process execution commands in the data of the history process record data set of the host, and extracting the command name part; defining command types, such as command type 1, command type 2 … command type n, according to the running environment of the host; defining a regular expression rule corresponding to each command type; and forming a complete command type mapping table, where the command type mapping table may include the correspondence between regular expressions and command types, and a command name conforming to the regular expression rule of a certain command type corresponds to that command type.
Constructing a file type mapping table: inputting associated file information of all processes in data in a history process record data set of a host, and removing repeated information to form a process file set; defining file types, such as file type 1 and file type 2 … file type n, according to the running environment condition of the host; defining a regular expression rule corresponding to each file type; and forming a complete file type mapping table, wherein the file type mapping table can comprise a corresponding relation between the regular expression and the file type, and the file name conforming to the regular expression rule corresponding to a certain file type corresponds to the file type.
And adding user group, command type and file type information to each piece of data in the history process record data set of the host according to the user group mapping table, the command type mapping table and the file type mapping table to obtain a training data set.
The following is an example of information extraction using a Linux system host:
executing the ps -aux command to acquire information on all executing processes in the system, including the process USER, PID and COMMAND, where USER is the process execution user, PID is the process number, and COMMAND is the complete command executed by the process (including the command name and the command parameters), for example USER=root, PID=1234, COMMAND=/bin/bash /bin/app1;
using each obtained PID in turn, executing the command ls -l /proc/[PID]/fd/; for example, if PID=1234, executing ls -l /proc/1234/fd/ to obtain information on the files opened by the process with process number 1234, the result being the file paths FDINFO1=/var/lib/app1/data1, FDINFO2=/var/lib/app1/data2 …;
acquiring the user to which each file belongs: executing ls -l [FDINFO1], such as ls -l /var/lib/app1/data1, and extracting the owning user FD_OWNER from the result;
the host process information at a certain moment TIME1 is thus obtained as follows:
{TIME1, USER, PID, COMMAND, FDINFO, FD_OWNER}
repeating the above steps at a specific interval, such as 10 minutes, to obtain the process information of the host within a certain time range: [{TIME1, USER, PID, COMMAND, FDINFO, FD_OWNER}, {TIME2, USER, PID, COMMAND, FDINFO, FD_OWNER}, {TIME3, USER, PID, COMMAND, FDINFO, FD_OWNER} …];
the host is adjusted to be in a risk state, such as malicious software is executed, and information of a new process is recorded;
and adding a risk state label RISK_STATUS to the data in the training data set according to the host state to obtain {TIME1, USER, PID, COMMAND, FDINFO, FD_OWNER, RISK_STATUS}, where RISK_STATUS = normal/abnormal.
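By way of non-limiting illustration only, the collection of USER, PID, COMMAND and opened-file information on a Linux host may be sketched in Python as follows. The sketch parses text in the usual eleven-column ps output layout and reads the descriptor links under /proc directly instead of calling ls; the function names parse_ps_aux and open_files and the sample text are assumptions made for this sketch.

```python
import os


def parse_ps_aux(text):
    """Parse ps-style output into USER / PID / COMMAND records."""
    records = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 10)           # COMMAND is the 11th column
        if len(fields) < 11:
            continue                            # ignore malformed rows
        records.append(
            {"USER": fields[0], "PID": int(fields[1]), "COMMAND": fields[10]}
        )
    return records


def open_files(pid):
    """Return the targets of the descriptors under /proc/<pid>/fd.

    Returns an empty list if the process is gone or cannot be inspected.
    """
    fd_dir = f"/proc/{pid}/fd"
    paths = []
    try:
        for fd in os.listdir(fd_dir):
            try:
                paths.append(os.readlink(os.path.join(fd_dir, fd)))
            except OSError:
                continue  # descriptor closed between listdir and readlink
    except OSError:
        pass
    return paths


# Hypothetical sample of ps output used to exercise the parser.
sample = """USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1234 0.0 0.1 10000 2000 ? Ss 10:00 0:00 /bin/bash /usr/bin/app1
"""
records = parse_ps_aux(sample)
```

In practice the text passed to parse_ps_aux could come from running the ps command through the subprocess module, and each record could then be extended with the result of open_files for its PID.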
Constructing a user group mapping table:
Extracting the correspondence between host user names and GIDs (user group IDs); for example, executing cat /etc/group yields GID=0 for the user root; the correspondence is stored in a user group mapping table GID_MAP = [{USER1, GID1}, {USER2, GID1} …];
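By way of non-limiting illustration only, a user group mapping table may be built from text in the /etc/group format (group_name:password:GID:member_list) as sketched below in Python. How a user who appears in several groups is resolved is not specified above; the two-pass rule used here, the function name build_gid_map and the sample text are assumptions made for this sketch.

```python
def build_gid_map(group_text):
    """Build a {user name: GID} table from /etc/group-style text."""
    entries = []
    for line in group_text.splitlines():
        parts = line.strip().split(":")
        if len(parts) < 4 or not parts[2].isdigit():
            continue  # skip blank, comment or malformed lines
        members = [m for m in parts[3].split(",") if m]
        entries.append((parts[0], int(parts[2]), members))

    gid_map = {}
    # Pass 1 (assumption): a user whose name equals a group name, such as
    # root, is mapped to that group's GID.
    for name, gid, _ in entries:
        gid_map.setdefault(name, gid)
    # Pass 2 (assumption): any other listed member is mapped to the first
    # group that lists it.
    for _, gid, members in entries:
        for user in members:
            gid_map.setdefault(user, gid)
    return gid_map


# Hypothetical /etc/group content.
SAMPLE_GROUP = """root:x:0:
adm:x:4:syslog,alice
alice:x:1000:
"""
GID_MAP = build_gid_map(SAMPLE_GROUP)
```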
constructing a command type mapping table:
inputting the commands executed by all processes in the data and extracting the command name part; for example, splitting the COMMAND information on spaces and taking the first part as the command name, so that for COMMAND=/bin/bash /usr/bin/app1 the command name is /bin/bash; after repeated information is removed, a process command name set [COM_NAME1, COM_NAME2, COM_NAME3 …] is formed;
defining command types COM_TYPE [command type 1, command type 2, command type 3 …] according to the running environment of the host; for example, in a web server host environment, COM_TYPE [system process, middleware application process, database application process …] may be defined;
defining a corresponding regular expression rule for each command type, such as (RULE1, command type 1), meaning that an executed command name matching rule RULE1 is classified as command type 1;
forming a complete command type mapping table {(RULE1, command type 1), (RULE2, command type 2) … (RULEx, command type n)}.
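By way of non-limiting illustration only, a command type mapping table of (rule, command type) pairs may be sketched in Python with the standard re module as follows. The concrete regular expressions, the first-match-wins ordering and the fallback type "unknown" are assumptions made for this sketch; actual rules would be defined for the running environment of the host.

```python
import re

# Hypothetical rules: (regular expression, command type). First match wins,
# so the more specific rules are listed before the generic system rule.
COMMAND_TYPE_MAP = [
    (re.compile(r"(nginx|tomcat|httpd)"), "middleware application process"),
    (re.compile(r"(mysqld|postgres)"), "database application process"),
    (re.compile(r"^/(usr/)?s?bin/"), "system process"),
]


def command_name(command):
    """Return the first space-separated part of a complete command."""
    parts = command.split()
    return parts[0] if parts else ""


def command_type(command, rules=COMMAND_TYPE_MAP, default="unknown"):
    """Classify a complete command by matching its command name."""
    name = command_name(command)
    for pattern, com_type in rules:
        if pattern.search(name):
            return com_type
    return default
```

For instance, command_name("/bin/bash /usr/bin/app1") yields /bin/bash, which the last rule classifies as a system process.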
Constructing a file type mapping table:
inputting the associated file information FDINFO of all processes in the data, and removing repeated information to form a process file set [FDINFO1, FDINFO2, FDINFO3 …];
defining file types, such as file type 1, file type 2 … file type n, according to the running environment of the host; for example, in a Linux environment, file types [binary file, configuration file, temporary file …] may be defined according to the path of the file;
defining a regular expression rule corresponding to each file type, such as (RULE1, file type 1), meaning that a file matching rule RULE1 is classified as file type 1;
forming a complete file type mapping table {(RULE1, file type 1), (RULE2, file type 2) … (RULEx, file type n)}.
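By way of non-limiting illustration only, the de-duplication of the process file set and a path-based file type mapping table may be sketched in Python as follows. The path patterns and the fallback type "other" are assumptions made for this sketch and would be adapted to the host environment.

```python
import re

# Hypothetical path-based rules: (regular expression, file type).
FILE_TYPE_MAP = [
    (re.compile(r"^/(usr/)?(s?bin|lib(64)?)/"), "binary file"),
    (re.compile(r"^/etc/"), "configuration file"),
    (re.compile(r"^(/var)?/tmp/"), "temporary file"),
]


def unique_files(fdinfo_list):
    """Remove repeated paths while keeping first-seen order."""
    return list(dict.fromkeys(fdinfo_list))


def file_type(path, rules=FILE_TYPE_MAP, default="other"):
    """Return the file type of the first rule matching the start of `path`."""
    for pattern, fd_type in rules:
        if pattern.match(path):
            return fd_type
    return default
```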
And adding user group, command type and file type information for each piece of data in the history process record data set of the host according to the user group mapping table, the command type mapping table and the file type mapping table.
The following describes the extraction of user features in the training process of the abnormal behavior detection model of the user:
inputting a data set [{TIME1, USER, PID, COMMAND, FDINFO, FD_OWNER}, {TIME2, USER, PID, COMMAND, FDINFO, FD_OWNER}, {TIME3, USER, PID, COMMAND, FDINFO, FD_OWNER} …], and extracting the data piece by piece for feature analysis;
User group mapping: extracting the USER information in the data, and adding the corresponding user group information to each piece of data by using the user group mapping table GID_MAP, to obtain {TIME1, USER, PID, COMMAND, FDINFO, FD_OWNER, GID}.
Command feature extraction:
command type: inputting COMMAND information of a data set, extracting a COMMAND name part, and obtaining a corresponding COMMAND TYPE COM_TYPE according to a COMMAND TYPE mapping table;
command character length COM_LENGTH: inputting the COMMAND information and calculating the total number of characters in the command (including spaces); for example, for COMMAND=/bin/bash /usr/bin/app1, COM_LENGTH=23;
number of parameters in the command COM_OPTS: COM_OPTS is equal to the number of space characters contained in COMMAND; for example, for COMMAND=/bin/bash /usr/bin/app1, COM_OPTS=1;
command special-character ratio COM_LETTER, calculated according to the following formula:
COM_LETTER = number of special characters in COMMAND / (COM_LENGTH - COM_OPTS);
with alphabetic characters (upper and lower case) in COMMAND not counted as special; for example, for COMMAND=/bin/bash /bin/app2-2.3, COM_LETTER = 5/(23-1) ≈ 0.23;
corresponding command characteristic values (COM_TYPE, COM_LENGTH, COM_OPTS, COM_LETTER) in each piece of data are obtained, and are added into a data set to obtain { TIME1, USER, PID, COMMAND (COM_TYPE, COM_LENGTH, COM_OPTS, COM_LETTER), FDINFO, FD_OWNER, GID }.
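By way of non-limiting illustration only, the calculation of COM_LENGTH, COM_OPTS and COM_LETTER may be sketched in Python as follows. The text does not define "special character" precisely; the sketch assumes, consistently with the worked value 5/(23-1), that a special character is any character that is not a letter, not "/" and not whitespace. The function names are hypothetical.

```python
def is_special(ch):
    # Assumption inferred from the worked example: letters, "/" and
    # whitespace are not special; digits and other punctuation are.
    return not (ch.isalpha() or ch == "/" or ch.isspace())


def command_features(command):
    """Compute the numeric command features of one complete command."""
    com_length = len(command)      # total characters, spaces included
    com_opts = command.count(" ")  # number of space characters
    special = sum(1 for ch in command if is_special(ch))
    denominator = com_length - com_opts
    com_letter = special / denominator if denominator else 0.0
    return {
        "COM_LENGTH": com_length,
        "COM_OPTS": com_opts,
        "COM_LETTER": round(com_letter, 2),
    }


features = command_features("/bin/bash /bin/app2-2.3")
```

Under the stated assumption, the five special characters of this command are 2, -, 2, . and 3, which reproduces the ratio of approximately 0.23.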
Extracting file characteristics:
file type: inputting FDINFO information of a data set, and obtaining a corresponding file TYPE FD_TYPE according to a file TYPE mapping table;
user mapping to which the file belongs: extracting FD_OWNER in the data, and adding corresponding user group information for each piece of data by utilizing a user group mapping table GID_MAP to obtain FD_OWNER_GID;
file path level FD_LEVEL: equal to the number of "/" characters contained in FDINFO; for example, for FDINFO=/var/lib/app1/data1, FD_LEVEL=4;
file name character length FD_LENGTH: the total number of characters in FDINFO; for example, for FDINFO=/var/lib/app1/data1, FD_LENGTH=19;
file name special-character ratio FD_LETTER, calculated according to the following formula:
FD_LETTER = number of special characters in FDINFO / (FD_LENGTH - FD_LEVEL);
for example, for FDINFO=/var/lib/app1/data1, where the number of letters is 13, FD_LETTER = 2/(19-4) ≈ 0.13;
obtaining corresponding file characteristic values (FD_TYPE, FD_OWNER_GID, FD_LEVEL, FD_LENGTH and FD_LETTER) in each piece of data;
in combination with the command features, processed sample data (GID, COM_TYPE, COM_LENGTH, COM_OPTS, COM_LETTER, FD_TYPE, FD_OWNER_GID, FD_LEVEL, FD_LENGTH, FD_LETTER, RISK_STATUS) are obtained.
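By way of non-limiting illustration only, the calculation of FD_LEVEL, FD_LENGTH and FD_LETTER may be sketched in Python as follows. As an assumption consistent with the worked value 2/(19-4), a special character is taken to be any character that is not a letter, not "/" and not whitespace. The function name file_features is hypothetical.

```python
def file_features(fdinfo):
    """Compute the numeric file features of one file path."""
    fd_level = fdinfo.count("/")  # path hierarchy depth
    fd_length = len(fdinfo)       # total characters in the path
    # Assumption: special = neither letter, nor "/", nor whitespace.
    special = sum(
        1 for ch in fdinfo if not (ch.isalpha() or ch == "/" or ch.isspace())
    )
    denominator = fd_length - fd_level
    fd_letter = special / denominator if denominator else 0.0
    return {
        "FD_LEVEL": fd_level,
        "FD_LENGTH": fd_length,
        "FD_LETTER": round(fd_letter, 2),
    }


path_features = file_features("/var/lib/app1/data1")
```

These values, together with FD_TYPE, FD_OWNER_GID, GID and the command features, would make up one row of the processed sample data.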
When the user abnormal behavior detection model is used for prediction, the extraction of the user operation features in the prediction data set is consistent with the extraction process of the user operation features in the training data, and will not be described herein.
After the sample data is obtained, each field of each piece of sample data is normalized; the normalized sample data is input into a machine learning model, such as a neural network model or an SVM (support vector machine) model, for training; and the trained user abnormal behavior detection model is output.
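By way of non-limiting illustration only, the normalization step may be sketched in Python as follows. The normalization method is not specified above; min-max scaling of each field to the range [0, 1] is an assumption, as is the premise that categorical fields such as COM_TYPE and FD_TYPE have already been encoded as numbers. The sample rows are hypothetical.

```python
def min_max_normalize(rows):
    """Scale every column of a numeric table to the range [0, 1].

    A column whose values are all equal is mapped to 0.0.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ]
        for row in rows
    ]


# Hypothetical rows of (COM_LENGTH, COM_OPTS, COM_LETTER).
rows = [[23, 1, 0.23], [10, 0, 0.0], [36, 2, 0.46]]
normalized = min_max_normalize(rows)
```

The normalized rows, paired with their RISK_STATUS labels, could then be passed to whichever classifier is chosen for training; the scaling bounds computed on the training data would need to be reused when normalizing the prediction data set.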
The following describes a process of identifying user abnormal behavior of a host to be detected by using a user abnormal behavior detection model:
collecting process data of the host to be detected at a certain moment, where the process data may include the name of the program executed by the host process, the process execution user and the information on files opened by the process;
referring to the extraction of user characteristics in the model training process, extracting characteristics of the collected host data to obtain a test data set;
the test data set is, for example, [(GID1, COM_TYPE, COM_LENGTH, COM_OPTS, COM_LETTER, FD_TYPE, FD_OWNER_GID, FD_LEVEL, FD_LENGTH, FD_LETTER), (GID2, COM_TYPE, COM_LENGTH, COM_OPTS, COM_LETTER, FD_TYPE, FD_OWNER_GID, FD_LEVEL, FD_LENGTH, FD_LETTER) …];
And inputting the test data set into the trained user abnormal behavior detection model to obtain a risk state value corresponding to each process data.
With the risk state value RISK_STATUS added, each piece of data becomes [(GID1, COM_TYPE, COM_LENGTH, COM_OPTS, COM_LETTER, FD_TYPE, FD_OWNER_GID, FD_LEVEL, FD_LENGTH, FD_LETTER, RISK_STATUS), (GID2, COM_TYPE, COM_LENGTH, COM_OPTS, COM_LETTER, FD_TYPE, FD_OWNER_GID, FD_LEVEL, FD_LENGTH, FD_LETTER, RISK_STATUS) …].
The following describes a procedure for performing abnormality determination based on a detection result output by the user abnormal behavior detection model.
Grouping the detection results output by the user abnormal behavior detection model by GID, COM_TYPE and RISK_STATUS; if count(GID=GID1, COM_TYPE=COM_TYPE1, RISK_STATUS=abnormal) > count(GID=GID1, COM_TYPE=COM_TYPE1, RISK_STATUS=normal), user group GID1 is in an abnormal state when executing commands of type COM_TYPE1, and a level A alarm event is output;
grouping the detection results output by the user abnormal behavior detection model by GID, FD_TYPE and RISK_STATUS; if count(GID=GID1, FD_TYPE=FD_TYPE1, RISK_STATUS=abnormal) > count(GID=GID1, FD_TYPE=FD_TYPE1, RISK_STATUS=normal), user group GID1 is in an abnormal state when operating files of type FD_TYPE1, and a level B alarm event is output;
grouping the detection results output by the user abnormal behavior detection model by GID, COM_TYPE, FD_TYPE and RISK_STATUS; if count(GID=GID1, COM_TYPE=COM_TYPE1, FD_TYPE=FD_TYPE1, RISK_STATUS=abnormal) > count(GID=GID1, COM_TYPE=COM_TYPE1, FD_TYPE=FD_TYPE1, RISK_STATUS=normal), user group GID1 is in an abnormal state when operating files of type FD_TYPE1 in the course of executing commands of type COM_TYPE1, and a level C alarm event is output;
the alarm level A, B, C can be defined by the application scenario.
Fig. 5 is a schematic structural diagram of a device for detecting abnormal behavior of a host user according to an exemplary embodiment of the present disclosure, and as shown in fig. 5, the device 510 includes:
an obtaining module 512, configured to obtain process data of a host to be detected;
an extracting module 514, configured to extract user operation features based on the process data, to obtain a predicted data set, where the user operation features include: user features, command features, and file features;
and the prediction module 516 is configured to input the prediction data set into a pre-trained abnormal user behavior detection model, and obtain a detection result output by the abnormal user behavior detection model.
In an embodiment of the present disclosure, the process data may include: at least one of process execution program information, process execution user information and file information operated by a process of a host to be detected.
In embodiments of the present disclosure, the user characteristics may include a user group to which the user belongs.
In embodiments of the present disclosure, the command features may include at least one of:
the type of the command, the character length of the command, the number of parameters in the command, and the special-character ratio in the command.
In embodiments of the present disclosure, the file features may include at least one of:
file type, user group to which the file belongs, file path hierarchy, file path information character length, and special-character ratio in the file name.
In an embodiment of the disclosure, the extraction module is specifically configured to:
extracting user information of the process data, and determining user group information of the process data according to a user group mapping table, wherein the user group mapping table comprises a corresponding relation between the user information and a user group;
and/or
Extracting a command name of a process execution command of the process data, determining a command type of the process data according to a command type mapping table, and calculating a character length of the command, the number of parameters in the command and a special-character ratio, wherein the command type mapping table comprises a corresponding relation between the command name and the command type;
And/or
Extracting information of an associated file of the process data, determining a file type of the process data according to a file type mapping table, extracting user information of the associated file in the process data, determining a user group of the associated file according to a user group mapping table, and calculating a file path hierarchy of the process data, a character length of the file path information and a special-character ratio in a file name, wherein the file type mapping table comprises a corresponding relation between the information of the associated file and the file type.
In an embodiment of the present disclosure, the command type mapping table includes a correspondence between a command type and a first regular expression rule, and a command name matched with the first regular expression rule corresponds to the command type; the file type mapping table comprises a corresponding relation between a file type and a second regular expression rule, and information of an associated file matched with the second regular expression rule corresponds to the file type.
In an embodiment of the present disclosure, the host user abnormal behavior detection apparatus may further include:
the first determining module is used for determining an abnormal file and/or an abnormal command operated by a user according to the detection result after inputting the prediction data set into a pre-trained user abnormal behavior detection model to obtain the detection result output by the user abnormal behavior detection model.
In an embodiment of the present disclosure, the detection result may include: the host user has abnormal behavior or the host user does not have abnormal behavior, and the first determining module is specifically configured to:
grouping the detection results according to a user group and a command type to obtain a first detection result group, wherein the first user group to which first process data corresponding to the first detection result group belongs and the first command type are consistent, and if the number of abnormal detection results in the first detection result group is greater than the number of normal detection results, determining that the user of the first user group is abnormal when executing the command of the first command type; and/or
Grouping detection results according to the user groups and the file types to obtain a second detection result group, wherein the second user groups and the second file types, to which second process data corresponding to the second detection result group belong, are consistent, and if the number of abnormal detection results in the second detection result group is greater than the number of normal detection results, determining that the user of the second user group is abnormal when operating the file of the second file type; and/or
And grouping the detection results according to the user group, the command type and the file type to obtain a third detection result group, wherein a third user group, a third file type and a third command type of third process data corresponding to the third detection result group are consistent, and if the number of abnormal detection results in the third detection result group is greater than the number of normal detection results, determining that the user of the third user group is abnormal when operating the file of the third file type in the process of executing the command of the third command type.
In an embodiment of the present disclosure, the host user abnormal behavior detection apparatus may further include:
the second determining module is used for determining a target risk level of abnormal behaviors of the user according to the abnormal files and/or abnormal commands operated by the user after determining the abnormal files and/or abnormal commands operated by the user according to the detection result;
and the output module is used for outputting the alarm event of the target risk level.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be referred to herein as a "circuit," "module," or "system."
As shown in fig. 6, the electronic device 600 is in the form of a general purpose computing device. Components of electronic device 600 may include, but are not limited to: the at least one processing unit 610, the at least one memory unit 620, and a bus 630 that connects the various system components, including the memory unit 620 and the processing unit 610.
The storage unit stores program code that is executable by the processing unit 610, such that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the above "exemplary methods" section of the present specification. For example, the processing unit 610 may perform, as shown in fig. 1, step S102: acquiring process data of a host to be detected; step S104: extracting user operation characteristics based on the process data to obtain a prediction data set, wherein the user operation characteristics comprise: user features, command features, and file features; and step S106: inputting the prediction data set into a pre-trained user abnormal behavior detection model to obtain a detection result output by the user abnormal behavior detection model.
The storage unit 620 may include readable media in the form of volatile storage units, such as Random Access Memory (RAM) 6201 and/or cache memory unit 6202, and may further include Read Only Memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may represent one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, and a processing unit bus or local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 660. As shown, network adapter 660 communicates with other modules of electronic device 600 over bus 630. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
In an exemplary embodiment of the present disclosure, a computer-readable storage medium having stored thereon a program product capable of implementing the method described above in the present specification is also provided. In some possible embodiments, the various aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the "exemplary methods" section of this specification, when said program product is run on the terminal device.
A program product for implementing the above method according to an embodiment of the present invention is described, which may employ a portable compact disc read-only memory (CD-ROM) and comprise program code and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++ and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
Furthermore, although the steps of the methods in the present disclosure are depicted in a particular order in the drawings, this does not require or imply that the steps must be performed in that particular order or that all illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (for example, a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes several instructions to cause a computing device (for example, a personal computer, a server, a mobile terminal, or a network device) to perform the method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (13)

1. A method for detecting abnormal behavior of a host user, comprising:
acquiring process data of a host to be detected;
extracting user operation characteristics based on the process data to obtain a prediction data set, wherein the user operation characteristics comprise: user features, command features, and file features;
inputting the prediction data set into a pre-trained user abnormal behavior detection model to obtain a detection result output by the user abnormal behavior detection model, wherein the user abnormal behavior detection model is obtained by inputting normalized sample user operation characteristics into a machine learning model for training, and the machine learning model is a neural network model or a support vector machine model;
wherein extracting user operation characteristics based on the process data to obtain the prediction data set comprises:
extracting information of an associated file of the process data, determining a file type of the associated file according to a file type mapping table, extracting user information of the associated file, determining a user group of the associated file according to a user group mapping table, and calculating a file path level of the associated file, a character length of the file path information and a proportion of special characters in a file name, wherein:
the file type mapping table comprises the corresponding relation between the information of the associated file and the file type and the corresponding relation between the file path and the file type;
the user group mapping table comprises a corresponding relation between user information and a user group;
the file path level is the number of levels from the root directory to the folder in which the file is located;
the character length of the file path information is the total number of characters in the file path information;
the special characters are characters other than "/", the space character " ", and upper-case and lower-case letters.
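By way of non-limiting illustration, the three numeric file features recited in claim 1 could be computed roughly as in the following Python sketch. The function name `file_path_features`, the dictionary keys, the restriction to POSIX-style absolute paths, the treatment of "letters" as ASCII letters, and the choice not to count the root directory itself as a level are all assumptions made for this sketch and are not taken from the disclosure.

```python
import posixpath
import string

# Characters excluded from the "special" set per claim 1: "/", the space
# character, and upper- and lower-case letters (assumed ASCII here).
# Everything else, including digits and punctuation, counts as special.
NON_SPECIAL = set(string.ascii_letters) | {"/", " "}


def file_path_features(path: str) -> dict:
    """Compute path level, path length and special-character proportion."""
    directory, file_name = posixpath.split(path)
    # Number of directory levels between the root and the containing folder
    # (assumption: the root itself is not counted).
    level = len([part for part in directory.split("/") if part])
    special = sum(1 for ch in file_name if ch not in NON_SPECIAL)
    ratio = special / len(file_name) if file_name else 0.0
    return {
        "path_level": level,
        "path_length": len(path),      # total characters in the path
        "special_char_ratio": ratio,   # computed over the file name only
    }


features = file_path_features("/var/log/auth.log")
```

For the path `/var/log/auth.log`, this sketch yields a path level of 2, a path length of 17, and a special-character proportion of 0.125 (one "." among the eight characters of `auth.log`).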
2. The method of claim 1, wherein the process data comprises:
at least one of process execution program information, process execution user information and file information operated by a process of the host to be detected.
3. The method of claim 1, wherein the user characteristics comprise a group of users to which the user belongs.
4. The method of claim 1, wherein the command features include at least one of:
the type of the command, the character length of the command, the number of parameters in the command, and the proportion of special characters in the command.
5. The method of claim 1, wherein the file characteristics include at least one of:
file type, user group to which the file belongs, file path level, character length of the file path information, and proportion of special characters in the file name.
6. The method of claim 1, wherein extracting user operation characteristics based on the process data to obtain the prediction data set comprises:
extracting user information of the process data, and determining user group information of the process data according to a user group mapping table, wherein the user group mapping table comprises a corresponding relation between the user information and a user group;
and/or
extracting a command name of a process execution command of the process data, determining a command type of the process data according to a command type mapping table, and calculating the character length of the command, the number of parameters in the command and the proportion of special characters in the command, wherein the command type mapping table comprises a corresponding relation between the command name and the command type.
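As a hedged illustration only, the command features recited in claim 6 could be derived from a raw command line along the following lines. The mapping table contents, the function name `command_features`, the use of shell-style tokenization, and the reuse of the special-character definition from claim 1 for commands are assumptions of this sketch; the claims do not state how a command is split into parameters or which characters count as special in a command.

```python
import shlex
import string

# Hypothetical command type mapping table (command name -> command type).
COMMAND_TYPE_TABLE = {
    "cat": "file_read",
    "rm": "file_delete",
    "ssh": "remote_access",
}

# Assumption: the special-character definition of claim 1 is reused here.
NON_SPECIAL = set(string.ascii_letters) | {"/", " "}


def command_features(command_line: str) -> dict:
    """Extract command type, length, parameter count and special-char ratio."""
    # shlex.split applies POSIX shell quoting rules; it raises ValueError
    # on unbalanced quotes, which a real system would need to handle.
    tokens = shlex.split(command_line)
    # Strip any leading directory so "/bin/rm" and "rm" map to the same name.
    name = tokens[0].rsplit("/", 1)[-1] if tokens else ""
    special = sum(1 for ch in command_line if ch not in NON_SPECIAL)
    return {
        "command_type": COMMAND_TYPE_TABLE.get(name, "unknown"),
        "command_length": len(command_line),
        "parameter_count": max(len(tokens) - 1, 0),
        "special_char_ratio": special / len(command_line) if command_line else 0.0,
    }


features = command_features("rm -rf /tmp/a1")
```

For `rm -rf /tmp/a1`, the sketch reports the hypothetical type `file_delete`, a length of 14, two parameters, and two special characters ("-" and "1") out of 14.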
7. The method of claim 6, wherein:
the command type mapping table comprises a corresponding relation between a command type and a first regular expression rule, and a command name matched with the first regular expression rule corresponds to the command type;
the file type mapping table comprises a corresponding relation between a file type and a second regular expression rule, and information of an associated file matched with the second regular expression rule corresponds to the file type.
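The regular-expression-based mapping tables of claim 7 might be represented, purely as an illustrative sketch, as ordered lists of (type, pattern) pairs that are scanned until the first match. The specific type names, the patterns, the `lookup_type` helper, the first-match-wins policy and the `"other"` fallback are all hypothetical and are not specified in the claims.

```python
import re

# Hypothetical command type mapping table: command type -> first regular
# expression rule, matched against the command name.
COMMAND_TYPE_RULES = [
    ("network", re.compile(r"^(curl|wget|ssh|scp)$")),
    ("file_operation", re.compile(r"^(cp|mv|rm|cat)$")),
]

# Hypothetical file type mapping table: file type -> second regular
# expression rule, matched against the associated file's path information.
FILE_TYPE_RULES = [
    ("log", re.compile(r"\.log$")),
    ("config", re.compile(r"^/etc/|\.conf$")),
]


def lookup_type(value: str, rules, default: str = "other") -> str:
    """Return the type of the first rule whose pattern matches the value."""
    for type_name, pattern in rules:
        if pattern.search(value):
            return type_name
    return default


command_type = lookup_type("wget", COMMAND_TYPE_RULES)
file_type = lookup_type("/etc/passwd", FILE_TYPE_RULES)
```

Keeping the rules in an ordered list makes the precedence between overlapping patterns explicit, which is one possible design choice among several.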
8. The method according to claim 1, wherein the method further comprises:
after inputting the prediction data set into the pre-trained user abnormal behavior detection model to obtain the detection result output by the user abnormal behavior detection model, determining an abnormal file and/or an abnormal command operated by a user according to the detection result.
9. The method of claim 8, wherein the detection result comprises: the host user has abnormal behavior or the host user does not have abnormal behavior; and wherein determining the abnormal file and/or the abnormal command operated by the user according to the detection result comprises:
grouping the detection results according to user group and command type to obtain a first detection result group, wherein the first process data corresponding to the first detection result group all belong to the same first user group and the same first command type, and if the number of abnormal detection results in the first detection result group is greater than the number of normal detection results, determining that users of the first user group are abnormal when executing commands of the first command type; and/or
grouping the detection results according to user group and file type to obtain a second detection result group, wherein the second process data corresponding to the second detection result group all belong to the same second user group and the same second file type, and if the number of abnormal detection results in the second detection result group is greater than the number of normal detection results, determining that users of the second user group are abnormal when operating files of the second file type; and/or
grouping the detection results according to user group, command type and file type to obtain a third detection result group, wherein the third process data corresponding to the third detection result group all belong to the same third user group, the same third file type and the same third command type, and if the number of abnormal detection results in the third detection result group is greater than the number of normal detection results, determining that users of the third user group are abnormal when a process executing a command of the third command type operates a file of the third file type.
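The grouping and majority comparison recited in claim 9 can be sketched in Python as follows. The record layout (dictionaries with `user_group`, `command_type`, `file_type` and a boolean `abnormal` field) and the function name `abnormal_groups` are assumptions made for illustration; only the rule that a group is flagged when its abnormal results strictly outnumber its normal results comes from the claim.

```python
from collections import Counter, defaultdict


def abnormal_groups(records, keys):
    """Return the groups whose abnormal results outnumber normal results.

    records: iterable of dicts, one per per-process detection result.
    keys: the fields to group by, e.g. ("user_group", "command_type").
    """
    counts = defaultdict(Counter)
    for record in records:
        group = tuple(record[key] for key in keys)
        counts[group][bool(record["abnormal"])] += 1
    # Strictly greater: a tie between abnormal and normal is not flagged.
    return sorted(g for g, c in counts.items() if c[True] > c[False])


# Hypothetical detection results for five processes.
records = [
    {"user_group": "ops", "command_type": "file_delete", "file_type": "log", "abnormal": True},
    {"user_group": "ops", "command_type": "file_delete", "file_type": "log", "abnormal": True},
    {"user_group": "ops", "command_type": "file_delete", "file_type": "config", "abnormal": False},
    {"user_group": "dev", "command_type": "file_read", "file_type": "log", "abnormal": False},
    {"user_group": "dev", "command_type": "file_read", "file_type": "log", "abnormal": True},
]

by_command = abnormal_groups(records, ("user_group", "command_type"))
by_file = abnormal_groups(records, ("user_group", "file_type"))
by_both = abnormal_groups(records, ("user_group", "command_type", "file_type"))
```

With this sample data, the `ops` group is flagged for `file_delete` commands and for `log` files, while the `dev` group is not flagged because its one abnormal and one normal result tie.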
10. The method of claim 8, wherein the method further comprises:
after determining the abnormal file and/or the abnormal command operated by the user according to the detection result, determining the target risk level of the abnormal behavior of the user according to the abnormal file and/or the abnormal command operated by the user;
outputting an alarm event of the target risk level.
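Claim 10 does not state how the target risk level is derived from the abnormal file and/or abnormal command. One possible approach, offered strictly as a hypothetical sketch, is to assign each file type and command type a risk level in a lookup table and report the highest level encountered. The level names, the table contents, the `"low"` default and the structure of the alarm event below are all invented for illustration.

```python
# Hypothetical ordering of risk levels, from lowest to highest.
RISK_ORDER = ["low", "medium", "high"]

# Hypothetical risk tables; the disclosure does not specify these values.
FILE_TYPE_RISK = {"log": "medium", "config": "high"}
COMMAND_TYPE_RISK = {"file_read": "low", "file_delete": "high"}


def target_risk_level(abnormal_file_types, abnormal_command_types):
    """Return the highest risk level among the abnormal items, or None."""
    levels = [FILE_TYPE_RISK.get(t, "low") for t in abnormal_file_types]
    levels += [COMMAND_TYPE_RISK.get(t, "low") for t in abnormal_command_types]
    if not levels:
        return None
    return max(levels, key=RISK_ORDER.index)


def alarm_event(user_group, abnormal_file_types, abnormal_command_types):
    """Build a simple alarm event record, or None if nothing is abnormal."""
    level = target_risk_level(abnormal_file_types, abnormal_command_types)
    if level is None:
        return None
    return {
        "user_group": user_group,
        "risk_level": level,
        "abnormal_file_types": list(abnormal_file_types),
        "abnormal_command_types": list(abnormal_command_types),
    }


event = alarm_event("ops", ["log"], ["file_delete"])
```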
11. A host user abnormal behavior detection apparatus, comprising:
the acquisition module is used for acquiring the process data of the host to be detected;
the extraction module is used for extracting user operation characteristics based on the process data to obtain a prediction data set, wherein the user operation characteristics comprise: user features, command features, and file features;
the extraction module is further configured to extract information of an associated file of the process data, determine a file type of the associated file according to a file type mapping table, extract user information of the associated file, determine a user group of the associated file according to a user group mapping table, and calculate a file path level of the associated file, a character length of the file path information, and a proportion of special characters in a file name, wherein:
the file type mapping table comprises the corresponding relation between the information of the associated file and the file type and the corresponding relation between the file path and the file type;
the user group mapping table comprises a corresponding relation between user information and a user group;
the file path level is the number of levels from the root directory to the folder in which the file is located;
the character length of the file path information is the total number of characters in the file path information;
the special characters are characters other than "/", the space character " ", and upper-case and lower-case letters;
the prediction module is used for inputting the prediction data set into a pre-trained user abnormal behavior detection model to obtain a detection result output by the user abnormal behavior detection model, wherein the user abnormal behavior detection model is obtained by inputting normalized sample user operation characteristics into a machine learning model for training, and the machine learning model is a neural network model or a support vector machine model.
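The claims state that the detection model is trained on normalized sample user operation characteristics but do not name a normalization scheme. The sketch below assumes min-max scaling to the range [0, 1], with the bounds learned from the training samples and reused at prediction time; the function names `fit_min_max` and `apply_min_max` and the sample feature rows are hypothetical.

```python
def fit_min_max(rows):
    """Learn per-feature (min, max) bounds from training feature rows."""
    columns = list(zip(*rows))
    return [(min(column), max(column)) for column in columns]


def apply_min_max(row, bounds):
    """Scale one feature row with previously learned bounds.

    A constant feature (max == min) is mapped to 0.0 to avoid division by
    zero. Values outside the training range fall outside [0, 1].
    """
    scaled = []
    for value, (low, high) in zip(row, bounds):
        scaled.append(0.0 if high == low else (value - low) / (high - low))
    return scaled


# Hypothetical training rows: path level, path length, special-char ratio.
training_rows = [
    [2, 17, 0.125],
    [1, 11, 0.0],
    [3, 23, 0.25],
]
bounds = fit_min_max(training_rows)
normalized = apply_min_max([2, 17, 0.125], bounds)
```

The normalized rows would then be passed to whichever model is used, a neural network or a support vector machine according to the claims. Categorical features such as user group or file type would first need a numeric encoding, which this sketch does not cover.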
12. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the host user abnormal behavior detection method of any one of claims 1 to 10 via execution of the executable instructions.
13. A computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the host user abnormal behavior detection method of any one of claims 1 to 10.
CN202210357958.0A 2022-04-06 2022-04-06 Method and device for detecting abnormal behavior of host user, storage medium and electronic equipment Active CN114896588B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210357958.0A CN114896588B (en) 2022-04-06 2022-04-06 Method and device for detecting abnormal behavior of host user, storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN114896588A (en) 2022-08-12
CN114896588B (en) 2024-02-23

Family

ID=82715140

Country Status (1)

Country Link
CN (1) CN114896588B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111277606A (en) * 2020-02-10 2020-06-12 北京邮电大学 Detection model training method, detection method and device, and storage medium
CN111327632A (en) * 2020-03-06 2020-06-23 深信服科技股份有限公司 Zombie host detection method, system, equipment and storage medium
CN111651767A (en) * 2020-06-05 2020-09-11 腾讯科技(深圳)有限公司 Abnormal behavior detection method, device, equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160203316A1 (en) * 2015-01-14 2016-07-14 Microsoft Technology Licensing, Llc Activity model for detecting suspicious user activity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
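Abnormal feature extraction in a big data environment based on user portraits; 郭娜 (Guo Na); 魏荣凯 (Wei Rongkai); 沈焱萍 (Shen Yanping); 计算机仿真 (Computer Simulation) (08); pp. 338-342 *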

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant