CN111651767A - Abnormal behavior detection method, device, equipment and storage medium - Google Patents

Abnormal behavior detection method, device, equipment and storage medium

Info

Publication number
CN111651767A
Authority
CN
China
Prior art keywords
detected
sequence
abnormal
feature
process operation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010507008.2A
Other languages
Chinese (zh)
Inventor
周菲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010507008.2A
Publication of CN111651767A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/56Computer malware detection or handling, e.g. anti-virus arrangements
    • G06F21/566Dynamic detection, i.e. detection performed at run-time, e.g. emulation, suspicious activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Virology (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to an abnormal behavior detection method, apparatus, device and storage medium. The method comprises: acquiring a process operation behavior sequence and dividing it into a plurality of sequences to be detected, wherein each sequence to be detected comprises a plurality of user process operation behaviors and each user process operation behavior comprises a plurality of pieces of field information; for each sequence to be detected, extracting features based on the attribute values of the plurality of pieces of field information of each process operation behavior in the sequence to obtain a feature vector corresponding to the sequence; performing anomaly detection on the feature vectors corresponding to the sequences to be detected, and determining the feature vectors detected as anomalous to be abnormal feature vectors; determining the abnormality degree of each dimension feature in each abnormal feature vector; and determining the abnormal feature items in the abnormal feature vector based on the abnormality degree of each dimension feature. The method can improve the accuracy and efficiency of detecting abnormal user behavior.

Description

Abnormal behavior detection method, device, equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for detecting abnormal behavior.
Background
With the continuous development of computer technology, providing services to users through computer technology has become convenient and fast, but these services can also be exploited by illicit industries, which adversely affect normal users and computer technology service platforms. To reduce this impact, it is necessary to analyze user behavior data and supervise user activities according to the analysis results, so that abnormal users are discovered in time.
In the prior art, user behavior is generally detected by matching a single user behavior against preset matching rules. These rules are simple and derived from limited experience; matching generally relies on strong rules, and detection based on weak rules is difficult. As a result, the accuracy and efficiency of detecting abnormal user behavior are low, and an effective abnormal behavior detection method is needed.
Disclosure of Invention
The technical problem to be solved by the present application is to provide a method, an apparatus, a device and a storage medium for detecting abnormal behavior, which can improve accuracy and efficiency of detecting abnormal behavior of a user.
In order to solve the above technical problem, in one aspect, the present application provides an abnormal behavior detection method, including:
acquiring a process operation behavior sequence, dividing the process operation behavior sequence into a plurality of sequences to be detected, wherein each sequence to be detected comprises a plurality of user process operation behaviors, and each user process operation behavior comprises a plurality of pieces of field information;
for each sequence to be detected, extracting features based on the attribute values of the plurality of pieces of field information of each process operation behavior in the sequence to be detected to obtain a feature vector corresponding to the sequence to be detected;
carrying out anomaly detection on the feature vectors corresponding to the sequences to be detected, and determining the feature vectors detected as anomalies as anomaly feature vectors;
determining the abnormal degree of each dimension feature in the abnormal feature vector;
and determining the abnormal feature items in the abnormal feature vector based on the abnormality degree of each dimension feature in the abnormal feature vector.
In another aspect, the present application provides an abnormal behavior detection apparatus, including:
the behavior sequence dividing module is used for acquiring a process operation behavior sequence and dividing the process operation behavior sequence into a plurality of sequences to be detected, wherein each sequence to be detected comprises a plurality of user process operation behaviors, and each user process operation behavior comprises a plurality of pieces of field information;
the feature extraction module is used for extracting, for each sequence to be detected, features based on the attribute values of the plurality of pieces of field information of each process operation behavior in the sequence to be detected to obtain a feature vector corresponding to the sequence to be detected;
the anomaly detection module is used for carrying out anomaly detection on the characteristic vectors corresponding to the sequences to be detected and determining the characteristic vectors detected as anomalies as anomaly characteristic vectors;
the abnormal degree determining module is used for determining the abnormal degree of each dimension characteristic in the abnormal characteristic vector;
and the abnormal feature item determining module is used for determining the feature items with the abnormality in the abnormal feature vector based on the abnormality degree of each dimension feature in the abnormal feature vector.
In another aspect, the present application provides an abnormal behavior detection device, which includes a processor and a memory, where the memory stores at least one instruction or at least one program, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the abnormal behavior detection method as described above.
In another aspect, the present application provides a computer storage medium, in which at least one instruction or at least one program is stored, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the abnormal behavior detection method as described above.
The embodiment of the application has the following beneficial effects:
The obtained process operation behavior sequence is divided into a plurality of sequences to be detected, which serializes the user behaviors. Features are extracted based on the attribute values of the field information of each process operation behavior in each sequence to be detected, giving a feature vector for that sequence, so the extracted feature vector is tied to the field information of the process operation behaviors. Anomaly detection is then performed on the feature vectors corresponding to the sequences to be detected to obtain the abnormal feature vectors. By calculating the abnormality degree of each dimension in each abnormal feature vector, the abnormal feature items in that vector can be determined, so that technical personnel can conveniently analyze the abnormal feature items and determine why the feature vector was detected as abnormal. The method and the device integrate multi-dimensional feature data and automatically detect the abnormal feature vectors corresponding to abnormal behavior sequences based on that data, thereby improving the accuracy and efficiency of detecting abnormal user behavior.
Drawings
In order to more clearly illustrate the technical solutions and advantages of the embodiments of the present application or the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
FIG. 2 is a flowchart of an abnormal behavior detection method provided in an embodiment of the present application;
FIG. 3 is a flowchart of a method for generating a sequence of process operation behaviors according to an embodiment of the present application;
FIG. 4 is a flowchart of a feature extraction method provided in an embodiment of the present application;
FIG. 5 is a flowchart of a feature rule matching method provided in an embodiment of the present application;
FIG. 6 is a flowchart of an abnormality degree calculation method according to an embodiment of the present application;
FIG. 7 is a flowchart of a suspicious behavior sequence detection method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of an abnormal behavior detection apparatus according to an embodiment of the present application;
FIG. 9 is a schematic structural diagram of a device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, the present application will be further described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, a schematic diagram of an implementation environment provided by an embodiment of the present application is shown, where the implementation environment may include: at least a first terminal 110 and a second terminal 120, said first terminal 110 and said second terminal 120 being capable of data communication over a network.
Specifically, the user operates the first terminal 110, and the first terminal 110 generates corresponding user log data according to the user operation behavior and stores the user log data in the first terminal 110. The second terminal 120 acquires the user log data from the first terminal 110, performs user behavior detection on the corresponding user based on the user log data, and outputs a detection result. Alternatively, the first terminal 110 may actively send the user log data to the second terminal 120 at a preset frequency, so that the second terminal 120 performs user behavior detection on the corresponding user based on the user log data.
The first terminal 110 may communicate with the second terminal 120 based on a Browser/Server mode (B/S) or a Client/Server mode (C/S). The first terminal 110 may include a physical device, and may also include software running on the physical device, such as an application program. The operating system running on the first terminal 110 in this embodiment of the present application may include, but is not limited to, Android, iOS, Linux, Windows, and the like.
The second terminal 120 and the first terminal 110 may establish a communication connection through a wired or wireless connection, and the second terminal 120 may include an independently operating server, or a distributed server, or a server cluster composed of multiple servers, where the server may be a cloud server.
In order to solve the problems of low detection accuracy and low efficiency in the process of detecting the abnormal behavior of the user in the prior art, an embodiment of the present application provides an abnormal behavior detection method, where an execution subject may be a second terminal in fig. 1, and specifically, please refer to fig. 2, the method may include:
S210, acquiring a process operation behavior sequence, and dividing the process operation behavior sequence into a plurality of sequences to be detected, wherein each sequence to be detected comprises a plurality of user process operation behaviors, and each user process operation behavior comprises a plurality of pieces of field information.
The user process operation behavior in the embodiment of the application refers to behavior data of a series of operations performed on processes by a user, including the user reading and writing a process, calling a process directly or indirectly, calling a process through a command line or a program, and the like. The process operation behavior sequence can be obtained by processing the historical operation record of the user, which may specifically be an original log of user process operations. User process operation behaviors are used to detect whether a user is abnormal because, when a host is infected with malware such as Trojans, cryptocurrency miners or ransomware, a series of process operations is triggered under automatic or manual control; by detecting whether such a series of process operations is abnormal, it can be determined whether the user who generated them is an abnormal user.
Specifically, in the embodiment of the present application, a user process operation behavior may be serialized first, and then detection is performed based on a process operation behavior sequence, please refer to fig. 3, which shows a method for generating a process operation behavior sequence, where the method may include:
S310, obtaining a user process operation original log, wherein the user process operation original log comprises a plurality of user process operation behaviors.
S320, sequencing the plurality of user process operation behaviors according to a time sequence to generate the process operation behavior sequence.
The multiple user process operation behaviors are sequenced according to the time sequence, so that the generated process operation behavior sequence has a context sequence relation, and the behavior detection can be performed on the basis of the context sequence in the subsequent detection.
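Steps S310 and S320 can be illustrated with a minimal Python sketch. The `ProcessOp` record and its field names are hypothetical and chosen only for illustration; the source does not prescribe a log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessOp:
    # Hypothetical record layout; the field names are illustrative only.
    timestamp: float   # process operation time, in seconds
    parent_name: str   # parent process name
    child_name: str    # child process name


def build_behavior_sequence(raw_log):
    """Order raw log records chronologically (step S320)."""
    # sorted() is stable, so records with equal timestamps keep their log order.
    return sorted(raw_log, key=lambda op: op.timestamp)


# Step S310: a raw log whose records arrive out of order.
raw_log = [
    ProcessOp(30.0, "explorer.exe", "cmd.exe"),
    ProcessOp(10.0, "services.exe", "svchost.exe"),
    ProcessOp(20.0, "cmd.exe", "powershell.exe"),
]
sequence = build_behavior_sequence(raw_log)
print([op.child_name for op in sequence])
```

Sorting once up front gives every later step the context order that the text relies on.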
Because the generated process operation behavior sequence includes a large number of process operation behaviors, in order to facilitate detection, in the embodiment of the present application the process operation behavior sequence may be divided into a plurality of sequences to be detected, and each sequence to be detected is detected separately. The sequences to be detected may be of equal length or of unequal length.
Specifically, when the sequences to be detected are of equal length, the dividing the process operation behavior sequence into a plurality of sequences to be detected includes: dividing the process operation behavior sequence by adopting a sliding window with a preset length to obtain a plurality of sequences to be detected with equal length;
when the sequences to be detected are of unequal length, the dividing the process operation behavior sequence into a plurality of sequences to be detected comprises: dividing the process operation behavior sequence into a plurality of sequences to be detected with unequal length based on a preset sequence division rule. The preset sequence division rule may divide the sequence by time period, that is, the user process operation behaviors in the same time period are placed in one sequence to be detected, so the number of user process operation behaviors in the sequences corresponding to different time periods may differ.
Each user process operation behavior may include a plurality of pieces of field information, which may specifically include the process operation time, the names or process identifiers of the parent and child processes, the command line content of the parent and child processes, and whether the parent or child process has been judged to be a black file or a gray file. This field information characterizes the process operation behavior, and features of the process operation behavior can subsequently be extracted from it.
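The two division strategies can be sketched as follows. This is an illustrative sketch under stated assumptions: the `step` parameter, the dropping of a trailing partial window, and the `(timestamp, payload)` tuple layout are not specified in the source.

```python
from itertools import groupby


def split_fixed(sequence, window, step=None):
    """Equal-length division with a sliding window of `window` items.

    `step` is an assumption: the source fixes only the window length.
    step == window (the default) yields non-overlapping windows.
    A trailing partial window is dropped so all results have equal length.
    """
    step = window if step is None else step
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [sequence[i:i + window]
            for i in range(0, len(sequence) - window + 1, step)]


def split_by_period(events, period):
    """Unequal-length division: events in the same time period share a sequence.

    `events` is a list of (timestamp, payload) tuples already sorted by time;
    `period` is the period length in the same unit as the timestamps.
    """
    return [list(group)
            for _, group in groupby(events, key=lambda e: int(e[0] // period))]


events = [(0, "a"), (5, "b"), (61, "c"), (62, "d"), (63, "e"), (130, "f")]
fixed = split_fixed(events, window=2)
by_minute = split_by_period(events, period=60)
print([len(s) for s in fixed], [len(s) for s in by_minute])
```

With a 60-second period the three resulting sequences hold 2, 3 and 1 behaviors, which shows why period-based division produces sequences of unequal length.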
S220, for each sequence to be detected, feature extraction is carried out on the basis of attribute values of the plurality of field information of each process operation behavior in the sequence to be detected, and a feature vector corresponding to the sequence to be detected is obtained.
Each sequence to be detected includes a plurality of user process operation behaviors, and in order to determine the features of the corresponding sequence to be detected, feature extraction may be performed on field information of each process operation behavior, so as to obtain a feature vector corresponding to the sequence to be detected, specifically, referring to fig. 4, which shows a feature extraction method, where the method may include:
S410, obtaining an initial feature vector, wherein the number of dimensions of the initial feature vector is equal to the number of preset feature items, and each dimension of the initial feature vector corresponds to one preset feature item.
The initial feature vector in the embodiment of the present application may specifically be a multidimensional vector, and the dimension of the vector is equal to the preset number of feature items, and specifically, the value of each dimension of the multidimensional vector may be set to 0 during initialization.
S420, traversing each process operation behavior in the sequence to be detected.
Because each sequence to be detected contains a plurality of process operation behaviors, each process operation behavior needs to be traversed in sequence, and specifically, the following operations need to be executed on each process operation behavior:
S430, for the current process operation behavior, matching the attribute values of the field information in the current process operation behavior with preset feature rules respectively; the preset feature rules comprise a feature rule corresponding to each preset feature item.
As described above, the field information of each process operation behavior may include the names or identifiers of the parent and child processes, the command line content of the parent and child processes, whether the parent or child process has been judged to be a black file or a gray file, and the like. The attribute values corresponding to the field information are matched with the preset feature rules, where the number of feature rules included in the preset feature rules is equal to the number of preset feature items. When the attribute value of a piece of field information matches a feature rule, the current process operation behavior has the preset feature corresponding to that feature rule.
For example, for process operation behavior A, the parent process name is a1, the child process name is a2, the parent process command line content is b1, the child process command line content is b2, and neither the parent process nor the child process is a black file or a gray file. Whether a process is a black file or a gray file can be detected and identified through a sandbox.
Correspondingly, these attribute values can be matched with the preset feature rules, and the embodiment of the application provides the following feature rules:
(1) The parent process name is uncommon and suspicious
(2) The child process name is uncommon and suspicious
(3) The parent process name is uncommon, consists entirely of digits, and has more than 10 digits
(4) The child process name is uncommon, consists entirely of digits, and has more than 10 digits
(5) The parent process name includes only capital letters, digits and preset symbols
(6) The child process name includes only capital letters, digits and preset symbols
(7) The first character of the parent process name is a special character (other than alphanumeric characters), or the extension contains a special character, or the total number of special characters exceeds 5
(8) The first character of the child process name is a special character, or the extension contains a special character, or the total number of special characters exceeds 5
(9) Both the parent and child process names satisfy any one of the above conditions (both names being abnormal is rare and suspicious)
(10) The parent process md5 is black (in most cases black md5 values are rare, hence suspicious)
(11) The child process md5 is black
(12) The parent and child process md5 values are both gray
(13) Of the parent and child process md5 values, one is black and one is gray
(14) The parent process name is suspicious and its md5 is gray (in most cases a gray md5 and a suspicious name do not appear together)
(15) The child process name is suspicious and its md5 is gray
(16) The parent process cmd contains suspicious strings (in most cases cmd does not contain suspicious strings; suspicious strings here refer to hundreds of suspicious cmd commands summarized by professional security operators, expressed as regular expressions such as 'cmstp.exe. inf', 'hh.exe. chm', etc.)
(17) The child process cmd contains suspicious strings
(18) The parent and child process cmd have a specific relationship (normally no specific relationship exists; a specific relationship here refers to a context command line relationship of suspicious processes summarized by professional security operators)
(19) The parent process md5 is black and has a large breadth (in most cases a black md5 has a small breadth, where breadth is the access amount of the process)
(20) The child process md5 is black and has a large breadth
(21) The parent process md5 is gray and has a large breadth (in most cases a gray md5 has a small breadth)
(22) The child process md5 is gray and has a large breadth
By matching the attribute values of the field information of process operation behavior A against the feature rules in turn, it can be determined which feature rules behavior A matches. For example, after matching, process operation behavior A is determined to match (1), (2), (16) and (17) of the feature rules, which gives the feature rule matching result of process operation behavior A. The feature rule matching result of every other process operation behavior in the current sequence to be detected can be obtained in the same way.
Each feature rule corresponds to each preset feature item, and each preset feature item corresponds to each dimension in the feature vector.
The feature rules provided above are only exemplary, and other feature rules may be provided, and the embodiments of the present application are not particularly limited.
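A few of the name-based rules can be sketched as Python predicates. This is only an illustration under loud assumptions: the "uncommon" condition is omitted because it needs prevalence data the source does not describe, `PRESET_SYMBOLS` is a made-up placeholder because the source does not list the preset symbols, and the dictionary keys are hypothetical field names.

```python
import re

PRESET_SYMBOLS = "_-"  # assumption: the source does not list the preset symbols


def stem(name):
    """File name without its extension (illustrative helper)."""
    return name.rsplit(".", 1)[0] if "." in name else name


def all_digits_over_10(name):
    # Rules (3)/(4) without the "uncommon" condition: more than 10 ASCII digits.
    return re.fullmatch(r"[0-9]{11,}", stem(name)) is not None


def only_caps_digits_symbols(name):
    # Rules (5)/(6): only capital letters, digits and the assumed preset symbols.
    pattern = r"[A-Z0-9" + re.escape(PRESET_SYMBOLS) + r"]+"
    return re.fullmatch(pattern, stem(name)) is not None


def starts_with_special(name):
    # First clause of rules (7)/(8): the first character is not a letter or digit.
    return bool(name) and not name[0].isalnum()


# Rule number -> predicate over one process operation behavior (a dict here).
RULES = {
    3: lambda op: all_digits_over_10(op["parent_name"]),
    4: lambda op: all_digits_over_10(op["child_name"]),
    5: lambda op: only_caps_digits_symbols(op["parent_name"]),
    6: lambda op: only_caps_digits_symbols(op["child_name"]),
    7: lambda op: starts_with_special(op["parent_name"]),
    8: lambda op: starts_with_special(op["child_name"]),
}


def match_rules(op):
    """Return the sorted numbers of all rules the behavior matches."""
    return sorted(no for no, rule in RULES.items() if rule(op))


op = {"parent_name": "12345678901.exe", "child_name": "QX7_A.EXE"}
print(match_rules(op))
```

Keeping each rule as an independent predicate keyed by its rule number makes the mapping from rule to feature item, and hence to vector dimension, explicit.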
S440, updating the current initial feature vector based on the matching result of the attribute value of each piece of field information in the current process operation behavior and the preset feature rules.
In a specific implementation process, matching between an attribute value and a feature rule may be performed in units of a process operation behavior, specifically referring to fig. 5, which shows a feature rule matching method, where the method may include:
S510, traversing the attribute values of each piece of field information in the current process operation behavior.
S520, if the attribute value of the current field information is matched with at least one feature rule in the preset feature rules, determining the preset feature item corresponding to the feature rule which is successfully matched as a target feature item.
S530, accumulating the dimension values corresponding to the target feature items.
Taking process operation behavior A as an example again: its parent process name is a1, which matches feature rule (1), so feature item 1, corresponding to feature rule (1), is determined to be a target feature item, and the value of the dimension corresponding to feature item 1 is increased by one. The current value of that dimension may or may not be 0; one is simply added to whatever the current value is.
S540, after traversing the attribute values of the field information in the current process operation behavior, updating the current initial feature vector based on the accumulation result of the dimension values corresponding to the target feature items.
Therefore, the current initial feature vector can be updated once each time a process operation behavior is traversed. Specifically, each time the attribute value of a piece of field information is traversed, the matching target feature items are determined and the values of their corresponding dimensions are accumulated; after all attribute values in the current process operation behavior have been traversed, each feature item's dimension holds its accumulated value. The current initial feature vector is updated with these values to obtain an updated initial feature vector, which then serves as the initial feature vector when the next process operation behavior is processed. In other words, processing of the next process operation behavior builds on the vector obtained after processing the current one.
For example, before any process operation behavior is traversed, the matching success count corresponding to each feature item is 0, and the value of each dimension in the corresponding initial feature vector is also 0, as shown in Table 1:
TABLE 1
[Table image not reproduced: matching success count for each feature item; all counts are 0]
Feature item numbers omitted by ellipsis are assumed to have a matching success count of 0. After the traversal of process operation behavior A is completed, it is determined that A matches (1), (2), (16) and (17) of the feature rules, which gives Table 2:
TABLE 2
[Table image not reproduced: matching success counts after process operation behavior A; feature items 1, 2, 16 and 17 have count 1, all others 0]
Therefore, when a feature rule is successfully matched with the process operation behavior, the matching success count of the feature item corresponding to that feature rule is increased by 1.
When the update rule is that the initial feature vector is updated once each time a process operation behavior is traversed, the initial feature vector is updated from the original (0, 0 … 0, 0) to (1, 1 … 1, 1 … 0) after process operation behavior A is traversed. After process operation behavior B is traversed, the matches found for B are added to Table 2. For example, if process operation behavior B is determined to match (2), (16) and (22) of the feature rules, accumulating onto Table 2 gives Table 3:
TABLE 3
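[Table image not reproduced: matching success counts after process operation behaviors A and B; feature items 1, 17 and 22 have count 1, feature items 2 and 16 have count 2, all others 0]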
Thus, the current initial feature vector is updated from (1, 1 … 1, 1 … 0) to (1, 2 … 2, 1 … 1).
S450, after traversing of the operation behaviors of each process in the sequence to be detected is completed, determining that the current initial feature vector is the feature vector corresponding to the sequence to be detected.
While the process operation behaviors are being traversed, the initial feature vector is updated continuously, once for each process operation behavior processed. When all process operation behaviors have been traversed, the resulting initial feature vector is the feature vector corresponding to the sequence to be detected.
Based on the above content of the embodiment of the present application, if the current sequence to be detected only includes two process operation behaviors, i.e., the process operation behavior a and the process operation behavior B, the obtained vector (1, 2 … 2, 1 … 1) is the feature vector corresponding to the current sequence to be detected.
Besides being updated during the traversal of the process operation behaviors, the current initial feature vector can alternatively be updated once, in a unified way, after every process operation behavior in the current sequence to be detected has been traversed. In that case, only the numbers of successfully matched feature rules are accumulated while each process operation behavior is traversed, and the initial feature vector is finally updated once based on the accumulated result, generating the feature vector corresponding to the current sequence to be detected. For example, assuming that the sequence to be detected includes only process operation behavior A and process operation behavior B, it follows from the above that traversing A and then B, or B and then A, yields the same matching result, shown in table 3. The initial feature vector can then be updated directly from the original (0, 0 … 0, 0) to (1, 2 … 2, 1 … 1), so that (1, 2 … 2, 1 … 1) is the feature vector corresponding to the current sequence to be detected.
In this embodiment of the application, the value of each dimension of the feature vector corresponding to a sequence to be detected is the count of a certain feature in that sequence. For example, the value of the first dimension may correspond to the first item of the matching rule, that is, the number of process operation behaviors in the sequence to be detected whose parent process name is unusual and suspicious. The value of the sixth dimension may correspond to the sixth item of the matching rule, that is, the number of process operation behaviors in the sequence to be detected whose child process name contains only capital letters, digits and preset symbols. The meaning of the values of the other dimensions follows by analogy.
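The accumulation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the three rules, the field names (`parent_name`, `child_name`, `cmdline`) and the sample behaviors are hypothetical stand-ins for the 22 feature items, and each rule is simplified to a predicate over a whole behavior rather than over a single field attribute value.

```python
from typing import Callable, Dict, List

Behavior = Dict[str, str]          # one process operation behavior: field name -> attribute value
Rule = Callable[[Behavior], bool]  # hypothetical feature rule: True when the behavior matches


def extract_feature_vector(sequence: List[Behavior], rules: List[Rule]) -> List[int]:
    """Count, for each feature item, how many behaviors in the sequence match its rule."""
    vector = [0] * len(rules)            # initial feature vector, all zeros
    for behavior in sequence:            # traverse each process operation behavior
        for i, rule in enumerate(rules):
            if rule(behavior):
                vector[i] += 1           # accumulate the dimension of the matched feature item
    return vector


# Three toy rules standing in for the preset feature items (assumptions for illustration).
rules: List[Rule] = [
    lambda b: b["parent_name"].isdigit(),            # parent process name is all digits
    lambda b: b["child_name"].isupper(),             # child process name has no lowercase letters
    lambda b: "powershell" in b["cmdline"].lower(),  # command line mentions powershell
]

behavior_a = {"parent_name": "12345", "child_name": "ABC.EXE", "cmdline": "abc.exe /q"}
behavior_b = {"parent_name": "explorer.exe", "child_name": "RUN.EXE", "cmdline": "PowerShell -enc ..."}

vec_ab = extract_feature_vector([behavior_a, behavior_b], rules)
vec_ba = extract_feature_vector([behavior_b, behavior_a], rules)
print(vec_ab, vec_ba)
```

Because the vector only stores counts, the traversal order does not change the result, which is why updating per behavior and updating once at the end are equivalent.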
S230, performing anomaly detection on the feature vectors corresponding to the sequences to be detected, and determining the feature vectors detected as abnormal to be abnormal feature vectors.
When the sequences to be detected are of equal length, anomaly detection is performed on the feature vectors with either an unsupervised detection algorithm or a supervised detection algorithm, and the feature vectors detected as abnormal are determined to be abnormal feature vectors.
When the sequences to be detected are of unequal length, anomaly detection is performed on the feature vectors with a supervised detection algorithm, and the feature vectors detected as abnormal are determined to be abnormal feature vectors.
In the specific implementation process, in order to eliminate the influence of different sequence lengths on the detection result of the unsupervised detection algorithm, the unsupervised detection algorithm can be applied to detection of equally long sequences to be detected, and the supervised detection algorithm can be applied to detection of equally long sequences to be detected and detection of non-equally long sequences to be detected.
S240, determining the degree of abnormality of each dimension feature in the abnormal feature vector.
The degree of abnormality of each dimension feature characterizes how abnormal that feature is. The anomaly detection algorithm above determines the abnormal feature vectors; to further determine which feature or features cause a feature vector to be abnormal, the embodiment of the present application provides a method for determining abnormal features by calculating the degree of abnormality. Specifically, refer to fig. 6, which shows a method for calculating the degree of abnormality. The method may include:
S610, performing magnitude conversion on the values corresponding to the dimension features in the abnormal feature vector to obtain a standardized value for each dimension feature in the abnormal feature vector.
S620, determining the standardized value corresponding to each dimension feature as the degree of abnormality of that dimension feature.
The degree of abnormality is calculated in order to convert data of different magnitudes into the same magnitude; measuring the features of all dimensions on the same scale ensures that the data are comparable.
S250, determining the abnormal feature items in the abnormal feature vector based on the degree of abnormality of each dimension feature in the abnormal feature vector.
A detected abnormal feature vector has one or more feature items whose degree of abnormality is greater than the normal value, and these feature items can serve as the explanation of why the corresponding feature vector is abnormal.
For example, if the degree of abnormality of the feature item corresponding to the third dimension of the abnormal feature vector v is calculated to be greater than the normal value, this indicates that v was determined to be an abnormal feature vector because of the feature "the number of process operation behaviors whose parent process name is unusual, consists entirely of digits, and has more than 10 digits".
In the present application, the obtained process operation behavior sequence is divided into a plurality of sequences to be detected, which serializes the user behaviors. Features are extracted based on the attribute values of the field information of each process operation behavior in each sequence to be detected, giving a feature vector for that sequence, so that the extracted feature vector is tied to the field information of the process operation behaviors. Anomaly detection is then performed on the feature vectors corresponding to the sequences to be detected to obtain abnormal feature vectors. By calculating the degree of abnormality of each dimension in each abnormal feature vector, the abnormal feature items in the vector can be determined, which makes it convenient for technical personnel to analyze them and identify why the feature vector was determined to be abnormal. The present application can integrate multi-dimensional feature data and automatically detect the abnormal feature vectors corresponding to abnormal behavior sequences based on that data, which improves the accuracy and efficiency of detecting abnormal user behaviors.
In a specific implementation, the unsupervised detection algorithm may be the isolation forest algorithm. Isolation forest modeling rests on two premises: abnormal data account for only a small fraction of the data, and the feature values of abnormal data differ greatly from normal values. The 22 extracted features all satisfy the precondition that suspicious data occur rarely, so the isolation forest algorithm fits this scenario. The basic principle of the algorithm is as follows: m pieces of data are drawn at random as training samples; a feature is selected at random, a split value is selected at random within the range of that feature's values, and the samples are divided by it; this is repeated until the samples can no longer be divided or the tree reaches a limit height (the height of the tree may be limited). Prediction can be carried out once construction is complete. During prediction, a test record travels down each tree according to the split conditions until it reaches a leaf, and its path length, that is, the number of edges traversed, is recorded. Because outliers are generally rare, they are quickly divided into leaf nodes, so their path length is significantly shorter. Because of the random selection, the results of multiple trees need to be combined to obtain the final prediction. Before prediction, a model is trained with the isolation forest algorithm; the model comprises multiple trained trees, and each node of each tree represents a splitting feature. The model ultimately used for anomaly detection is the fusion of these trees.
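The tree construction and path-length scoring described above can be illustrated with a small, dependency-free Python sketch. It is a simplified illustration, not the patent's implementation: the function names, the two-dimensional toy data, and the parameters (50 trees, sample size 64) are assumptions, and the score normalization 2^(-E[h]/c(m)) is taken from the standard isolation forest formulation rather than from the patent text.

```python
import math
import random
from typing import List, Optional

Point = List[float]


class Node:
    """Tree node; a leaf has feature=None and only records how many samples reached it."""
    def __init__(self, size: int, feature: Optional[int] = None, split: float = 0.0,
                 left: Optional["Node"] = None, right: Optional["Node"] = None) -> None:
        self.size, self.feature, self.split = size, feature, split
        self.left, self.right = left, right


def avg_path(n: int) -> float:
    """Expected path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(data: List[Point], depth: int, max_depth: int, rng: random.Random) -> Node:
    if depth >= max_depth or len(data) <= 1:
        return Node(len(data))
    feature = rng.randrange(len(data[0]))      # select one feature at random
    lo = min(p[feature] for p in data)
    hi = max(p[feature] for p in data)
    if lo == hi:                               # cannot split on this feature: stop (simplification)
        return Node(len(data))
    split = rng.uniform(lo, hi)                # select a split value within the feature's range
    left = [p for p in data if p[feature] < split]
    right = [p for p in data if p[feature] >= split]
    return Node(len(data), feature, split,
                build_tree(left, depth + 1, max_depth, rng),
                build_tree(right, depth + 1, max_depth, rng))


def path_length(point: Point, node: Node, depth: int = 0) -> float:
    """Edges traversed to reach a leaf, plus an adjustment for samples left unsplit there."""
    if node.feature is None:
        return depth + avg_path(node.size)
    child = node.left if point[node.feature] < node.split else node.right
    return path_length(point, child, depth + 1)


def build_forest(data: List[Point], n_trees: int, sample_size: int,
                 rng: random.Random) -> List[Node]:
    m = min(sample_size, len(data))            # m randomly drawn training samples per tree
    max_depth = math.ceil(math.log2(m))        # height limit of each tree (assumes m >= 2)
    return [build_tree(rng.sample(data, m), 0, max_depth, rng) for _ in range(n_trees)]


def anomaly_score(point: Point, forest: List[Node], sample_size: int) -> float:
    """Score in (0, 1]; shorter average paths give scores closer to 1."""
    mean_len = sum(path_length(point, tree) for tree in forest) / len(forest)
    return 2.0 ** (-mean_len / avg_path(sample_size))


rng = random.Random(0)
normal = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
outlier = [10.0, 10.0]
forest = build_forest(normal + [outlier], n_trees=50, sample_size=64, rng=rng)
outlier_score = anomaly_score(outlier, forest, 64)
inlier_score = anomaly_score([0.0, 0.0], forest, 64)
print(outlier_score > inlier_score)
```

A production system would more likely rely on an existing library implementation; the sketch only makes the link between "isolated quickly" and "short path length" concrete.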
The degree of abnormality of each dimension feature in the embodiment of the application can be calculated by the z-score method. The z-score, also called the standard score, is obtained by dividing the difference between a value and the mean by the standard deviation. The specific formula is as follows:

$$z = \frac{X - \bar{X}}{s} \tag{1}$$

where $X$ is the original data, $\bar{X}$ is the mean, and $s$ is the standard deviation. The z-score of the value of each dimension in the abnormal feature vector can be calculated through formula (1), and the z-score corresponding to each dimension is compared with a preset value. When the z-score corresponding to a certain dimension is larger than the preset value, the feature corresponding to that dimension can be regarded as a reason for the abnormality of the current feature vector.
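As a worked illustration of formula (1), the following Python sketch computes per-dimension z-scores for one feature vector and lists the dimensions that exceed a preset value. The choice of the population standard deviation (`statistics.pstdev`), the use of all feature vectors as the reference population, the threshold of 1.5, and the sample numbers are assumptions made for the example; the patent does not specify them.

```python
import statistics
from typing import List


def z_scores(vector: List[float], population: List[List[float]]) -> List[float]:
    """z-score of each dimension of `vector` relative to the same dimension across `population`."""
    scores = []
    for dim, value in enumerate(vector):
        column = [v[dim] for v in population]
        mean = statistics.fmean(column)
        std = statistics.pstdev(column)      # population standard deviation (an assumption)
        scores.append(0.0 if std == 0 else (value - mean) / std)
    return scores


def abnormal_items(scores: List[float], preset_value: float) -> List[int]:
    """Indices of the dimensions whose z-score exceeds the preset value."""
    return [i for i, z in enumerate(scores) if z > preset_value]


# Toy feature vectors with three dimensions; the last one was flagged as abnormal.
population = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0], [5, 1, 0]]
scores = z_scores(population[-1], population)
items = abnormal_items(scores, preset_value=1.5)
print(scores, items)
```

Here the first dimension has mean 1 and standard deviation 2, so the value 5 has a z-score of 2; the other two dimensions are constant and contribute nothing to the explanation.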
Referring to fig. 7, a suspicious behavior sequence detection method is shown, which can be executed by a background server, and specifically includes the following steps:
S710, acquiring the original log of user process operations in a preset time period to be detected.
S720, grouping the original log data by user and sorting it by time.
S730, dividing the complete sequence of each user into original sequences to be detected.
A log of user process operation data is obtained for a preset time period, for example 1 hour. The log includes operation data such as processes read or written by the user, processes called directly or indirectly, and processes called through a command line or a program, and each user has a unique identifier guid. To detect the suspicious behavior sequences of each user, the data to be detected must be organized per user, and each behavior sequence must be sorted by time.
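Grouping by user and sorting by time can be sketched as follows. The record layout `(guid, timestamp, operation)` and the sample values are hypothetical; the patent only states that each user has a unique identifier guid and that behaviors are ordered by time.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

LogRecord = Tuple[str, int, str]  # (guid, timestamp, operation): a hypothetical record layout


def group_and_sort(records: List[LogRecord]) -> Dict[str, List[LogRecord]]:
    """Group raw log records by user guid, then sort each user's records by timestamp."""
    by_user: Dict[str, List[LogRecord]] = defaultdict(list)
    for record in records:
        by_user[record[0]].append(record)
    return {guid: sorted(recs, key=lambda r: r[1]) for guid, recs in by_user.items()}


raw_log = [
    ("user-1", 30, "cmd.exe -> ping.exe"),
    ("user-2", 10, "explorer.exe -> notepad.exe"),
    ("user-1", 20, "explorer.exe -> cmd.exe"),
]
sequences = group_and_sort(raw_log)
print(sequences["user-1"])
```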
Because the behavior sequences of different users inevitably differ in length, the behavior sequence of each user can be segmented with a sliding window to obtain original sequences to be detected of equal length. The sliding window can be set according to the actual situation; for example, a sliding window of 10 means that the next sequence to be detected is generated every 10 process operation records. The original sequences to be detected are made equal in length because, when an unsupervised anomaly detection algorithm is used, the influence of the sequence length must be eliminated. Assuming the length is set to 20, sequences shorter than 20 are automatically discarded. However, equal length is not mandatory: if a supervised algorithm is used, the detection result is in theory not affected by the length of the original sequence.
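One way to read the segmentation above is a window of length 20 that advances 10 records at a time, with any shorter tail discarded. The sketch below follows that reading; treating the "sliding window of 10" as the step and 20 as the window length is an interpretation of the text, and the function name is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_windows(behaviors: Sequence[T], length: int = 20, step: int = 10) -> List[Sequence[T]]:
    """Cut one user's time-ordered behaviors into equal-length windows.

    A new window starts every `step` records; windows shorter than `length` are dropped.
    """
    return [behaviors[i:i + length] for i in range(0, len(behaviors) - length + 1, step)]


windows = split_into_windows(list(range(45)))   # 45 behaviors, identified here by their index
too_short = split_into_windows(list(range(15)))
print(len(windows), [w[0] for w in windows], too_short)
```

With 45 behaviors, windows start at records 0, 10 and 20; the remaining 15 records from position 30 cannot fill a window of 20 and are dropped.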
S740, performing feature extraction on the sequences to be detected to generate feature vectors.
S750, detecting the extracted feature vectors with an anomaly detection algorithm to obtain abnormal feature vectors.
Taking the generation of equal-length original sequences to be detected as an example, and assuming each sequence has length 20, features are extracted from each group of 20 process operation behaviors. Each of the 20 process operation behaviors represents one user operation on a process, and mainly includes fields such as the names of the parent and child processes, whether the parent and child processes are judged to be black files or gray files, and the command-line content of the parent and child processes. Feature extraction therefore requires designing features for each group of 20 behaviors. Because the anomaly detection algorithm in effect detects feature data that occur rarely, its precondition is that only a small portion of the data is abnormal; the adopted features must likewise satisfy the precondition that suspicious values are rare. Accordingly, the same 22 features described above can be used, which are not described again here.
The abnormal behavior sequences are detected by an algorithm. When the anomaly detection algorithm is unsupervised, its principle requires that the input features satisfy the condition that suspicious data occur rarely, which limits the features that can be selected. If a batch of labeled data is available, a supervised algorithm such as a classification algorithm can be used instead, so that more features can be introduced and more unknown rules can be mined.
S760, calculating the z-score of each feature in the abnormal feature vector, and using the z-scores to explain why the corresponding sequence is abnormal.
The abnormal feature vectors of each hour can be detected by the isolation forest algorithm, but it is not known which abnormal feature or features caused a vector to be detected as abnormal. All vectors can therefore be converted with the z-score standardization method. The main purpose of the z-score is to convert data of different magnitudes into the same magnitude, so that the data are measured uniformly by the calculated z-score values and remain comparable. A detected abnormal vector necessarily has one or more features whose z-score is larger than normal, and these features can serve as the explanation of why the vector is abnormal.
S770, screening out abnormal users, that is, users for whom the number of abnormal feature vectors detected in the preset time period is larger than a threshold.
S780, extracting the original sequences to be detected corresponding to all abnormal feature vectors of the abnormal users as analysis data.
S790, summarizing the abnormal features of the abnormal users as analysis data.
To ensure the accuracy of the detected abnormal sequences, the embodiment of the application screens users by their number of abnormal feature vectors: a user with more than n abnormal feature vectors per hour is selected as an abnormal user, where n can be set according to the actual situation, for example 5, and all abnormal feature vectors corresponding to the abnormal users are then summarized. An abnormal feature vector is obtained by extracting features from an original sequence to be detected. Although the corresponding z-scores explain the reason for the abnormality, security operators still need to inspect the original sequence to be detected during analysis. In the program, therefore, a unique identifier is generated for each original sequence to be detected and its feature vector, linking each abnormal feature vector to its original sequence, so that all original sequences to be detected corresponding to the abnormal feature vectors of an abnormal user can be extracted as analysis data. Because an abnormal user corresponds to multiple abnormal feature vectors, the abnormal z-score values of these feature vectors are summarized and extracted to obtain all abnormal features of the abnormal user for analysis and mining.
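The screening and linking steps above can be sketched as follows. The `Detection` record, its field names, and the sample data are hypothetical; the sketch only assumes, as the text states, that each original sequence and its feature vector share a unique identifier and that a user is abnormal when the count of abnormal feature vectors exceeds n.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Detection(NamedTuple):
    seq_id: str                 # unique identifier shared by a sequence and its feature vector
    guid: str                   # user the sequence belongs to
    is_abnormal: bool           # result of the anomaly detection step
    abnormal_items: List[int]   # feature items whose z-score exceeded the preset value


def screen_abnormal_users(detections: List[Detection], n: int) -> Dict[str, List[Detection]]:
    """Keep users whose number of abnormal feature vectors is larger than n."""
    per_user: Dict[str, List[Detection]] = defaultdict(list)
    for d in detections:
        if d.is_abnormal:
            per_user[d.guid].append(d)
    return {guid: ds for guid, ds in per_user.items() if len(ds) > n}


def summarize(user_detections: List[Detection],
              originals: Dict[str, List[str]]) -> Tuple[List[List[str]], List[int]]:
    """Collect the original sequences and the union of abnormal feature items for one user."""
    sequences = [originals[d.seq_id] for d in user_detections]
    items = sorted({i for d in user_detections for i in d.abnormal_items})
    return sequences, items


originals = {"s1": ["op-a"], "s2": ["op-b"], "s3": ["op-c"], "s4": ["op-d"]}
detections = [
    Detection("s1", "user-1", True, [0, 2]),
    Detection("s2", "user-1", True, [2]),
    Detection("s3", "user-2", True, [1]),
    Detection("s4", "user-2", False, []),
]
abnormal_users = screen_abnormal_users(detections, n=1)
user1_sequences, user1_items = summarize(abnormal_users["user-1"], originals)
print(list(abnormal_users), user1_sequences, user1_items)
```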
The whole suspicious behavior sequence detection method is based on the original process operation log, and no other special processing is performed between the original log and detection. In another implementation, more prior knowledge or model prediction information, such as TTP (tactics, techniques, and procedures) technical points, can be added in an intermediate layer, which may further benefit the detection result.
In a specific implementation, the volume of user process operation data per hour is very large, on the order of hundreds of millions of records, so the equal-length sequence generation module is computationally heavy, and the feature extraction part involves complex and time-consuming operations such as regular-expression matching. All steps of the method can therefore be implemented with Spark, a fast, general-purpose computing engine designed for large-scale data processing.
The embodiment of the application provides a method for detecting suspicious behavior sequences of user process data, which comprises the steps of firstly grouping and sequencing all behavior sequences of each user, and dividing the behavior sequences of each user into a plurality of original sequences to be detected with equal length through a sliding window; then, taking the equal-length behavior sequence to be detected of each user as a unit, extracting the characteristics of the behavior sequence to be detected, and detecting the abnormal sequence in the behavior sequence by using an abnormal detection algorithm based on the characteristics; then calculating z-score scores of the features in the abnormal feature vectors by a z-score method to explain the abnormal reasons of the abnormal sequences; screening out users with the number of abnormal sequences larger than a threshold value corresponding to the users as abnormal users, extracting all original sequences to be detected corresponding to all abnormal feature sequences of the abnormal users as analysis data of the abnormal users, and extracting all abnormal features of the abnormal users for auxiliary analysis.
The suspicious behavior sequence detection and analysis method provided by the embodiment of the application can directly detect the original user process operation data without any treatment such as summarization and refinement on the original data, can integrate multi-dimensional data, is based on context sequence detection, automatically discovers suspicious sequences by adopting an anomaly detection algorithm, and is convenient to implement and high in reliability; and simultaneously screening out suspicious users, and converging all original suspicious process operation behaviors of the suspicious users and corresponding abnormal characteristics of the suspicious users so as to perform abnormal sequence analysis. More accurate suspicious process operation behaviors of users are automatically found through a machine learning algorithm, and original data of suspicious operations and contexts of the suspicious operations are gathered to assist security operators to analyze so as to find more unknown threats and security events.
The present embodiment further provides an abnormal behavior detection apparatus, please refer to fig. 8, the apparatus may include:
a behavior sequence dividing module 810, configured to obtain a process operation behavior sequence, and divide the process operation behavior sequence into a plurality of sequences to be detected, where each sequence to be detected includes a plurality of user process operation behaviors, and each user process operation behavior includes a plurality of field information;
a feature extraction module 820, configured to perform feature extraction on each sequence to be detected based on attribute values of the multiple pieces of field information of each process operation behavior in the sequence to be detected, so as to obtain a feature vector corresponding to the sequence to be detected;
an anomaly detection module 830, configured to perform anomaly detection on the feature vectors corresponding to the sequences to be detected, and determine that the feature vectors detected as anomalies are anomalous feature vectors;
an abnormality degree determination module 840, configured to determine an abnormality degree of each dimension feature in the abnormal feature vector;
an abnormal feature item determination module 850, configured to determine, based on the degree of abnormality of each dimensional feature in the abnormal feature vector, a feature item in the abnormal feature vector, where there is an abnormality.
Further, the feature extraction module 820 includes:
the initial feature vector acquisition module is used for acquiring initial feature vectors, the dimensionalities of the initial feature vectors are equal to the number of preset feature items, and each dimensionality of the feature vectors corresponds to one preset feature item;
the first traversal module is used for traversing each process operation behavior in the sequence to be detected;
the characteristic rule matching module is used for matching the attribute values of the field information in the current process operation behavior with the preset characteristic rules respectively for the current process operation behavior; the preset feature rules comprise a plurality of feature rules corresponding to each preset feature item;
the first updating module is used for updating the current initial characteristic vector based on the matching result of the attribute value of each field information in the current process operation behavior and a preset characteristic rule;
and the characteristic vector determining module is used for determining the current initial characteristic vector as the characteristic vector corresponding to the sequence to be detected after the traversal of the operation behaviors of each process in the sequence to be detected is completed.
Further, the first update module includes:
the second traversal module is used for traversing the attribute values of the information of each field in the current process operation behavior;
the target characteristic item determining module is used for determining a preset characteristic item corresponding to the successfully matched characteristic rule as a target characteristic item if the attribute value of the current field information is matched with at least one characteristic rule in the preset characteristic rules;
the accumulation module is used for accumulating the value of the dimensionality corresponding to the target characteristic item;
and the second updating module is used for updating the current initial feature vector based on the accumulation result of the value of the dimension corresponding to the target feature item after traversing the attribute value of each field information in the current process operation behavior.
Further, the sequence to be detected is an equal-length sequence to be detected or a non-equal-length sequence to be detected; accordingly, the behavior sequence partitioning module 810 includes:
the first dividing module is used for dividing the process operation behavior sequence by adopting a sliding window with a preset length to obtain a plurality of equilong sequences to be detected when the sequences to be detected are equilong sequences to be detected;
and the second dividing module is used for dividing the process operation behavior sequence into a plurality of non-isometric sequences to be detected based on a preset sequence dividing rule when the sequences to be detected are non-isometric sequences to be detected.
Further, the anomaly detection module 830 includes:
the first detection module is used for carrying out anomaly detection on the characteristic vector by adopting an unsupervised detection algorithm or a supervised detection algorithm when the sequence to be detected is a sequence to be detected with equal length, and determining the characteristic vector detected as anomaly as an abnormal characteristic vector;
and the second detection module is used for performing anomaly detection on the characteristic vector by adopting a supervised detection algorithm when the sequence to be detected is a sequence to be detected with non-equal length, and determining the characteristic vector detected as anomaly as an abnormal characteristic vector.
Further, the abnormality degree determination module 840 includes:
the numerical value conversion module is used for carrying out magnitude conversion on the value corresponding to each dimension feature in the abnormal feature vector to obtain a standardized value corresponding to each dimension feature in the abnormal feature vector;
and the first determining module is used for respectively determining the normalized value corresponding to each dimension characteristic as the abnormal degree of each dimension characteristic.
Further, the apparatus further comprises:
the original log obtaining module is used for obtaining an original log of user process operations, where the original log of user process operations comprises a plurality of user process operation behaviors;
and the behavior sequence generation module is used for sequencing the plurality of user process operation behaviors according to a time sequence to generate the process operation behavior sequence.
The device provided in the above embodiments can execute the method provided in any embodiment of the present application, and has corresponding functional modules and beneficial effects for executing the method. Technical details not described in detail in the above embodiments may be referred to a method provided in any of the embodiments of the present application.
The present embodiment also provides a computer-readable storage medium, in which at least one instruction or at least one program is stored, and the at least one instruction or the at least one program is loaded by a processor and executes any one of the methods described in the present embodiment.
Referring to fig. 9, the device 900 may vary considerably with configuration or performance, and may include one or more Central Processing Units (CPUs) 922 (e.g., one or more processors), a memory 932, and one or more storage media 930 (e.g., one or more mass storage devices) storing applications 942 or data 944. The memory 932 and the storage medium 930 may be, for example, transitory or persistent storage. The program stored on the storage medium 930 may include one or more modules (not shown), each of which may include a series of instruction operations on the device. Still further, the central processor 922 may be arranged to communicate with the storage medium 930 and to execute, on the device 900, the series of instruction operations stored in the storage medium 930. The device 900 may also include one or more power supplies 926, one or more wired or wireless network interfaces 950, one or more input-output interfaces 958, and/or one or more operating systems 941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on. Any of the methods described above in this embodiment can be implemented based on the device shown in fig. 9.
The present specification provides the method steps as described in the embodiments or flowcharts, but more or fewer steps may be included based on routine or non-inventive labor. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent the only order of execution. When an actual system or end product executes the method, the steps may be performed sequentially or in parallel (e.g., in an environment with parallel processors or multi-threaded processing) according to the embodiments or the methods shown in the figures.
The configurations shown in the present embodiment are only partial configurations related to the present application, and do not constitute a limitation on the devices to which the present application is applied, and a specific device may include more or less components than those shown, or combine some components, or have an arrangement of different components. It should be understood that the methods, apparatuses, and the like disclosed in the embodiments may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a division of one logic function, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or unit modules.
Based on such understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (10)

1. An abnormal behavior detection method, comprising:
acquiring a process operation behavior sequence, dividing the process operation behavior sequence into a plurality of sequences to be detected, wherein each sequence to be detected comprises a plurality of user process operation behaviors, and each user process operation behavior comprises a plurality of pieces of field information;
for each sequence to be detected, performing feature extraction based on attribute values of the plurality of pieces of field information of each process operation behavior in the sequence to be detected, to obtain a feature vector corresponding to the sequence to be detected;
performing anomaly detection on the feature vectors corresponding to the sequences to be detected, and determining the feature vectors detected as abnormal as abnormal feature vectors;
determining the abnormality degree of each dimension feature in the abnormal feature vector;
and determining the abnormal feature items in the abnormal feature vector based on the abnormality degree of each dimension feature in the abnormal feature vector.
2. The abnormal behavior detection method according to claim 1, wherein the performing feature extraction based on the attribute values of the plurality of pieces of field information of each process operation behavior in the sequence to be detected to obtain the feature vector corresponding to the sequence to be detected comprises:
acquiring an initial feature vector, wherein the dimensionality of the initial feature vector is equal to the number of preset feature items, and each dimension of the initial feature vector corresponds to one preset feature item;
traversing each process operation behavior in the sequence to be detected;
for a current process operation behavior, matching the attribute value of each piece of field information in the current process operation behavior against preset feature rules, wherein the preset feature rules comprise a plurality of feature rules corresponding to each preset feature item;
updating the current initial feature vector based on the matching result between the attribute value of each piece of field information in the current process operation behavior and the preset feature rules;
and after traversing each process operation behavior in the sequence to be detected, determining the current initial feature vector as the feature vector corresponding to the sequence to be detected.
3. The abnormal behavior detection method according to claim 2, wherein the updating the current initial feature vector based on the matching result of the attribute value of each field information in the current process operation behavior and the preset feature rule comprises:
traversing attribute values of each field information in the current process operation behavior;
if the attribute value of the current field information is matched with at least one feature rule in the preset feature rules, determining a preset feature item corresponding to the feature rule which is successfully matched as a target feature item;
accumulating the dimension values corresponding to the target feature items;
and after traversing the attribute values of the field information in the current process operation behavior, updating the current initial feature vector based on the accumulation result of the dimension values corresponding to the target feature item.
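As a minimal sketch of the rule-matching update recited in claims 2 and 3, the following Python code keeps one counter per preset feature item and increments it each time a field's attribute value matches at least one of that item's rules. The feature item names, the regular expressions, and the field names (`process`, `cmdline`, `parent`) are hypothetical and chosen only for illustration; the application does not specify them.

```python
import re
from typing import Dict, List, Pattern, Sequence

# Hypothetical preset feature items and their rules (illustrative assumptions,
# not taken from the application). Each item may have several rules.
FEATURE_RULES: Dict[str, List[Pattern[str]]] = {
    "shell_spawn": [re.compile(r"\b(bash|sh|cmd\.exe)\b")],
    "sensitive_path": [re.compile(r"/etc/(passwd|shadow)"), re.compile(r"\.ssh/")],
    "network_tool": [re.compile(r"\b(curl|wget|nc)\b")],
}
FEATURE_ITEMS = list(FEATURE_RULES)  # one vector dimension per preset feature item


def extract_feature_vector(sequence: Sequence[Dict[str, str]]) -> List[int]:
    """Count, per feature item, how many field values match any of its rules."""
    vector = [0] * len(FEATURE_ITEMS)        # initial feature vector
    for behavior in sequence:                # traverse process operation behaviors
        for value in behavior.values():      # traverse field attribute values
            for dim, item in enumerate(FEATURE_ITEMS):
                if any(rule.search(value) for rule in FEATURE_RULES[item]):
                    vector[dim] += 1         # accumulate the target feature item
    return vector


sequence = [
    {"process": "bash", "cmdline": "cat /etc/passwd", "parent": "sshd"},
    {"process": "curl", "cmdline": "curl http://example.com/payload", "parent": "bash"},
]
print(extract_feature_vector(sequence))  # [2, 1, 2]
```

Counting one hit per field value per feature item is one possible reading of "accumulating the dimension values"; an implementation could equally count every matching rule.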
4. The abnormal behavior detection method according to claim 1, wherein the sequence to be detected is an equal-length sequence to be detected or an unequal-length sequence to be detected;
when the sequence to be detected is an equal-length sequence to be detected, the dividing the process operation behavior sequence into a plurality of sequences to be detected comprises:
dividing the process operation behavior sequence by using a sliding window of a preset length to obtain a plurality of equal-length sequences to be detected;
when the sequence to be detected is an unequal-length sequence to be detected, the dividing the process operation behavior sequence into a plurality of sequences to be detected comprises:
and dividing the process operation behavior sequence into a plurality of sequences to be detected with unequal length based on a preset sequence division rule.
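The two division modes of claim 4 might be sketched as follows. The `step` parameter of the sliding window and the `starts_new` predicate standing in for the "preset sequence division rule" are assumptions; the claim fixes only the window length and leaves the rule open.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_equal_length(behaviors: Sequence[T], window: int, step: int = 1) -> List[List[T]]:
    """Slide a fixed-length window over the sequence; every chunk has `window` items."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [list(behaviors[i:i + window])
            for i in range(0, len(behaviors) - window + 1, step)]


def split_by_rule(behaviors: Sequence[T], starts_new: Callable[[T, T], bool]) -> List[List[T]]:
    """Start a new chunk whenever starts_new(previous, current) is true."""
    chunks: List[List[T]] = []
    for item in behaviors:
        if chunks and not starts_new(chunks[-1][-1], item):
            chunks[-1].append(item)   # same chunk as the previous behavior
        else:
            chunks.append([item])     # first item, or the rule asks for a new chunk
    return chunks
```

For example, `split_equal_length([1, 2, 3, 4, 5], window=3, step=2)` yields `[[1, 2, 3], [3, 4, 5]]`, and `split_by_rule([1, 2, 10, 11, 30], lambda prev, cur: cur - prev > 5)` yields `[[1, 2], [10, 11], [30]]`, which mimics splitting on a gap between timestamps.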
5. The abnormal behavior detection method according to claim 4, wherein the performing anomaly detection on the feature vectors corresponding to the sequences to be detected and determining the feature vectors detected as abnormal as abnormal feature vectors comprises:
when the sequence to be detected is an equal-length sequence to be detected, performing anomaly detection on the feature vector by using an unsupervised detection algorithm or a supervised detection algorithm, and determining the feature vector detected as abnormal as an abnormal feature vector;
when the sequence to be detected is an unequal-length sequence to be detected, performing anomaly detection on the feature vector by using a supervised detection algorithm, and determining the feature vector detected as abnormal as an abnormal feature vector;
wherein the unsupervised detection algorithm is an isolation forest algorithm.
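To illustrate the unsupervised branch of claim 5, here is a simplified pure-Python isolation forest that scores each feature vector by how quickly random axis-parallel splits isolate it. It is a teaching sketch under stated assumptions, not the applicant's implementation: the parameter names (`n_trees`, `sample_size`, `seed`) are hypothetical, a constant dimension ends a branch instead of retrying another dimension, and the claim does not say how scores are thresholded. A production system would more likely rely on an existing library implementation.

```python
import math
import random
from typing import List, Sequence

EULER_GAMMA = 0.5772156649


def _c(n: int) -> float:
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def _build(sample: List[Sequence[float]], depth: int, limit: int, rng: random.Random) -> tuple:
    """Recursively split the sample on a random dimension at a random value."""
    if depth >= limit or len(sample) <= 1:
        return ("leaf", len(sample))
    dim = rng.randrange(len(sample[0]))
    lo = min(v[dim] for v in sample)
    hi = max(v[dim] for v in sample)
    if lo == hi:  # simplification: stop instead of trying another dimension
        return ("leaf", len(sample))
    split = rng.uniform(lo, hi)
    left = [v for v in sample if v[dim] < split]
    right = [v for v in sample if v[dim] >= split]
    return ("node", dim, split,
            _build(left, depth + 1, limit, rng),
            _build(right, depth + 1, limit, rng))


def _path_length(point: Sequence[float], tree: tuple) -> float:
    """Edges from the root to the leaf holding `point`, plus a leaf-size correction."""
    depth = 0
    while tree[0] == "node":
        _, dim, split, left, right = tree
        tree = left if point[dim] < split else right
        depth += 1
    return depth + _c(tree[1])


def anomaly_scores(vectors: Sequence[Sequence[float]], n_trees: int = 100,
                   sample_size: int = 256, seed: int = 0) -> List[float]:
    """Return one score in (0, 1] per vector; higher means easier to isolate."""
    if len(vectors) < 2:
        raise ValueError("need at least two feature vectors")
    rng = random.Random(seed)
    size = min(sample_size, len(vectors))
    limit = math.ceil(math.log2(size))  # usual depth cap for isolation trees
    trees = [_build(rng.sample(list(vectors), size), 0, limit, rng)
             for _ in range(n_trees)]
    scores = []
    for v in vectors:
        mean_depth = sum(_path_length(v, t) for t in trees) / n_trees
        scores.append(2.0 ** (-mean_depth / _c(size)))
    return scores
```

Feature vectors whose score exceeds a chosen cut-off would then be treated as abnormal feature vectors; the cut-off itself is a tuning decision that the claims leave open.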
6. The abnormal behavior detection method according to claim 1, wherein the determining the degree of abnormality of each dimensional feature in the abnormal feature vector comprises:
performing magnitude conversion on values corresponding to the dimension features in the abnormal feature vector to obtain normalized values corresponding to the dimension features in the abnormal feature vector;
and determining the normalized value corresponding to each dimension feature as the abnormality degree of that dimension feature.
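One way to read the magnitude conversion of claim 6, together with the final step of claim 1, is sketched below. Min-max scaling against a set of reference feature vectors and the `threshold` used to pick abnormal feature items are both assumptions; the claims only require that a normalized value per dimension serve as the abnormality degree.

```python
from typing import List, Sequence


def abnormality_degrees(abnormal: Sequence[float],
                        reference: Sequence[Sequence[float]]) -> List[float]:
    """Min-max scale each dimension of `abnormal` against the reference vectors."""
    degrees = []
    for dim, value in enumerate(abnormal):
        column = [vec[dim] for vec in reference]
        lo, hi = min(column), max(column)
        scaled = 0.0 if hi == lo else (value - lo) / (hi - lo)
        degrees.append(min(1.0, max(0.0, scaled)))  # clamp values outside the reference range
    return degrees


def abnormal_feature_items(degrees: Sequence[float], feature_items: Sequence[str],
                           threshold: float = 0.8) -> List[str]:
    """Return the feature items whose degree reaches the threshold, most abnormal first."""
    ranked = sorted(zip(feature_items, degrees), key=lambda pair: pair[1], reverse=True)
    return [item for item, degree in ranked if degree >= threshold]
```

With reference vectors `[[1, 0, 2], [2, 4, 2], [9, 1, 2]]` and the abnormal vector `[9, 1, 2]`, the degrees are `[1.0, 0.25, 0.0]`, so only the first feature item would be reported at the default threshold.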
7. The abnormal behavior detection method according to claim 1, wherein the acquiring a process operation behavior sequence further comprises:
acquiring an original log of user process operations, wherein the original log of user process operations comprises a plurality of user process operation behaviors;
and sorting the plurality of user process operation behaviors in chronological order to generate the process operation behavior sequence.
8. An abnormal behavior detection apparatus, comprising:
a behavior sequence dividing module, configured to acquire a process operation behavior sequence and divide the process operation behavior sequence into a plurality of sequences to be detected, wherein each sequence to be detected comprises a plurality of user process operation behaviors, and each user process operation behavior comprises a plurality of pieces of field information;
a feature extraction module, configured to, for each sequence to be detected, perform feature extraction based on attribute values of the plurality of pieces of field information of each process operation behavior in the sequence to be detected, to obtain a feature vector corresponding to the sequence to be detected;
an anomaly detection module, configured to perform anomaly detection on the feature vectors corresponding to the sequences to be detected and determine the feature vectors detected as abnormal as abnormal feature vectors;
an abnormality degree determining module, configured to determine the abnormality degree of each dimension feature in the abnormal feature vector;
and an abnormal feature item determining module, configured to determine the abnormal feature items in the abnormal feature vector based on the abnormality degree of each dimension feature in the abnormal feature vector.
9. An abnormal behavior detection apparatus, comprising a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the abnormal behavior detection method according to any one of claims 1 to 7.
10. A computer storage medium, wherein at least one instruction or at least one program is stored in the storage medium, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the abnormal behavior detection method according to any one of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010507008.2A CN111651767A (en) 2020-06-05 2020-06-05 Abnormal behavior detection method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111651767A true CN111651767A (en) 2020-09-11

Family

ID=72349810

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010507008.2A Pending CN111651767A (en) 2020-06-05 2020-06-05 Abnormal behavior detection method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111651767A (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149749A (en) * 2020-09-29 2020-12-29 北京明朝万达科技股份有限公司 Abnormal behavior detection method and device, electronic equipment and readable storage medium
CN112149749B (en) * 2020-09-29 2024-03-19 北京明朝万达科技股份有限公司 Abnormal behavior detection method, device, electronic equipment and readable storage medium
CN112364284B (en) * 2020-11-23 2024-01-30 北京八分量信息科技有限公司 Method and device for detecting abnormality based on context and related product
CN112364284A (en) * 2020-11-23 2021-02-12 北京八分量信息科技有限公司 Method, device and related product for detecting abnormity based on context
CN112733897A (en) * 2020-12-30 2021-04-30 胜斗士(上海)科技技术发展有限公司 Method and equipment for determining abnormal reason of multi-dimensional sample data
CN112883368A (en) * 2021-03-08 2021-06-01 网易(杭州)网络有限公司 Abnormal process detection method and device, storage medium and electronic equipment
CN113127320A (en) * 2021-04-08 2021-07-16 支付宝(杭州)信息技术有限公司 Application program abnormity detection method, device, equipment and system
CN113420652A (en) * 2021-06-22 2021-09-21 中冶赛迪重庆信息技术有限公司 Method, system, medium and terminal for recognizing abnormity of time sequence signal fragment
CN113472789A (en) * 2021-06-30 2021-10-01 深信服科技股份有限公司 Attack detection method, attack detection system, storage medium and electronic equipment
CN113568836A (en) * 2021-07-30 2021-10-29 江苏易安联网络技术有限公司 Multi-time-series sample feature extraction method and software detection method applying same
CN113568836B (en) * 2021-07-30 2022-09-13 江苏易安联网络技术有限公司 Multi-time-series sample feature extraction method and software detection method applying same
CN114629696A (en) * 2022-02-28 2022-06-14 天翼安全科技有限公司 Security detection method and device, electronic equipment and storage medium
CN114896588B (en) * 2022-04-06 2024-02-23 中国电信股份有限公司 Method and device for detecting abnormal behavior of host user, storage medium and electronic equipment
CN114896588A (en) * 2022-04-06 2022-08-12 中国电信股份有限公司 Host user abnormal behavior detection method and device, storage medium and electronic equipment
CN117009962B (en) * 2023-10-08 2023-12-08 深圳安天网络安全技术有限公司 Anomaly detection method, device, medium and equipment based on effective label
CN117009962A (en) * 2023-10-08 2023-11-07 深圳安天网络安全技术有限公司 Anomaly detection method, device, medium and equipment based on effective label

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination