CN111913859A - Abnormal behavior detection method and device - Google Patents

Info

Publication number
CN111913859A
Authority
CN
China
Prior art keywords
data
sequence
detected
abnormal
array
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010669473.6A
Other languages
Chinese (zh)
Other versions
CN111913859B (en)
Inventor
陈少涵
胡跃
吴雪阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Skyguard Network Security Technology Co ltd
Chengdu Sky Guard Network Security Technology Co ltd
Original Assignee
Beijing Skyguard Network Security Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Skyguard Network Security Technology Co ltd
Priority to CN202010669473.6A
Publication of CN111913859A
Application granted
Publication of CN111913859B
Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3438Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment monitoring of user actions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Abstract

The invention discloses an abnormal behavior detection method and device, and relates to the field of data security protection. One embodiment of the method comprises: sampling historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate an array to be detected; generating a sequence to be detected based on the array to be detected, inputting the sequence to be detected into a hidden Markov model together with an initial probability matrix, a background display probability matrix and a transition probability matrix, and processing the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string; processing the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculating an abnormal score of the abnormal behavior sequence, and comparing the abnormal score with a preset score threshold to identify whether the behavior of the user to be detected is abnormal. This embodiment samples, processes, and scores the user's historical behavior data in order to locate and detect abnormal behavior.

Description

Abnormal behavior detection method and device
Technical Field
The invention relates to the field of data security protection, in particular to an abnormal behavior detection method and device.
Background
UEBA (User and Entity Behavior Analytics) has been a mainstream approach to data security protection since its inception; it performs security analysis using machine learning methods to detect advanced, hidden, and insider threats. User behavior modeling and anomaly detection are important components of UEBA.
At present, user behavior anomaly detection mainly takes two forms: the single-dimensional statistics with multi-dimensional combined display method, and the rule-based key event behavior chain method. The former first computes statistics on user behavior in each single dimension, sets an anomaly threshold for each dimension, and then weights and combines the multiple dimensions according to expert experience; the latter describes multidimensional characteristic behaviors as chain rules ordered in time.
In the process of implementing the invention, the inventor finds that the prior art has at least the following problems:
The single-dimensional statistics with multi-dimensional combined display method cannot describe potential connections among multidimensional data, and the rule-based key event behavior chain method requires a large number of experts and security technicians to analyze and summarize data, so its labor cost is high.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for detecting abnormal behavior, which can at least solve the problems that the prior art cannot describe potential connections among multidimensional data and consumes a large amount of labor.
To achieve the above object, according to one aspect of the embodiments of the present invention, there is provided an abnormal behavior detection method including: sampling historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate an array to be detected; generating a sequence to be detected based on the array to be detected, inputting the sequence to be detected into a hidden Markov model together with an initial probability matrix, a background display probability matrix and a transition probability matrix, and processing the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string; processing the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculating an abnormal score of the abnormal behavior sequence, and comparing the abnormal score with a preset score threshold to identify whether the behavior of the user to be detected is abnormal.
Optionally, generating the sequence to be detected based on the array to be detected includes: marking each non-0 datum in an array to be detected as 1 to obtain a binary string, and converting the binary string into a decimal number; sorting and combining the decimal numbers of the arrays to be detected in order of sampling time period from earliest to latest to obtain a decimal sequence; and extracting, in order from the tail of the decimal sequence, non-0 decimal numbers up to a first preset length to obtain a subsequence, and prepending 0 to the subsequence to obtain the sequence to be detected.
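The conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the first data dimension is assumed to be the most significant bit, the arrays are assumed to arrive already ordered by sampling time period, and "extracting from the tail" is read as keeping the last `length` non-0 values in chronological order.

```python
from typing import List, Sequence


def array_to_decimal(array: Sequence[float]) -> int:
    """Mark every non-zero entry as 1, read the flags as a binary string,
    and return that string's decimal value.
    Assumption: the first dimension is the most significant bit."""
    bits = "".join("1" if value != 0 else "0" for value in array)
    return int(bits, 2)


def build_sequence(arrays_by_period: Sequence[Sequence[float]], length: int) -> List[int]:
    """Build the sequence to be detected.
    Assumption: arrays_by_period is ordered from earliest to latest period."""
    decimals = [array_to_decimal(a) for a in arrays_by_period]
    non_zero = [d for d in decimals if d != 0]
    # Keep at most `length` non-zero values taken from the tail of the sequence.
    tail = non_zero[-length:] if length > 0 else []
    # Prepend the character 0 as the head of the sequence.
    return [0] + tail


arrays = [[0, 0, 0], [3, 0, 1], [0, 2, 0], [0, 0, 0], [1, 1, 1]]
print(build_sequence(arrays, 2))
```

With three data dimensions, the array `[3, 0, 1]` becomes the binary string `101`, that is, the decimal number 5.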
Optionally, before generating the sequence to be detected based on the array to be detected and inputting it into the hidden Markov model together with the initial probability matrix, the background display probability matrix and the transition probability matrix, the method further includes: for an element in the background display probability matrix to be assigned, converting the row number in its subscript into a row binary string and converting the column number into a column binary string; judging whether the bitwise AND of the row binary string and the column binary string is equal to the row binary string, and if not, setting the element to 0; if so, acquiring the total order of the background display probability matrix to be assigned, subtracting 1 from the total order, and converting the result into an order binary string, wherein the total order is 2 raised to the power of the total number of data dimensions; determining the bitwise AND of the row binary string and the order binary string, counting the number of 0s in that result, and judging whether the row number is equal to the column number; if they are equal, setting the element to a preset background probability threshold, and otherwise determining the value of the element based on the preset background probability threshold and the counted number; and repeating the converting, judging and assigning processes to obtain the values of all elements in the background display probability matrix to be assigned, so as to construct the background display probability matrix.
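The assignment procedure above might be sketched as below. The structural rules (zero unless `row AND column == row`, threshold on the diagonal) follow the text. The off-diagonal value does not: the text only says it is determined from the threshold and the count of 0s, so the even split of the remaining probability mass used here is purely an assumption for illustration, as are the function and parameter names.

```python
import numpy as np


def build_background_matrix(num_dims: int, threshold: float) -> np.ndarray:
    """Sketch of a background display probability matrix of order 2**num_dims."""
    order = 2 ** num_dims      # total order of the matrix
    full_mask = order - 1      # "total order minus 1": all num_dims bits set
    matrix = np.zeros((order, order))
    for row in range(order):
        # Number of 0s in the num_dims-bit result of row AND (order - 1).
        zeros = num_dims - bin(row & full_mask).count("1")
        for col in range(order):
            if row & col != row:
                continue  # element stays 0
            if row == col:
                matrix[row, col] = threshold
            else:
                # ASSUMPTION (not from the source): spread the remaining mass
                # evenly over the 2**zeros - 1 admissible off-diagonal columns.
                matrix[row, col] = (1.0 - threshold) / (2 ** zeros - 1)
    return matrix


print(build_background_matrix(2, 0.9))
```

Under this assumed fill rule every row sums to 1 except the last one (all bits set), which has no admissible off-diagonal column and therefore holds only the threshold.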
Optionally, the processing the maximum probability hidden state character string to obtain an abnormal behavior sequence includes: removing the first character 0 in the maximum probability hidden state character string to obtain a first character string; converting characters in the first character string from decimal numbers to binary strings according to the dimension number of the plurality of data dimensions, so as to generate a first array corresponding to the characters based on the binary strings; determining first data which are not 0 in a first array, and acquiring a first data dimension and a sampling time period corresponding to the first data; inquiring an expected value corresponding to the sampling time period from an expected value table corresponding to the first data dimension, and replacing the first data with the expected value to obtain a second array; and determining the array to be detected corresponding to the sampling time period, subtracting the second array from the array to be detected, taking an absolute value to obtain a third array, and combining the third arrays according to the sequence of the sampling time periods from small to large to obtain an abnormal behavior sequence.
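The post-processing of the hidden state string can be illustrated roughly as follows. The sketch assumes that the hidden states (after the leading 0 is removed) align one-to-one with the arrays to be detected, that the first data dimension is the most significant bit, and that each expected value table is a mapping from sampling period to expected value; all names are hypothetical.

```python
from typing import Dict, List, Sequence


def decode_state(state: int, num_dims: int) -> List[int]:
    """Convert a decimal hidden state into a fixed-width list of 0/1 flags."""
    return [int(bit) for bit in format(state, f"0{num_dims}b")]


def abnormal_behavior_sequence(
    hidden_states: Sequence[int],
    observed_arrays: Sequence[Sequence[float]],
    periods: Sequence[int],
    expected_tables: Sequence[Dict[int, float]],
) -> List[List[float]]:
    """Return one 'third array' per sampling period, in period order."""
    states = hidden_states[1:]  # remove the leading character 0
    result = []
    for state, observed, period in zip(states, observed_arrays, periods):
        first = decode_state(state, len(observed))
        # Second array: replace each non-0 flag with the expected value of
        # that dimension in this period; 0 flags stay 0.
        second = [
            expected_tables[dim][period] if flag != 0 else 0.0
            for dim, flag in enumerate(first)
        ]
        # Third array: absolute difference between observation and second array.
        result.append([abs(o - s) for o, s in zip(observed, second)])
    return result


tables = [{7: 2.0, 8: 1.0}, {7: 0.5, 8: 1.5}, {7: 1.0, 8: 0.0}]
observed = [[3.0, 0.0, 1.0], [0.0, 2.0, 4.0]]
print(abnormal_behavior_sequence([0, 5, 2], observed, [7, 8], tables))
```

In the example, hidden state 5 decodes to `[1, 0, 1]`, so dimensions 1 and 3 are compared against their expected values for period 7, while dimension 2 is compared against 0.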
Optionally, before querying an expected value corresponding to the sampling time period from the expected value table corresponding to the first data dimension, the method further includes: sampling historical behavior data of a sample user based on a plurality of data dimensions in the preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate a sample array; dividing the preset time length into a plurality of time periods by taking the preset time interval as a unit, determining a statistical time interval under each statistical dimension by taking one time period as a center, and acquiring data corresponding to one data dimension and one statistical time interval from the sample array; calculating the expected value of the data dimension in the time period based on the data acquired under each statistical dimension, and repeating the data acquisition and calculation operation to obtain the expected value of the data dimension in each time period; and sequencing the expected values according to the sequence of the time periods from small to large to obtain an expected value table corresponding to the data dimension.
Optionally, the statistical dimensions include point statistics, transverse statistics, longitudinal statistics, and global statistics; determining, with a time period as a center, the statistical time interval under each statistical dimension comprises: the statistical time interval under the point statistics is the same time period taking a week as the unit; the statistical time interval under the transverse statistics is the same time period taking a day as the unit; the statistical time interval under the longitudinal statistics is the time periods, taking a week as the unit, that float longitudinally by a preset number around the time period as a center; and the statistical time interval under the global statistics is the preset time length.
Optionally, calculating the expected value of the data dimension in the time period based on the data acquired under each statistical dimension includes: for each statistical dimension, calculating the average value of the acquired data, counting the total number of acquired data items and the number of items whose value is non-zero, and calculating the ratio of the latter to the former; when the average value under the point statistics is zero and the average value under at least one of the transverse statistics, the longitudinal statistics and the global statistics is zero, setting the expected value of the data dimension in the time period to zero; otherwise, calculating the expected value of the data dimension in the time period based on the average value and the ratio under the point statistics, the average value and the ratio under the transverse statistics, and the average value and the ratio under the global statistics.
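The per-dimension summary and the zero rule above can be sketched as follows. The text does not give the formula that combines the averages and ratios in the non-zero case, so this sketch deliberately stops before that step. The function names, the dictionary keys, and the treatment of an empty sample as (0, 0) are assumptions.

```python
from typing import Dict, Sequence, Tuple


def summarize(values: Sequence[float]) -> Tuple[float, float]:
    """Return (average, ratio of non-zero items) for one statistical dimension."""
    total = len(values)
    if total == 0:
        return 0.0, 0.0  # assumption: an empty sample counts as all zeros
    non_zero = sum(1 for v in values if v != 0)
    return sum(values) / total, non_zero / total


def expected_is_zero(means: Dict[str, float]) -> bool:
    """Zero rule: the point average is zero and at least one of the
    transverse, longitudinal, and global averages is zero."""
    others = ("transverse", "longitudinal", "global")
    return means["point"] == 0 and any(means[name] == 0 for name in others)


print(summarize([0, 2, 4, 0]))
print(expected_is_zero({"point": 0, "transverse": 1, "longitudinal": 0, "global": 2}))
```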
Optionally, after combining the obtained data to generate the sample array, the method further includes: marking each non-0 datum in the sample array as 1 to obtain a binary string, converting the binary string into a decimal number, and sorting and combining the decimal numbers in order of sampling time period from earliest to latest to obtain a decimal sequence; determining a subsequence start bit in the decimal sequence by taking the first bit as 0 and the second bit as a start, extracting, in order from the subsequence start bit onward, non-0 decimal numbers up to a second preset length, and combining the subsequence start bit with the extracted decimal numbers to obtain a subsequence; and inputting the initial probability matrix, the unit background display probability matrix, the initial state transition probability matrix and the subsequence into the hidden Markov model, and processing the state transition probability matrix to be optimized through a second algorithm in the model until the model converges, to obtain the optimized state transition probability matrix.
Optionally, calculating the abnormal score of the abnormal behavior sequence includes: acquiring the abnormal behavior sequences of all users to be detected, and comparing the values located in the same data dimension across all the abnormal behavior sequences to obtain the maximum value corresponding to that data dimension; dividing each datum in an abnormal behavior sequence by the maximum value of its data dimension to obtain a fourth array, calculating the Euclidean distance between the data in the fourth array, and taking the Euclidean distance as an abnormal value; determining the maximum abnormal value among all users to be detected in the current round, comparing it with the maximum abnormal value of the previous round, and taking the larger of the two as the current maximum abnormal value; and calculating the ratio of each abnormal value to the current maximum abnormal value to obtain the abnormal score of each user to be detected.
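A rough sketch of this scoring step follows. It rests on one interpretive assumption: "the Euclidean distance between the data in the fourth array" is taken to be the Euclidean norm of the flattened, normalized array (its distance from the all-zero array). The function name, the user keys, and the handling of an all-zero dimension are likewise hypothetical.

```python
import math
from typing import Dict, List, Tuple


def anomaly_scores(
    sequences: Dict[str, List[List[float]]],
    previous_max: float = 0.0,
) -> Tuple[Dict[str, float], float]:
    """Return (score per user, current maximum abnormal value)."""
    num_dims = len(next(iter(sequences.values()))[0])
    # Maximum of each data dimension across the sequences of all users.
    dim_max = [
        max(row[d] for seq in sequences.values() for row in seq)
        for d in range(num_dims)
    ]
    values = {}
    for user, seq in sequences.items():
        # Fourth array: every datum divided by the maximum of its dimension.
        normalized = [
            row[d] / dim_max[d] if dim_max[d] else 0.0
            for row in seq
            for d in range(num_dims)
        ]
        # ASSUMPTION: Euclidean norm of the normalized data as abnormal value.
        values[user] = math.sqrt(sum(x * x for x in normalized))
    # Keep the larger of this round's maximum and the previous round's maximum.
    current_max = max(max(values.values()), previous_max)
    scores = {u: (v / current_max if current_max else 0.0) for u, v in values.items()}
    return scores, current_max


data = {"alice": [[0.0, 4.0]], "bob": [[3.0, 0.0], [0.0, 2.0]]}
scores, current_max = anomaly_scores(data)
print(scores, current_max)
```

Carrying `current_max` into the next call as `previous_max` keeps scores comparable across detection rounds, since the denominator never shrinks.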
To achieve the above object, according to another aspect of the embodiments of the present invention, there is provided an abnormal behavior detection apparatus including: a sampling module, configured to sample historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combine the historical behavior data of the plurality of data dimensions to generate an array to be detected; a processing module, configured to generate a sequence to be detected based on the array to be detected, input the sequence to be detected into a hidden Markov model together with an initial probability matrix, a background display probability matrix and a transition probability matrix, and process the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string; and a detection module, configured to process the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculate an abnormal score of the abnormal behavior sequence, and compare the abnormal score with a preset score threshold to identify whether the behavior of the user to be detected is abnormal.
Optionally, the processing module is configured to: mark each non-0 datum in an array to be detected as 1 to obtain a binary string, and convert the binary string into a decimal number; sort and combine the decimal numbers of the arrays to be detected in order of sampling time period from earliest to latest to obtain a decimal sequence; and extract, in order from the tail of the decimal sequence, non-0 decimal numbers up to a first preset length to obtain a subsequence, and prepend 0 to the subsequence to obtain the sequence to be detected.
Optionally, the apparatus further includes a background display probability matrix construction module, configured to: for an element in the background display probability matrix to be assigned, convert the row number in its subscript into a row binary string and convert the column number into a column binary string; judge whether the bitwise AND of the row binary string and the column binary string is equal to the row binary string, and if not, set the element to 0; if so, acquire the total order of the background display probability matrix to be assigned, subtract 1 from the total order, and convert the result into an order binary string, wherein the total order is 2 raised to the power of the total number of data dimensions; determine the bitwise AND of the row binary string and the order binary string, count the number of 0s in that result, and judge whether the row number is equal to the column number; if they are equal, set the element to a preset background probability threshold, and otherwise determine the value of the element based on the preset background probability threshold and the counted number; and repeat the converting, judging and assigning processes to obtain the values of all elements in the background display probability matrix to be assigned, so as to construct the background display probability matrix.
Optionally, the detection module is configured to: removing the first character 0 in the maximum probability hidden state character string to obtain a first character string; converting characters in the first character string from decimal numbers to binary strings according to the dimension number of the plurality of data dimensions, so as to generate a first array corresponding to the characters based on the binary strings; determining first data which are not 0 in a first array, and acquiring a first data dimension and a sampling time period corresponding to the first data; inquiring an expected value corresponding to the sampling time period from an expected value table corresponding to the first data dimension, and replacing the first data with the expected value to obtain a second array; and determining the array to be detected corresponding to the sampling time period, subtracting the second array from the array to be detected, taking an absolute value to obtain a third array, and combining the third arrays according to the sequence of the sampling time periods from small to large to obtain an abnormal behavior sequence.
Optionally, the system further comprises an expected value table constructing module, configured to: sampling historical behavior data of a sample user based on a plurality of data dimensions in the preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate a sample array; dividing the preset time length into a plurality of time periods by taking the preset time interval as a unit, determining a statistical time interval under each statistical dimension by taking one time period as a center, and acquiring data corresponding to one data dimension and one statistical time interval from the sample array; calculating the expected value of the data dimension in the time period based on the data acquired under each statistical dimension, and repeating the data acquisition and calculation operation to obtain the expected value of the data dimension in each time period; and sequencing the expected values according to the sequence of the time periods from small to large to obtain an expected value table corresponding to the data dimension.
Optionally, the statistical dimensions include point statistics, transverse statistics, longitudinal statistics, and global statistics; for the expected value table constructing module: the statistical time interval under the point statistics is the same time period taking a week as the unit; the statistical time interval under the transverse statistics is the same time period taking a day as the unit; the statistical time interval under the longitudinal statistics is the time periods, taking a week as the unit, that float longitudinally by a preset number around the time period as a center; and the statistical time interval under the global statistics is the preset time length.
Optionally, the expected value table constructing module is configured to: for each statistical dimension, calculate the average value of the acquired data, count the total number of acquired data items and the number of items whose value is non-zero, and calculate the ratio of the latter to the former; when the average value under the point statistics is zero and the average value under at least one of the transverse statistics, the longitudinal statistics and the global statistics is zero, set the expected value of the data dimension in the time period to zero; otherwise, calculate the expected value of the data dimension in the time period based on the average value and the ratio under the point statistics, the average value and the ratio under the transverse statistics, and the average value and the ratio under the global statistics.
Optionally, the apparatus further includes a state transition probability matrix optimization module, configured to: mark each non-0 datum in the sample array as 1 to obtain a binary string, convert the binary string into a decimal number, and sort and combine the decimal numbers in order of sampling time period from earliest to latest to obtain a decimal sequence; determine a subsequence start bit in the decimal sequence by taking the first bit as 0 and the second bit as a start, extract, in order from the subsequence start bit onward, non-0 decimal numbers up to a second preset length, and combine the subsequence start bit with the extracted decimal numbers to obtain a subsequence; and input the initial probability matrix, the unit background display probability matrix, the initial state transition probability matrix and the subsequence into the hidden Markov model, and process the state transition probability matrix to be optimized through a second algorithm in the model until the model converges, to obtain the optimized state transition probability matrix.
Optionally, the detection module is configured to: acquire the abnormal behavior sequences of all users to be detected, and compare the values located in the same data dimension across all the abnormal behavior sequences to obtain the maximum value corresponding to that data dimension; divide each datum in an abnormal behavior sequence by the maximum value of its data dimension to obtain a fourth array, calculate the Euclidean distance between the data in the fourth array, and take the Euclidean distance as an abnormal value; determine the maximum abnormal value among all users to be detected in the current round, compare it with the maximum abnormal value of the previous round, and take the larger of the two as the current maximum abnormal value; and calculate the ratio of each abnormal value to the current maximum abnormal value to obtain the abnormal score of each user to be detected.
To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided an abnormal behavior detection electronic device.
The electronic device of the embodiment of the invention comprises: one or more processors; a storage device, configured to store one or more programs, and when the one or more programs are executed by the one or more processors, enable the one or more processors to implement any of the above-described abnormal behavior detection methods.
To achieve the above object, according to a further aspect of the embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, the program, when executed by a processor, implementing any of the above abnormal behavior detection methods.
According to the scheme provided by the invention, one embodiment of the invention has the following advantages or beneficial effects: historical behavior data of a user to be detected are sampled to generate a sequence to be detected; the sequence is processed by a Viterbi detection algorithm to obtain an abnormal behavior sequence; and the abnormal behavior sequence is evaluated and scored to different degrees, so as to locate and detect the user's abnormal behavior.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic main flowchart of an abnormal behavior detection method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a method for constructing a background display probability matrix according to an embodiment of the present invention;
FIG. 3 is a flowchart illustrating a method for generating an abnormal behavior sequence according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a method for generating an expected value table according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a method for generating a state transition probability matrix according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating a method for calculating an abnormal score of a user to be detected according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of the main modules of an abnormal behavior detection apparatus according to an embodiment of the present invention;
FIG. 8 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 9 is a schematic block diagram of a computer system suitable for use with a mobile device or server implementing an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The main letters used in this scheme and their meanings are summarized here:
u: data dimension; ui: the i-th data dimension; u (N or M): array;
N: the number of sample arrays of the sample users; M: the number of arrays to be detected of the user to be detected;
t: preset time interval; dn: number of sampling units per day; wn: number of sampling units per week;
L: detection sequence length;
Q1: decimal sequence of the arrays to be detected; Q2: the sequence to be detected, obtained from Q1 by filtering out the character 0;
S1: decimal sequence of the sample arrays; S2: the sample sequence, obtained from S1 by filtering out the character 0;
E: expected value in the expected value table.
Referring to fig. 1, a main flowchart of an abnormal behavior detection method provided in an embodiment of the present invention is shown, including the following steps:
S101: sampling historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate an array to be detected;
S102: generating a sequence to be detected based on the array to be detected, inputting the sequence to be detected into a hidden Markov model together with an initial probability matrix, a background display probability matrix and a transition probability matrix, and processing the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string;
S103: processing the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculating an abnormal score of the abnormal behavior sequence, and comparing the abnormal score with a preset score threshold to identify whether the behavior of the user to be detected is abnormal.
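The beneficial-effects summary names the decoding step a Viterbi detection algorithm. For orientation, a generic textbook Viterbi decoder is sketched below; it is not taken from the patent. It assumes that rows of the third matrix index hidden states and columns index observations (the role the background display probability matrix appears to play here), and that the observation sequence is non-empty. All names are hypothetical.

```python
import numpy as np


def viterbi(observations, initial, transition, emission):
    """Return the most probable hidden state path for a discrete HMM."""
    initial = np.asarray(initial, dtype=float)
    transition = np.asarray(transition, dtype=float)
    emission = np.asarray(emission, dtype=float)
    # Work in log space; zero probabilities become -inf without warnings.
    with np.errstate(divide="ignore"):
        log_pi, log_a, log_b = np.log(initial), np.log(transition), np.log(emission)
    steps, n_states = len(observations), len(initial)
    score = np.empty((steps, n_states))
    back = np.zeros((steps, n_states), dtype=int)
    score[0] = log_pi + log_b[:, observations[0]]
    for t in range(1, steps):
        # candidates[i, j]: best score ending in state i at t-1, then moving to j.
        candidates = score[t - 1][:, None] + log_a
        back[t] = np.argmax(candidates, axis=0)
        score[t] = candidates[back[t], np.arange(n_states)] + log_b[:, observations[t]]
    # Trace the best path backwards from the best final state.
    path = [int(np.argmax(score[-1]))]
    for t in range(steps - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
print(viterbi([0, 1, 2], pi, A, B))
```

In the scheme above, the observations would be the decimal numbers of the sequence to be detected and the returned path would correspond to the maximum probability hidden state character string.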
In the above embodiment, for step S101, UEBA technology is built on big data, and this scheme uses an ELK platform (an abbreviation for three open source software products: Elasticsearch, Logstash, and Kibana) to collect massive user behavior data.
The unit time interval t is preset. To make it convenient to build the expected value table later (see FIG. 4), t is preferably a number of minutes that evenly divides the 24 hours of a day, such as 10 minutes (this embodiment uses 10 minutes as the example).
Starting from 0:00 each day, historical user behavior data are collected under u data dimensions at time interval t. The number of sampling units per day is dn = 24 × 60 / t, which corresponds to dn arrays per day, each array containing the data of the u data dimensions, in the following form:
t1[ dimension 1, dimension 2, … … ]
t2[ dimension 1, dimension 2, … … ]
……
tdn [ dimension 1, dimension 2, … … ]
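As a small worked example of the sampling grid, the sketch below computes dn for a given t in minutes and maps a timestamp to its sampling period within the day. The function names and the 1-based period index (matching t1 … tdn) are illustrative assumptions.

```python
from datetime import datetime

MINUTES_PER_DAY = 24 * 60


def units_per_day(t_minutes: int) -> int:
    """dn = 24 * 60 / t, requiring t to divide the day evenly."""
    if t_minutes <= 0 or MINUTES_PER_DAY % t_minutes != 0:
        raise ValueError("t must be a positive divisor of 1440 minutes")
    return MINUTES_PER_DAY // t_minutes


def period_index(moment: datetime, t_minutes: int) -> int:
    """1-based index of the sampling period of the day containing `moment`."""
    units_per_day(t_minutes)  # validate t
    return (moment.hour * 60 + moment.minute) // t_minutes + 1


print(units_per_day(10))
print(period_index(datetime(2020, 7, 13, 23, 59), 10))
```

With t = 10 minutes there are 144 sampling units per day, and 23:59 falls into the last one.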
The data dimensions ui collected in actual operation include, but are not limited to, the following:
a. Outgoing data magnitude: take the total number of bytes R sent out in time interval t; the magnitude is lg R.
b. Received data magnitude: take the total number of bytes H received in time interval t; the magnitude is lg H.
c. Number of file downloads: take the number of file downloads u1 in time interval t.
d. Number of file uploads: take the number of file uploads u2 in time interval t.
e. Number of outgoing mails: take the number of mails sent u3 in time interval t.
f. Number of received mails: take the number of mails received u4 in time interval t.
g. Number of DNS (Domain Name System) requests: take the number of DNS requests u5 in time interval t.
h. Number of triggers of each DNS security rule: take the number of times u6 that each DNS security rule is triggered in time interval t.
i. Accessed domain name entropy: take the sum u7 of the entropies of all domain names accessed in time interval t.
j. Number of target IPs (IP: Internet Protocol): take the number u8 of distinct target IPs in time interval t.
k. Number of target IP geographic locations: take the number u9 of target IP geographic locations in time interval t.
l. Number of intranet IP requests: take the number of intranet IP requests u10 in time interval t.
m. URL (Uniform Resource Locator) classification engine: take the number of accesses u11 to each type of URL.
n. Number of content security engine alerts triggered: take the number u12 of content security engine alerts triggered in time interval t.
o. Number of DNS periodic heartbeat packet connections: take the number u13 of DNS periodic heartbeat packet connections in time interval t.
p. Number of security certificate verifications: take the number of security certificate verifications u14 in time interval t.
q. Number of connections to abnormal target ports: take the number u15 of connections to abnormal target ports in time interval t.
This embodiment targets the user to be detected, so the arrays obtained are arrays to be detected. The multi-dimensional sampling described above yields M u-dimensional arrays to be detected. The correspondence among an array, its data dimensions and its sampling time period is maintained throughout the whole implementation process.
For step S102, every non-0 value in an array to be detected is marked as 1, so that each array consists only of 0s and 1s. To manage all the arrays to be detected in a uniform way, the 0s and 1s of an array are arranged in the order of the data dimensions to generate a binary string.
The binary string is then converted into a decimal number, and the decimal numbers of the arrays to be detected are arranged in order of sampling time period, from earliest to latest, to obtain a decimal sequence Q1.
Assuming that the current time is Tuesday 10:38, the decimal sequence is generated by:
Sampling backward in time, starting from 10:30:
Tuesday 10:20-10:30 - [dimension 1, dimension 2, ……] - u-dimensional array to be detected u(1)
Tuesday 10:10-10:20 - [dimension 1, dimension 2, ……] - u-dimensional array to be detected u(2)
Tuesday 10:00-10:10 - [dimension 1, dimension 2, ……] - u-dimensional array to be detected u(3)
Tuesday 09:50-10:00 - [dimension 1, dimension 2, ……] - u-dimensional array to be detected u(4)
……
If u(1) = [10, 0, 9, 78], marking the non-0 data gives [1, 0, 1, 1]; the binary string 1011 is extracted and converted into the decimal number 11. Since the sampling time period of u(1) is the latest, the decimal number 11 is placed at the end of the sequence, giving the decimal sequence Q1 = [x, x, x, ..., x, 11].
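The marking and conversion described above can be sketched as follows. This is a minimal illustration only; the function names `encode_array` and `decimal_sequence` are hypothetical and do not come from the source, and the arrays are assumed to be supplied already ordered by sampling time period.

```python
def encode_array(values):
    """Mark non-zero entries as 1 and read the resulting bits as a binary number."""
    bits = "".join("1" if v != 0 else "0" for v in values)
    return int(bits, 2)


def decimal_sequence(arrays_oldest_first):
    """Encode each sampled array; the caller supplies them ordered by sampling period."""
    return [encode_array(a) for a in arrays_oldest_first]
```

With the array from the example, `encode_array([10, 0, 9, 78])` reads the bits 1011 and returns 11.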
Setting the length of the detection sequence to be L (namely the first preset length), traversing the decimal sequence Q1, sequentially and forwards taking L pieces of non-0 data from the tail of Q1 to form a subsequence, adding a character '0' at the head of the subsequence (the hidden Markov model requires to start with the character '0', and 0 represents no action of a user), and obtaining the sequence Q2 to be detected.
It should be noted that, since the user action may change continuously, a sequence farther from the current time is of little significance for detecting whether the user behavior is abnormal, and therefore, the number of sub-sequences here is only one, and is a sequence closest to the current time.
For example, with Q1 = [0,0,1,0,7,7,8,0,0,0,0,5,6,5,4,3,0,0] and L = 5, taking 5 non-0 data forwards from the tail of Q1 gives the subsequence [5,6,5,4,3]; adding the leading character "0" gives the sequence to be detected Q2 = [0,5,6,5,4,3].
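The extraction of the sequence to be detected can be sketched as below. The function name `build_detection_sequence` is hypothetical, and the handling of a Q1 with fewer than L non-0 values (a shorter sequence is returned) is an assumption, since the source does not describe that case.

```python
def build_detection_sequence(q1, length):
    """Take the last `length` non-zero values of q1 (order preserved) and prepend 0."""
    if length <= 0:
        raise ValueError("length must be positive")
    non_zero = [x for x in q1 if x != 0]
    # Assumption: if fewer than `length` non-zero values exist, all of them are used.
    return [0] + non_zero[-length:]
```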
Hidden Markov Models (HMM) are statistical models used to describe a Markov process with hidden, unknown parameters. The scheme uses two algorithms that accompany the HMM: the Viterbi algorithm (the first algorithm) and the Baum-Welch algorithm (the second algorithm, see the description of fig. 5 below). The Baum-Welch algorithm is used for learning, and the Viterbi algorithm is used for processing after learning: given an input character string, it outputs a character string of the same length, referred to as the maximum probability hidden state character string P1.
The sequence to be detected Q2, the initial probability matrix start = [1, 0, 0, 0, ..., 0], the background visualization probability matrix back_group_prob and the transition probability matrix trans_prob are used as the input of the HMM model, and Q2 is processed by the Viterbi algorithm to obtain the maximum probability hidden state character string P1; the process of constructing back_group_prob is described with reference to fig. 2.
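For orientation, a textbook Viterbi decoder is sketched below. It is not the patent's own implementation. The function name `viterbi`, the matrix orientation (`emit[s][o]` is the probability that hidden state s shows observation o) and the use of plain probability products instead of logarithms are all assumptions made for this illustration; plain products are only suitable for short sequences such as Q2.

```python
def viterbi(obs, start, trans, emit):
    """Return the most probable hidden-state sequence for `obs`.

    start[s]    : probability of starting in state s
    trans[p][s] : probability of moving from state p to state s
    emit[s][o]  : probability that state s shows observation o (assumed orientation)
    """
    n = len(start)
    prob = [start[s] * emit[s][obs[0]] for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = prob
        pointers, prob = [], []
        for s in range(n):
            best = max(range(n), key=lambda p: prev[p] * trans[p][s])
            pointers.append(best)
            prob.append(prev[best] * trans[best][s] * emit[s][o])
        back.append(pointers)
    # Trace back from the most probable final state.
    state = max(range(n), key=lambda s: prob[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

The returned list has the same length as the input sequence, matching the description of P1.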
For step S103, the abnormal score is calculated based on the abnormal behavior sequence P2, and the abnormal behavior sequence P2 is obtained by processing the maximum probability hidden state string P1, the specific processing procedure is described with reference to the following fig. 3, and the specific processing procedure for calculating the abnormal score is described with reference to the following fig. 6.
An anomaly score threshold value, for example, 50, is preset, and if the anomaly score of the user to be detected exceeds the value, it indicates that the behavior of the user is abnormal, and then, the abnormal processing can be performed through recording, alarming and the like.
The method provided by the embodiment includes sampling historical behavior data of a user to be detected to generate a sequence to be detected, processing the sequence to be detected to obtain an abnormal behavior sequence through a Viterbi detection algorithm, and performing different degrees of evaluation and scoring on the abnormal behavior sequence to position and detect abnormal behaviors of the user.
Referring to fig. 2, a flowchart of a method for constructing a background visualization probability matrix according to an embodiment of the present invention is shown, which includes the following steps:
s201: for an element in the background visualization probability matrix to be assigned, converting the row number in the subscript into a row binary string and converting the column number into a column binary string;
s202: judging whether the bitwise AND result of the row binary string and the column binary string is equal to the row binary string;
s203: if not, setting the element to 0;
s204: if so, acquiring the total order of the background visualization probability matrix to be assigned, subtracting 1 from the total order and converting the result into an order binary string; wherein the total order is 2 raised to the power of the total number of data dimensions;
s205: determining the bitwise AND result of the row binary string and the order binary string, and counting the number of 0s in the bitwise AND result;
s206: judging whether the row number is equal to the column number;
s207: if so, setting the element to a preset background probability threshold;
s208: otherwise, determining the value of the element based on the preset background probability threshold and the number of 0s;
s209: repeating the converting, judging and assigning processes to obtain the values of all elements in the background visualization probability matrix to be assigned, so as to construct the background visualization probability matrix.
In the above embodiment, in steps S201 to S209, the background visualization probability matrix back_group_prob is a state × state matrix, where state is the total order and equals 2^u, u being the number of data dimensions. Assuming u = 3, state = 8 and back_group_prob is an 8 × 8 matrix.
The process of constructing back_group_prob comprises the following steps:
Let i and j be the row number and the column number in the subscript of an element a_ij of the matrix. The element a_ij is assigned as follows:
1) When (i in binary) & (j in binary) is not equal to (i in binary), a_ij = 0. The bitwise AND operator "&" is a two-operand operator that ANDs the corresponding binary bits of its two operands; a result bit is 1 only when both corresponding bits are 1.
For example, if i = 6 and j = 13, 6 in binary is 0110 and 13 in binary is 1101; their bitwise AND is 0100, which differs from 0110, so a_6,13 = 0.
2) When (i in binary) & (j in binary) is equal to (i in binary), let b = (i in binary) & ((state - 1) in binary), and let c be the number of 0s in b.
Since state = 2^u, its binary form is a 1 followed by zeros; for example, 16 in binary is 10000. Because i and j are smaller than state, ANDing (i in binary) directly with (state in binary) would always give 0, which is of no practical use; (state - 1) in binary, for example 16 - 1 = 15, i.e. 1111, is therefore used to fix the total length of the binary string.
For example, if i = 6 and j = 15, 6 in binary is 110 and 15 in binary is 1111; their bitwise AND is 0110, which is still 6. If state = 16, (16 - 1) in binary is 1111 and the bitwise AND of 110 and 1111 is 0110, i.e. b = 0110 and c = 2. If state = 32, (32 - 1) in binary is 11111 and the bitwise AND of 110 and 11111 is 00110, i.e. b = 00110 and c = 3.
I) Set a background probability threshold thred; when i is equal to j, a_ij = thred. With thred = 0.8, a_ij = 0.8 when i is equal to j;
II) when i is not equal to j, a_ij = (1 - thred)/c. With thred = 0.8, a_ij = 0.2/c when i is not equal to j.
And repeating the process, and assigning values to each element in the matrix to construct the background visualization probability matrix.
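The assignment rules above can be sketched in a few lines. The function name `build_back_group_prob` is hypothetical, and the sketch assumes that row and column numbers run from 0 to state - 1, so that every index is a valid u-bit value; the source does not state the indexing base explicitly.

```python
def build_back_group_prob(u, thred=0.8):
    """Build the state x state matrix (state = 2**u) by the bitwise-AND rule."""
    state = 2 ** u
    matrix = [[0.0] * state for _ in range(state)]
    for i in range(state):
        b = i & (state - 1)
        c = u - bin(b).count("1")  # number of 0s in b when written with u bits
        for j in range(state):
            if i & j != i:
                continue  # rule 1): the element stays 0
            # i != j with i & j == i implies i has at least one 0 bit, so c >= 1 here.
            matrix[i][j] = thred if i == j else (1 - thred) / c
    return matrix
```

For u = 4 this reproduces the worked values: the element at row 6, column 13 is 0, and the element at row 6, column 15 is 0.2/2.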
In the above embodiment, the assignment result depends on the row and column numbers of the element and on the number of data dimensions; a fixed background visualization probability matrix is obtained by bitwise AND and comparison, for the hidden Markov model to use when processing the sequence to be detected.
Referring to fig. 3, a flowchart of a method for generating an abnormal behavior sequence according to an embodiment of the present invention is shown, including the following steps:
s301: removing the first character 0 in the maximum probability hidden state character string to obtain a first character string;
s302: converting characters in the first character string from decimal numbers into binary strings according to the dimension number of a plurality of data dimensions, and generating a first array corresponding to the characters based on the binary strings;
s303: determining first data which are not 0 in a first array, and acquiring a first data dimension and a sampling time period corresponding to the first data;
s304: inquiring an expected value corresponding to the sampling time period from an expected value table corresponding to the first data dimension, and replacing the first data with the expected value to obtain a second array;
s305: and determining an array to be detected corresponding to the sampling time period, subtracting data in the second array and the array to be detected, taking an absolute value to obtain a third array, and combining the third array according to the sequence of the sampling time period from small to large to obtain an abnormal behavior sequence.
In the above embodiment, as for steps S301 to S304, the present embodiment is used to describe how to generate an abnormal behavior sequence.
After the maximum probability hidden state character string P1 is obtained, the leading character "0" in P1 is removed to obtain a character string P2 of length L. Each character in P2 is converted from a decimal number into a binary string, and the binary string is converted into a u-dimensional first array consisting of 0s and 1s.
The first data, i.e. the entries whose value is 1 in the first array, are determined, and each is replaced with the expected value E found in the single-dimension expected value table of its data dimension, finally yielding the second array, i.e. the background behavior sequence.
As illustrated for step S102 in fig. 1, u(1) = [10, 0, 9, 78] and the decimal sequence Q1 = [x, x, x, ..., x, 11].
After Q2 is processed by the Viterbi algorithm, P1 = [0, x, x, x, ..., x, 4] is obtained; removing the leading character "0" gives P2 = [x, x, x, ..., x, 4]. With 4 data dimensions in total, 4 is converted into the binary string 0100, whose corresponding array is u(1)' = [0, 1, 0, 0].
The second bit of the array u(1)' is 1, which corresponds to the u2 dimension, and the sampling period is Tuesday 10:20 to 10:30. The expected value 9 for "Tuesday 10:20 to 10:30" is looked up in the expected value table of u2, giving u(1)'' = [0, 9, 0, 0].
For step S305, each array in the background behavior sequence is subtracted from the array to be detected obtained by the original sampling and the absolute value is taken to obtain a third array; the third arrays are combined in order of sampling time period, from earliest to latest, to obtain the abnormal behavior sequence.
Continuing the above example, the array to be detected sampled on Tuesday 10:20 to 10:30 is u(1) = [10, 0, 9, 78], and the background behavior sequence processing gives u(1)'' = [0, 9, 0, 0]; subtracting and taking the absolute value gives u(1)''' = |u(1) - u(1)''| = [10, 9, 9, 78].
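Steps S302 to S305 for a single sampling period can be sketched as below. The names `decode_state` and `abnormal_array` are hypothetical, and the sketch assumes the expected values for the period have already been looked up, one per data dimension, from the single-dimension expected value tables.

```python
def decode_state(value, u):
    """Turn a decimal hidden state into a u-element 0/1 array (the first array)."""
    return [int(bit) for bit in format(value, f"0{u}b")]


def abnormal_array(observed, hidden_state, expected):
    """Compute |observed - background| for one sampling period.

    expected[k] is the expected value of dimension k for this period
    (assumed to be looked up beforehand in that dimension's table).
    """
    flags = decode_state(hidden_state, len(observed))
    background = [e if f else 0 for f, e in zip(flags, expected)]  # second array
    return [abs(o - b) for o, b in zip(observed, background)]      # third array
```

Only the expected values of dimensions flagged 1 are used, so in the example only the value 9 for u2 affects the result.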
In the above embodiment, after the maximum probability hidden state character string is obtained through processing, an abnormal behavior sequence needs to be generated through a certain mathematical calculation. The background behavior sequence may provide input data for other machine learning and analysis engines of UEBA in addition to generating abnormal behavior sequences.
Referring to fig. 4, a flowchart of a method for generating an expectation table according to an embodiment of the present invention is shown, including the following steps:
s401: sampling historical behavior data of a sample user based on a plurality of data dimensions in the preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate a sample array;
s402: dividing the preset time length into a plurality of time periods by taking the preset time interval as a unit, determining a statistical time interval under each statistical dimension by taking one time period as a center, and acquiring data corresponding to one data dimension and one statistical time interval from the sample array;
s403: calculating the expected value of the data dimension in the time period based on the data acquired under each statistical dimension, and repeating the data acquisition and calculation operation to obtain the expected value of the data dimension in each time period;
s404: and sequencing the expected values according to the sequence of the time periods from small to large to obtain an expected value table corresponding to the data dimension.
In the foregoing embodiment, as for step S401, the present embodiment is used to describe the process of establishing the single-dimensional expectation table, the adopted data is the historical behavior data of the sample user, the generated array is the sample array, and it is assumed that N sample arrays are generated.
For steps S402 and S403, one week (from Monday 00:00:00 to Sunday 23:59:59) is divided, in units of t, into wn = 24 × 60 × 7/t cells (or time periods). Assuming that the user has 100 weeks of data in the data dimension ui, the data is divided into 100 × wn cells, and sampling yields 100 × wn sample arrays. The expected value table of the data dimension ui is constructed by finding the expected value of ui in each unit time period.
The scheme adopts four statistical dimensions, and the statistical time intervals adopted by different statistical dimensions are different. Assuming that the expected value of the data dimension ui at [ Monday 0, Monday 0 point + t ] is calculated, the specific calculation steps are as follows:
1. Point statistics: taking the week as the unit, all data of the data dimension ui in the cell of the same time period are counted, and the average value s_mean and the occurrence probability s_rate are obtained.
1) Determine the first statistical time interval: [Monday 0 point, Monday 0 point + t], and acquire from the sample arrays the 100 data corresponding to the data dimension ui and the first statistical time interval;
2) Average these 100 data to obtain s_mean. Count the non-0 data among them and calculate their proportion of the 100 data to obtain s_rate.
2. Lateral statistics: taking the day as the unit, all data of the data dimension ui in the cells of the same time period are counted, and the average value h_mean and the occurrence probability h_rate are obtained.
1) Determine the statistical time interval under lateral statistics: [0 point every day, 0 point every day + t], and acquire from the sample arrays the data corresponding to the data dimension ui and this statistical time interval, 100 × 7 = 700 data in total;
2) Average these 700 data to obtain h_mean. Count the non-0 data among them and calculate their proportion of the 700 data to obtain h_rate.
3. Longitudinal statistics: also taking the week as the unit, the cell (or time period) is floated up and down by several multiples of t. All data of the data dimension ui in the resulting cells are counted to obtain the average value v_mean and the occurrence probability v_rate.
Since a week has 7 days, it is preferable to float 3t in each direction (7 time periods in total), so that the 100 × wn data can be divided evenly; in essence, the lateral statistics likewise float horizontally in units of days. For example, floating 3t in each direction on the basis of [Monday 0 point, Monday 0 point + t] gives [Monday 0 point - 3t, Monday 0 point + 4t], which includes 7 time periods:
[Monday 0 point - 3t, Monday 0 point - 2t]
[Monday 0 point - 2t, Monday 0 point - t]
[Monday 0 point - t, Monday 0 point]
[Monday 0 point, Monday 0 point + t]
[Monday 0 point + t, Monday 0 point + 2t]
[Monday 0 point + 2t, Monday 0 point + 3t]
[Monday 0 point + 3t, Monday 0 point + 4t]
1) Determine the statistical time interval under longitudinal statistics: [Monday 0 point - 3t, Monday 0 point + 4t], and acquire from the sample arrays the data corresponding to the data dimension ui and this statistical time interval, 100 × 7 = 700 data in total;
2) Average these 700 data to obtain v_mean. Count the non-0 data among them and calculate their proportion of the 700 data to obtain v_rate.
4. Global statistics: and taking all the cells as a statistical time interval, and counting all the data of the data dimension ui corresponding to the cells to obtain an average value g _ mean and an occurrence probability g _ rate.
For example, data corresponding to the data dimension ui in all sample arrays are obtained, 100 × wn is counted, an average value is obtained to obtain g _ mean, and the proportion of the non-0 data in 100 × wn is calculated to obtain an occurrence probability g _ rate.
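The four statistical dimensions can be sketched for one data dimension as follows. The names `mean_and_rate` and `four_statistics`, the layout of the input (one list of wn values per week, cell 0 being Monday 0 point) and the wrap-around of the floating window across the week boundary are assumptions of this illustration, not details given in the source.

```python
def mean_and_rate(values):
    """Average of the values and the share of them that are non-zero."""
    n = len(values)
    return sum(values) / n, sum(1 for v in values if v != 0) / n


def four_statistics(weeks, cell, dn, float_cells=3):
    """Point (s), lateral (h), longitudinal (v) and global (g) pairs for one cell.

    weeks : list of per-week lists, each holding wn = 7 * dn values of one dimension
    cell  : index of the time period within the week (0 = Monday 0 point)
    dn    : number of time periods per day
    """
    wn = 7 * dn
    point = [w[cell] for w in weeks]
    lateral = [w[cell % dn + day * dn] for w in weeks for day in range(7)]
    # Assumption: the floating window wraps around the week boundary.
    longitudinal = [w[(cell + k) % wn] for w in weeks
                    for k in range(-float_cells, float_cells + 1)]
    everything = [v for w in weeks for v in w]
    return {
        "s": mean_and_rate(point),
        "h": mean_and_rate(lateral),
        "v": mean_and_rate(longitudinal),
        "g": mean_and_rate(everything),
    }
```

Each entry of the returned dictionary is a (mean, rate) pair, for example `stats["s"]` holds (s_mean, s_rate).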
For the data dimension ui and the same time period [ monday 0 point, monday 0 point + t ], after calculating the average value and the ratio under different statistical dimensions, the expected value of the data dimension ui in the time period can be calculated:
1) E = 0 when s_rate = 0 and g_rate × v_rate × h_rate = 0;
2) in other cases, E = (s_mean × s_rate + h_mean × h_rate × s_rate + v_mean × v_rate × s_rate + g_mean × g_rate × v_rate × h_rate) / (s_rate + h_rate × s_rate + v_rate × s_rate + g_rate × v_rate × h_rate).
The formula is an empirical formula: on top of the occurrence probability of each of the four statistical dimensions, the weights (1, s_rate, s_rate, v_rate × h_rate) are applied respectively, and the weighted average of the data dimension ui in the time period is calculated as the expected value; the point statistics carry the largest weight and the global statistics the smallest.
This setting reflects two extreme cases for a fixed weekly time period: 1) over the 100 weeks, the data dimension ui of the user has a value in that period every week; 2) over the 100 weeks, the data dimension ui of the user is 0 in that period every week.
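The expected value calculation can be sketched as below. The function name `expected_value` is hypothetical, and the sketch assumes the reading of the empirical formula in which the global term is weighted by g_rate × v_rate × h_rate in both numerator and denominator; with non-negative rates, the denominator is zero exactly in case 1).

```python
def expected_value(s_mean, s_rate, h_mean, h_rate, v_mean, v_rate, g_mean, g_rate):
    """Weighted average of the four means, following the empirical formula."""
    g_weight = g_rate * v_rate * h_rate
    denominator = s_rate + h_rate * s_rate + v_rate * s_rate + g_weight
    if denominator == 0:  # s_rate = 0 and g_rate * v_rate * h_rate = 0
        return 0.0
    numerator = (s_mean * s_rate
                 + h_mean * h_rate * s_rate
                 + v_mean * v_rate * s_rate
                 + g_mean * g_weight)
    return numerator / denominator
```

When all four means agree and every rate is 1, the result equals that common mean, which is a quick sanity check for the weighting.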
As for step S404, it follows from the above that the point statistics yield wn values of s_mean and s_rate in total, the lateral statistics and the longitudinal statistics each yield 24 × 60/t average values and occurrence probabilities, and the global statistics yield only one g_mean and one g_rate.
The scheme calculates the expected value of the data dimension ui in each cell, namely totally obtains wn E. E is sorted in order of time period from small to large to generate a table of expected values corresponding to the data dimension ui. For the same reason for other data dimensions, a total of u single-dimensional expectation tables are obtained.
According to the method provided by the embodiment, aiming at the expected value of a data dimension in the same time period, the data is comprehensively considered from the point dimension, the transverse dimension, the longitudinal dimension and the global dimension and the calculation is integrated, so that the coverage range of the expected value and the calculation accuracy are improved.
Referring to fig. 5, a flowchart of a method for generating a state transition probability matrix according to an embodiment of the present invention is shown, including the following steps:
s501: marking a non-0 data mark in the sample array as 1 to obtain a binary string, converting the binary string into a decimal number, and sequencing and combining the decimal number according to the sequence of sampling time periods from small to large to obtain a decimal sequence;
s502: determining subsequence start positions in the decimal sequence, a start position being one whose first element is 0 and whose second element is non-0, extracting non-0 decimal numbers of a second preset length backwards in sequence from the start position, and combining the start position and the extracted decimal numbers to obtain a subsequence;
s503: inputting the initial probability matrix, an identity matrix serving as the visualization probability matrix, the initial state transition probability matrix and the subsequences into a hidden Markov model, and optimizing the state transition probability matrix through a second algorithm in the model until the model converges, to obtain the optimized state transition probability matrix.
In the above embodiment, in step S501, after the historical behavior data of the sample user is sampled in a multivariate manner, N u-dimensional sample arrays are generated.
Marking non-0 data in the sample array as 1, finally making each array from 0 and 1, sequencing according to the data dimension sequence to generate binary strings, and converting the binary strings into corresponding decimal numbers. For example, a 5-dimensional array [10,0,1.25,0,0], is marked by non-0 data to obtain [1,0,1,0,0], and then a binary string "10100" is obtained, and finally converted into a decimal number 20.
Finally, the N u-dimensional sample arrays are all converted into decimal numbers, and the N decimal numbers are sorted and combined according to the sequence of the sampling time periods of the sample arrays to generate a decimal sequence S1, such as [0,0,1,0,7,9,8,0,0,0,0,5,6,5,4,3,0,0 ].
For step S502, the length of the detection sequence is set to L and the decimal sequence S1 is traversed. Each position where a 0 is followed by a non-0 number is taken as the start of a subsequence; from there, (L - 1) further non-0 decimal numbers (the second preset length) are taken backwards in S1 and combined to generate a subsequence S2. The hidden Markov model requires the sequence to start with the character "0", where 0 indicates that the user has no action, so that the initial probability matrix can be fixed.
For example, with S1 = [0,0,1,0,7,9,8,0,0,0,0,5,6,5,4,3,0,0] and L = 5, there are three starts of the form "0 followed by a non-0 number", namely [0,1], [0,7] and [0,5], and three subsequences are finally obtained: [0,1,7,9,8,5], [0,7,9,8,5,6] and [0,5,6,5,4,3].
In actual operation, although S1 is a very long string, the number of non-0 values after the second position may fall short of (L - 1). For example, with S1 = [0,0,1,0,7,9,8,0,0,0,0,5,6,5,0,0] and L = 5, the third candidate [0,5,6,5] does not satisfy L = 5 and is discarded, so only two subsequences, [0,1,7,9,8,5] and [0,7,9,8,5,6], are finally obtained.
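The subsequence extraction can be sketched as follows; the function name `training_subsequences` is hypothetical. It scans for each "0 followed by non-0" pair, collects L non-0 values from that point (the second element plus L - 1 more) and drops candidates that run out of data.

```python
def training_subsequences(s1, length):
    """Collect [0, x1, ..., x_length] subsequences starting at each "0, non-0" pair."""
    result = []
    for pos in range(len(s1) - 1):
        if s1[pos] != 0 or s1[pos + 1] == 0:
            continue  # not a "0 followed by non-0" start
        non_zero = [x for x in s1[pos + 1:] if x != 0][:length]
        if len(non_zero) == length:  # drop candidates that run out of data
            result.append([0] + non_zero)
    return result
```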
In step S503, the obtained subsequences are collected and used as the input of the Baum-Welch algorithm (the second algorithm) in the hidden Markov model. With start = [1, 0, 0, 0, ..., 0] as the initial probability matrix, the identity matrix of order state as the visualization probability matrix, and a state × state matrix whose entries are all 1/state as the initial transition probability matrix, the transition probability matrix is learned and optimized iteratively until the model converges, and the optimized transition probability matrix trans_prob is stored.
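The three starting matrices for training can be written down directly. The sketch below only builds these inputs and does not implement Baum-Welch itself; the function name `initial_hmm_parameters` is hypothetical.

```python
def initial_hmm_parameters(u):
    """Starting matrices for training with state = 2**u hidden states."""
    state = 2 ** u
    start = [1.0] + [0.0] * (state - 1)  # always begin in state 0 (no user action)
    identity = [[1.0 if i == j else 0.0 for j in range(state)] for i in range(state)]
    trans = [[1.0 / state] * state for _ in range(state)]  # uniform initial guess
    return start, identity, trans
```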
The method provided by the above embodiment also learns the sample array based on the sample array of the sample user by using the Baum-Welch algorithm to generate the transition probability matrix.
Referring to fig. 6, a flowchart of a method for calculating an abnormality score of a user to be detected according to an embodiment of the present invention is shown, including the following steps:
s601: acquiring abnormal behavior sequences of all users to be detected, and comparing numerical values positioned in a data dimension in all the abnormal behavior sequences to obtain a maximum value corresponding to the data dimension;
s602: dividing data in the abnormal behavior sequence by a maximum value corresponding to a corresponding data dimension to obtain a fourth array, calculating Euclidean distance between data in the fourth array, and taking the Euclidean distance as an abnormal value;
s603: determining the maximum abnormal values of all the users to be detected;
s604: comparing whether the maximum abnormal value is larger than the last maximum abnormal value or not, and taking the maximum abnormal value with a larger value or the last maximum abnormal value as the current maximum abnormal value;
s605: and calculating the ratio of each abnormal value to the maximum abnormal value of this time to obtain the abnormal score of each user to be detected.
In the above embodiment, in steps S601 to S602, it is necessary to consider the abnormal behavior sequences of all the users to be detected at this time. And counting and comparing the numerical values of all the abnormal behavior sequences under the data dimension ui to obtain the maximum value max corresponding to the data dimension ui.
And (3) dividing the abnormal behavior sequence of a single user to be detected by the maximum value max under the corresponding data dimension ui to perform normalization processing, and forming a new u-dimensional array in the interval of [0,1 ]. And converting the u-dimensional array into a corresponding Euclidean distance and accumulating to obtain an abnormal value of the user.
The Euclidean distance is the distance between two points in two- or three-dimensional space: in two dimensions d = sqrt((x1 - x2)^2 + (y1 - y2)^2), and in three dimensions d = sqrt((x1 - x2)^2 + (y1 - y2)^2 + (z1 - z2)^2). Generalized to n-dimensional space, the formula is d = sqrt(Σ (xl1 - xl2)^2), where l = 1, 2, ..., n, xl1 denotes the l-th coordinate of the first point and xl2 denotes the l-th coordinate of the second point.
For steps S603 to S605, assuming there are 100 users to be detected, after their respective abnormal values are calculated, the maximum abnormal value, for example 50, is determined by comparison.
The formula for calculating the anomaly score is: score = abnormal value of the user / maximum abnormal value of all users. The maximum abnormal value of all users is the largest abnormal value from the start of calculation to the current time: each time, the maximum obtained this time is compared with the maximum obtained last time, and the larger of the two is used as the current maximum abnormal value.
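The scoring steps can be sketched as below. The names `abnormal_values` and `abnormal_scores` are hypothetical. The source does not say which two points the Euclidean distance is taken between, so the sketch assumes the distance of each normalized array from the origin, accumulated over the user's abnormal behavior sequence; a dimension whose maximum is 0 is treated as contributing 0.

```python
import math


def abnormal_values(sequences):
    """sequences: {user: list of u-dimensional third arrays}. Returns {user: value}."""
    u = len(next(iter(sequences.values()))[0])
    # Per-dimension maximum over all users and all sampling periods.
    maxima = [max(arr[k] for seq in sequences.values() for arr in seq) for k in range(u)]
    values = {}
    for user, seq in sequences.items():
        total = 0.0
        for arr in seq:
            scaled = [a / m if m else 0.0 for a, m in zip(arr, maxima)]
            # Assumption: distance of the normalized array from the origin.
            total += math.sqrt(sum(x * x for x in scaled))
        values[user] = total
    return values


def abnormal_scores(values, previous_max=0.0):
    """Divide each value by the running maximum; return scores and the new maximum."""
    current_max = max(previous_max, max(values.values()))
    scores = {user: (v / current_max if current_max else 0.0)
              for user, v in values.items()}
    return scores, current_max
```

The second return value of `abnormal_scores` is meant to be passed back as `previous_max` on the next run, which carries the maximum abnormal value forward.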
In addition, the validity period of the abnormal behavior sequence may be set to T, for example, T is 30 days, and the expired abnormal behavior sequence is cleared if the time exceeds 30 days, so as to reduce the occupancy of resources.
In the method provided by this embodiment, each calculation compares its maximum abnormal value with the one used in the previous calculation and keeps the larger value, so that the maximum abnormal value is carried forward and continuously updated.
Referring to fig. 7, a schematic diagram of main modules of an abnormal behavior detection apparatus 700 according to an embodiment of the present invention is shown, including:
the sampling module 701 is used for sampling historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate an array to be detected;
a processing module 702, configured to generate a sequence to be detected based on the array to be detected, input the sequence to be detected into a hidden Markov model together with an initial probability matrix, a background visualization probability matrix and a transition probability matrix, and process the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string;
the detection module 703 is configured to process the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculate an abnormal score of the abnormal behavior sequence, and compare the abnormal score with a predetermined score threshold to identify whether the behavior of the user to be detected is abnormal.
In the device for implementing the present invention, the processing module 702 is configured to: marking a non-0 data mark in an array to be detected as 1 to obtain a binary string, and converting the binary string into a decimal number; sequencing and combining the decimal numbers of the arrays to be detected according to the sequence of the sampling time periods from small to large to obtain a decimal sequence; and extracting non-0 decimal numbers with a first preset length in sequence from the tail part of the decimal sequence to obtain a subsequence, and adding 0 to the head part of the subsequence to obtain the sequence to be detected.
The apparatus for implementing the present invention further includes a background imaging probability matrix constructing module 704 (not shown in the figure) for: for an element in the background imaging probability matrix to be assigned, converting the row number in the subscript into a row binary string and converting the column number into a column binary string; judging whether the bitwise AND of the row binary string and the column binary string is equal to the row binary string, and if not, setting the element to 0; if so, acquiring the total order of the background imaging probability matrix to be assigned, subtracting 1 from the total order and converting the result into an order binary string, wherein the total order is 2 raised to the power of the total number of data dimensions; determining the bitwise AND of the row binary string and the order binary string, counting the number of 0s in the result, and judging whether the row number is equal to the column number; if they are equal, setting the element to a preset background probability threshold, otherwise determining the value of the element based on the preset background probability threshold and the counted number; and repeating the converting, judging and assigning processes to obtain the values of all elements in the background imaging probability matrix to be assigned, so as to construct the background imaging probability matrix.
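The assignment rules above can be sketched in Python as shown below. Several points are assumptions rather than statements from the text: row and column numbers are taken to start at 0, zeros are counted over a binary string padded to the number of data dimensions, and the names `build_matrix` and `off_diagonal_value` are hypothetical. This passage does not give the formula that combines the threshold with the zero count, so the sketch takes it as a caller-supplied function.

```python
def build_matrix(num_dimensions, threshold, off_diagonal_value):
    """Build the square matrix of order 2 ** num_dimensions.

    off_diagonal_value(threshold, zero_count) is a hypothetical callback
    standing in for the unspecified rule used when row != column.
    """
    order = 2 ** num_dimensions      # total order of the matrix
    mask = order - 1                 # total order minus 1 (all-ones binary string)
    matrix = [[0.0] * order for _ in range(order)]
    for row in range(order):
        # Number of 0s in the num_dimensions-bit AND of row number and mask.
        zero_count = num_dimensions - bin(row & mask).count("1")
        for col in range(order):
            if row & col != row:
                continue             # AND differs from the row string: stays 0
            if row == col:
                matrix[row][col] = threshold
            else:
                matrix[row][col] = off_diagonal_value(threshold, zero_count)
    return matrix


# Purely illustrative choice: spread the remaining probability evenly over
# the 2 ** zero_count - 1 admissible off-diagonal columns of the row.
example = build_matrix(2, 0.6, lambda p, zeros: (1 - p) / (2 ** zeros - 1))
```

With this illustrative callback every row except the last sums to 1; the last row contains only the threshold on the diagonal, because no other column passes the bitwise AND test.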
In the device for implementing the present invention, the detecting module 703 is configured to: removing the first character 0 in the maximum probability hidden state character string to obtain a first character string; converting characters in the first character string from decimal numbers to binary strings according to the dimension number of the plurality of data dimensions, so as to generate a first array corresponding to the characters based on the binary strings; determining first data which are not 0 in a first array, and acquiring a first data dimension and a sampling time period corresponding to the first data; inquiring an expected value corresponding to the sampling time period from an expected value table corresponding to the first data dimension, and replacing the first data with the expected value to obtain a second array; and determining the array to be detected corresponding to the sampling time period, subtracting the second array from the array to be detected, taking an absolute value to obtain a third array, and combining the third arrays according to the sequence of the sampling time periods from small to large to obtain an abnormal behavior sequence.
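A literal reading of these steps is sketched below. The function name `abnormal_behavior_sequence`, the layout of the inputs, and the bit order (leftmost bit for the first data dimension) are assumptions made for illustration; the caller is also assumed to know which sampling time period each hidden state corresponds to.

```python
def abnormal_behavior_sequence(hidden_states, periods, arrays_by_period, expected_tables):
    """
    hidden_states    : maximum probability hidden state string as a list of
                       decimal numbers, including the leading 0
    periods          : sampling period of each state after the leading 0,
                       in ascending order
    arrays_by_period : sampling period -> array to be detected
    expected_tables  : one table per data dimension,
                       each mapping sampling period -> expected value
    """
    num_dimensions = len(expected_tables)
    first_string = hidden_states[1:]          # remove the first character 0
    sequence = []
    for state, period in zip(first_string, periods):
        # Decimal number -> binary string padded to the dimension count.
        bits = format(state, "0{}b".format(num_dimensions))
        first_array = [int(bit) for bit in bits]
        # Replace each non-zero entry with the expected value of its dimension.
        second_array = [
            expected_tables[dim][period] if flag != 0 else 0
            for dim, flag in enumerate(first_array)
        ]
        observed = arrays_by_period[period]
        # Absolute difference between the observed and the second array.
        third_array = [abs(o - e) for o, e in zip(observed, second_array)]
        sequence.append(third_array)
    return sequence
```

For instance, with three data dimensions the hidden state 5 expands to `101`, so only the first and third dimensions are replaced by their expected values before the subtraction.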
The apparatus further comprises an expected value table constructing module 705 (not shown in the figure) for: sampling historical behavior data of a sample user based on a plurality of data dimensions in the preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate a sample array; dividing the preset time length into a plurality of time periods by taking the preset time interval as a unit, determining a statistical time interval under each statistical dimension by taking one time period as a center, and acquiring data corresponding to one data dimension and one statistical time interval from the sample array; calculating the expected value of the data dimension in the time period based on the data acquired under each statistical dimension, and repeating the data acquisition and calculation operation to obtain the expected value of the data dimension in each time period; and sequencing the expected values according to the sequence of the time periods from small to large to obtain an expected value table corresponding to the data dimension.
In the device for implementing the invention, the statistical dimensions comprise point statistics, transverse statistics, longitudinal statistics and global statistics; the expected value table constructing module 705 is configured such that: the statistical time interval under the point statistics is the same time period taking the week as a unit; the statistical time interval under the transverse statistics is the same time period taking the day as a unit; the statistical time interval under the longitudinal statistics is a time period which takes the week as a unit, is centered on the time period, and floats longitudinally by a preset number; and the statistical time interval under the global statistics is the preset time length.
In the device for implementing the present invention, the expected value table constructing module 705 is configured to: for each statistical dimension, calculating the average value of the acquired data, counting the total number of acquired data items and the number of items whose value is not zero, and calculating the ratio of the non-zero number to the total number; when the average value under the point statistics is zero and the average value of at least one of the transverse statistics, the longitudinal statistics and the global statistics is zero, setting the expected value of the data dimension in the time period to zero; otherwise, calculating the expected value of the data dimension in the time period based on the average value and the ratio under the point statistics, the average value and the ratio under the transverse statistics and the average value and the ratio under the global statistics.
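The per-dimension statistics and the zero rule can be sketched as below. The names `window_stats` and `expected_is_zero` are hypothetical, and the treatment of an empty window is an assumption. The formula that combines the averages and ratios in the non-zero case is not given in this passage, so the sketch deliberately stops before that step.

```python
def window_stats(values):
    """Return (average, non-zero ratio) for the data acquired under one
    statistical dimension."""
    total = len(values)
    if total == 0:
        return 0.0, 0.0  # assumption: an empty window contributes nothing
    non_zero = sum(1 for v in values if v != 0)
    return sum(values) / total, non_zero / total


def expected_is_zero(point, transverse, longitudinal, global_):
    """Each argument is an (average, ratio) pair from window_stats.

    The expected value is set to zero when the point average is zero and
    at least one of the other three averages is zero.
    """
    others = (transverse, longitudinal, global_)
    return point[0] == 0 and any(stat[0] == 0 for stat in others)
```

For the window `[0, 2, 4, 0]` the average is 1.5 and half of the items are non-zero, so `window_stats` returns `(1.5, 0.5)`.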
The apparatus further comprises a state transition probability matrix optimization module 706 (not shown in the figure), configured to: marking each non-zero datum in the sample array as 1 to obtain a binary string, converting the binary string into a decimal number, and sorting and combining the decimal numbers in ascending order of sampling time period to obtain a decimal sequence; determining a subsequence start bit in the decimal sequence by taking the first bit as 0 and the second bit as a start, extracting non-zero decimal numbers up to a second preset length in order backwards from the subsequence start bit, and combining the subsequence start bit and the extracted decimal numbers to obtain a subsequence; and inputting the initial probability matrix, the unit background imaging probability matrix, the initial state transition probability matrix and the subsequence into a hidden Markov model, and processing the state transition probability matrix to be optimized through a second algorithm in the model until the model converges, to obtain the optimized state transition probability matrix.
In the device for implementing the present invention, the detecting module 703 is configured to: acquiring the abnormal behavior sequences of all users to be detected, and comparing the values located in one data dimension across all the abnormal behavior sequences to obtain the maximum value corresponding to that data dimension; dividing the data in each abnormal behavior sequence by the maximum value of the corresponding data dimension to obtain a fourth array, calculating the Euclidean distance between the data in the fourth array, and taking the Euclidean distance as an abnormal value; determining the maximum abnormal value among all users to be detected in the current calculation, comparing it with the maximum abnormal value of the previous calculation, and taking the larger of the two as the maximum abnormal value of the current calculation; and calculating the ratio of each abnormal value to the maximum abnormal value of the current calculation to obtain the abnormal score of each user to be detected. In addition, the detailed implementation of the device in the embodiment of the present invention has been described in detail in the above method and is not repeated here.
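The scoring steps can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the name `abnormal_scores` is hypothetical, the Euclidean distance is taken to be the distance of the normalized data from the origin, and a data dimension whose maximum is zero for every user is skipped to avoid division by zero.

```python
import math


def abnormal_scores(sequences_by_user, previous_max=0.0):
    """
    sequences_by_user : user -> abnormal behavior sequence
                        (non-empty list of per-period arrays)
    previous_max      : maximum abnormal value kept from the previous calculation
    Returns (scores_by_user, current_max).
    """
    num_dimensions = len(next(iter(sequences_by_user.values()))[0])
    # Maximum value per data dimension across all users and periods.
    dim_max = [0.0] * num_dimensions
    for sequence in sequences_by_user.values():
        for array in sequence:
            for dim, value in enumerate(array):
                dim_max[dim] = max(dim_max[dim], value)

    # Normalize by the per-dimension maximum and take the Euclidean norm.
    values = {}
    for user, sequence in sequences_by_user.items():
        squared = 0.0
        for array in sequence:
            for dim, value in enumerate(array):
                if dim_max[dim] > 0:   # skip dimensions that are zero for everyone
                    squared += (value / dim_max[dim]) ** 2
        values[user] = math.sqrt(squared)

    # Keep the larger of the current and the previous maximum abnormal value.
    current_max = max(max(values.values()), previous_max)
    if current_max == 0:
        return {user: 0.0 for user in values}, current_max
    return {user: v / current_max for user, v in values.items()}, current_max
```

Passing the returned `current_max` as `previous_max` in the next call carries the maximum abnormal value forward between calculations.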
FIG. 8 illustrates an exemplary system architecture 800 to which embodiments of the invention may be applied.
As shown in fig. 8, the system architecture 800 may include terminal devices 801, 802, 803, a network 804, and a server 805 (by way of example only). The network 804 serves to provide a medium for communication links between the terminal devices 801, 802, 803 and the server 805. Network 804 may include various types of connections, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 801, 802, 803 to interact with a server 805 over a network 804 to receive or send messages or the like. Various communication client applications may be installed on the terminal devices 801, 802, 803.
The terminal devices 801, 802, 803 may be various electronic devices having display screens and supporting web browsing, and the server 805 may be a server providing various services.
It is to be noted that the method provided by the embodiment of the present invention is generally executed by the server 805, and accordingly, the apparatus is generally disposed in the server 805.
It should be understood that the number of terminal devices, networks, and servers in fig. 8 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 9, shown is a block diagram of a computer system 900 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 9 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 9, the computer system 900 includes a Central Processing Unit (CPU)901 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)902 or a program loaded from a storage section 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data necessary for the operation of the system 900 are also stored. The CPU 901, ROM 902, and RAM 903 are connected to each other via a bus 904. An input/output (I/O) interface 905 is also connected to bus 904.
The following components are connected to the I/O interface 905: an input section 906 including a keyboard, a mouse, and the like; an output section 907 including components such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 908 including a hard disk and the like; and a communication section 909 including a network interface card such as a LAN card, a modem, or the like. The communication section 909 performs communication processing via a network such as the internet. The drive 910 is also connected to the I/O interface 905 as necessary. A removable medium 911 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 910 as necessary, so that a computer program read out therefrom is installed into the storage section 908 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 909, and/or installed from the removable medium 911. The above-described functions defined in the system of the present invention are executed when the computer program is executed by a Central Processing Unit (CPU) 901.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprises a sampling module, a processing module and a detection module. Where the names of these modules do not in some cases constitute a limitation on the module itself, for example, a detection module may also be described as an "anomaly detection module".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform operations comprising:
sampling historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate an array to be detected;
generating a sequence to be detected based on the array to be detected, inputting the sequence to be detected into a hidden Markov model together with an initial probability matrix, a background imaging probability matrix and a transition probability matrix, and processing the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string;
processing the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculating an abnormal score of the abnormal behavior sequence, and comparing the abnormal score with a preset score threshold value to identify whether the behavior of the user to be detected is abnormal.
According to the technical scheme of the embodiment of the invention, expected value tables and transition probability matrices for the different data dimensions are constructed in advance from sample data, and a background imaging probability matrix is constructed according to the number of data dimensions and the subscripts of its elements. Historical behavior data of a user to be detected is sampled to generate a sequence to be detected, the sequence is processed through a Viterbi algorithm to obtain an abnormal behavior sequence, and the abnormal behavior sequence is evaluated and scored to different degrees so as to locate and detect abnormal behaviors of the user.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. An abnormal behavior detection method, comprising:
sampling historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate an array to be detected;
generating a sequence to be detected based on the array to be detected, inputting the sequence to be detected into a hidden Markov model together with an initial probability matrix, a background imaging probability matrix and a transition probability matrix, and processing the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string;
processing the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculating an abnormal score of the abnormal behavior sequence, and comparing the abnormal score with a preset score threshold value to identify whether the behavior of the user to be detected is abnormal.
2. The method according to claim 1, wherein the generating the sequence to be detected based on the array to be detected comprises:
marking each non-zero datum in an array to be detected as 1 to obtain a binary string, and converting the binary string into a decimal number;
sorting and combining the decimal numbers of the arrays to be detected in ascending order of sampling time period to obtain a decimal sequence;
and extracting, in order from the tail of the decimal sequence, non-zero decimal numbers up to a first preset length to obtain a subsequence, and adding 0 to the head of the subsequence to obtain the sequence to be detected.
3. The method according to claim 1, further comprising, before the generating the sequence to be detected based on the array to be detected and inputting the sequence to be detected into the hidden Markov model together with the initial probability matrix, the background imaging probability matrix and the transition probability matrix:
for an element in the background imaging probability matrix to be assigned, converting the row number in the subscript into a row binary string and converting the column number into a column binary string;
judging whether the bitwise AND of the row binary string and the column binary string is equal to the row binary string, and if not, setting the element to 0;
if so, acquiring the total order of the background imaging probability matrix to be assigned, subtracting 1 from the total order and converting the result into an order binary string; wherein the total order is 2 raised to the power of the total number of data dimensions;
determining the bitwise AND of the row binary string and the order binary string, counting the number of 0s in the result, and judging whether the row number is equal to the column number;
if they are equal, setting the element to a preset background probability threshold, otherwise determining the value of the element based on the preset background probability threshold and the counted number;
and repeating the converting, judging and assigning processes to obtain the values of all elements in the background imaging probability matrix to be assigned, so as to construct the background imaging probability matrix.
4. The method according to claim 1, wherein said processing said maximum probability hidden state string to obtain an abnormal behavior sequence comprises:
removing the first character 0 in the maximum probability hidden state character string to obtain a first character string;
converting characters in the first character string from decimal numbers to binary strings according to the dimension number of the plurality of data dimensions, so as to generate a first array corresponding to the characters based on the binary strings;
determining first data which are not 0 in a first array, and acquiring a first data dimension and a sampling time period corresponding to the first data;
inquiring an expected value corresponding to the sampling time period from an expected value table corresponding to the first data dimension, and replacing the first data with the expected value to obtain a second array;
and determining the array to be detected corresponding to the sampling time period, subtracting the second array from the array to be detected, taking an absolute value to obtain a third array, and combining the third arrays according to the sequence of the sampling time periods from small to large to obtain an abnormal behavior sequence.
5. The method of claim 4, further comprising, prior to said looking up an expected value corresponding to the sampling time period from an expected value table corresponding to the first data dimension:
sampling historical behavior data of a sample user based on a plurality of data dimensions in the preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate a sample array;
dividing the preset time length into a plurality of time periods by taking the preset time interval as a unit, determining a statistical time interval under each statistical dimension by taking one time period as a center, and acquiring data corresponding to one data dimension and one statistical time interval from the sample array;
calculating the expected value of the data dimension in the time period based on the data acquired under each statistical dimension, and repeating the data acquisition and calculation operation to obtain the expected value of the data dimension in each time period;
and sequencing the expected values according to the sequence of the time periods from small to large to obtain an expected value table corresponding to the data dimension.
6. The method of claim 5, wherein said statistical dimensions include point statistics, transverse statistics, longitudinal statistics, and global statistics;
the determining the statistical time interval under each statistical dimension by taking a time period as a center comprises the following steps:
the statistical time interval under the point statistics is the same time period taking the week as a unit;
the statistical time interval under the transverse statistics is the same time period taking day as a unit;
the statistical time interval under the longitudinal statistics is a time period which takes a week as a unit and takes the time period as a center and longitudinally floats for a preset number;
and the statistical time interval under the global statistics is the preset time length.
7. The method of claim 6, wherein calculating the expected value of the data dimension for the time period based on the data acquired for each statistical dimension comprises:
for each statistical dimension, calculating the average value of the acquired data, counting the total number of acquired data items and the number of items whose value is not zero, and calculating the ratio of the non-zero number to the total number;
when the average value under the point statistics is zero and the average value of at least one of the transverse statistics, the longitudinal statistics and the global statistics is zero, setting the expected value of the data dimension in the time period to be zero;
otherwise, calculating the expected value of the data dimension in the time period based on the average value and the ratio under the point statistics, the average value and the ratio under the transverse statistics and the average value and the ratio under the global statistics.
8. The method of claim 5, further comprising, after said combining the resulting data to generate an array of samples:
marking each non-zero datum in the sample array as 1 to obtain a binary string, converting the binary string into a decimal number, and sorting and combining the decimal numbers in ascending order of sampling time period to obtain a decimal sequence;
determining a subsequence start bit in the decimal sequence by taking the first bit as 0 and the second bit as a start, extracting non-0 decimal numbers with a second preset length from the subsequence start bit backwards in sequence, and combining the subsequence start bit and the extracted decimal numbers to obtain a subsequence;
and inputting the initial probability matrix, the unit background display probability matrix, the initial state transition probability matrix and the subsequence into a hidden Markov model, and processing the state transition probability matrix to be optimized through a second algorithm in the model until the model is converged to obtain the optimized state transition probability matrix.
9. The method of claim 1, wherein the calculating the abnormal score for the sequence of abnormal behaviors comprises:
acquiring abnormal behavior sequences of all users to be detected, and comparing numerical values positioned in a data dimension in all the abnormal behavior sequences to obtain a maximum value corresponding to the data dimension;
dividing data in the abnormal behavior sequence by a maximum value corresponding to a corresponding data dimension to obtain a fourth array, calculating Euclidean distance between data in the fourth array, and taking the Euclidean distance as an abnormal value;
determining the maximum abnormal value among all users to be detected in the current calculation, comparing it with the maximum abnormal value of the previous calculation, and taking the larger of the two as the maximum abnormal value of the current calculation;
and calculating the ratio of each abnormal value to the maximum abnormal value of this time to obtain the abnormal score of each user to be detected.
10. An abnormal behavior detection apparatus, comprising:
the sampling module is used for sampling historical behavior data of a user to be detected based on a plurality of data dimensions within a preset time interval, and combining the historical behavior data of the plurality of data dimensions to generate an array to be detected;
the processing module is used for generating a sequence to be detected based on the array to be detected, combining the initial probability matrix, the background imaging probability matrix and the transition probability matrix, inputting the sequence to be detected into the hidden Markov model together, and processing the sequence to be detected through a first algorithm in the model to obtain a maximum probability hidden state character string;
and the detection module is used for processing the maximum probability hidden state character string to obtain an abnormal behavior sequence, calculating an abnormal score of the abnormal behavior sequence, and comparing the abnormal score with a preset score threshold value to identify whether the behavior of the user to be detected is abnormal.
11. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9.
12. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-9.
CN202010669473.6A 2020-07-13 2020-07-13 Abnormal behavior detection method and device Active CN111913859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010669473.6A CN111913859B (en) 2020-07-13 2020-07-13 Abnormal behavior detection method and device

Publications (2)

Publication Number Publication Date
CN111913859A true CN111913859A (en) 2020-11-10
CN111913859B CN111913859B (en) 2023-11-14

Family

ID=73228024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010669473.6A Active CN111913859B (en) 2020-07-13 2020-07-13 Abnormal behavior detection method and device

Country Status (1)

Country Link
CN (1) CN111913859B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3296943A1 (en) * 2015-05-13 2018-03-21 Alibaba Group Holding Limited Method of processing exchanged data and device utilizing same
CN110210508A (en) * 2018-12-06 2019-09-06 北京奇艺世纪科技有限公司 Model generating method, anomalous traffic detection method, device, electronic equipment, computer readable storage medium
CN109818942A (en) * 2019-01-07 2019-05-28 微梦创科网络科技(中国)有限公司 User account anomaly detection method and device based on temporal features
CN110969556A (en) * 2019-09-30 2020-04-07 上海仪电(集团)有限公司中央研究院 Method and device for detecting river water quality abnormity by machine learning multi-dimension multi-model fusion

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112434245A (en) * 2020-11-23 2021-03-02 北京八分量信息科技有限公司 Method and device for judging abnormal behavior event based on UEBA (user and entity behavior analytics), and related product
CN112465073A (en) * 2020-12-23 2021-03-09 上海观安信息技术股份有限公司 Numerical value distribution anomaly detection method and system based on distance
CN112465073B (en) * 2020-12-23 2023-08-08 上海观安信息技术股份有限公司 Numerical distribution abnormity detection method and detection system based on distance
CN114500075A (en) * 2022-02-11 2022-05-13 中国电信股份有限公司 User abnormal behavior detection method and device, electronic equipment and storage medium
CN114500075B (en) * 2022-02-11 2023-11-07 中国电信股份有限公司 User abnormal behavior detection method and device, electronic equipment and storage medium
CN115314395A (en) * 2022-07-11 2022-11-08 中电信数智科技有限公司 Method for optimizing NR signal coverage drive test
CN115314395B (en) * 2022-07-11 2024-01-19 中电信数智科技有限公司 NR signal coverage drive test optimization method
CN114996113A (en) * 2022-07-28 2022-09-02 成都乐超人科技有限公司 Real-time monitoring and early warning method and device for abnormal operation of large-data online user
CN116502055A (en) * 2023-01-10 2023-07-28 昆明理工大学 Multi-dimensional characteristic dynamic abnormal integral model based on quasi-Markov model
CN116502055B (en) * 2023-01-10 2024-05-03 昆明理工大学 Multi-dimensional characteristic dynamic abnormal integral model based on quasi-Markov model
CN117272178A (en) * 2023-11-17 2023-12-22 北京易动宇航科技有限公司 Fault diagnosis method of electric propulsion system
CN117272178B (en) * 2023-11-17 2024-02-06 北京易动宇航科技有限公司 Fault diagnosis method of electric propulsion system

Also Published As

Publication number Publication date
CN111913859B (en) 2023-11-14

Similar Documents

Publication Publication Date Title
CN111913859B (en) Abnormal behavior detection method and device
CN108427725B (en) Data processing method, device and system
Hayashi et al. Fully dynamic betweenness centrality maintenance on massive networks
WO2019144728A1 (en) Data processing
CN114422267B (en) Flow detection method, device, equipment and medium
EP2585948A1 (en) High-dimensional stratified sampling
CN109450671B (en) Log multi-combination alarm classification method and system
US20060101039A1 (en) Method and apparatus to scale and unroll an incremental hash function
CN110807508B (en) Bus peak load prediction method considering complex weather influence
WO2015165230A1 (en) Social contact message monitoring method and device
CN114301758A (en) Alarm processing method, system, equipment and storage medium
CN114401145A (en) Network flow detection system and method
WO2018212929A1 (en) System and method for enabling related searches for live events in data streams
WO2019184325A1 (en) Community division quality evaluation method and system based on average mutual information
CN114461792A (en) Alarm event correlation method, device, electronic equipment, medium and program product
CN111459780B (en) User identification method and device, readable medium and electronic equipment
CN116822803B (en) Carbon emission data graph construction method, device and equipment based on intelligent algorithm
CN115408189A (en) Artificial intelligence and big data combined anomaly detection method and service system
CN115426161A (en) Abnormal device identification method, apparatus, device, medium, and program product
Lavrova et al. Detection of cyber threats to network infrastructure of digital production based on the methods of Big Data and multifractal analysis of traffic
CN114547491A (en) Time sequence map construction method, device, equipment and medium
CN113779335A (en) Information generation method and device, electronic equipment and computer readable medium
CN114004989A (en) Power safety early warning data clustering processing method based on improved K-means algorithm
CN112579673A (en) Multi-source data processing method and device
CN111814436A (en) User behavior sequence detection method and system based on mutual information and entropy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210319

Address after: 100176 8660, 6 / F, building 3, No.3, Yongchang North Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Applicant after: BEIJING SKYGUARD NETWORK SECURITY TECHNOLOGY Co.,Ltd.

Applicant after: Chengdu sky guard Network Security Technology Co.,Ltd.

Address before: 100176 8660, 6 / F, building 3, No.3, Yongchang North Road, Beijing Economic and Technological Development Zone, Beijing

Applicant before: BEIJING SKYGUARD NETWORK SECURITY TECHNOLOGY Co.,Ltd.

GR01 Patent grant