CN110222243B - Method, device and storage medium for determining abnormal behavior - Google Patents

Method, device and storage medium for determining abnormal behavior

Info

Publication number
CN110222243B
Authority
CN
China
Prior art keywords
target
behavior sequence
user
operation information
behavior
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910447366.6A
Other languages
Chinese (zh)
Other versions
CN110222243A (en)
Inventor
李加佳
司马云瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201910447366.6A
Publication of CN110222243A
Application granted
Publication of CN110222243B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a method, apparatus, and storage medium for determining abnormal behavior. The method comprises: determining an abnormal second behavior sequence in a first behavior sequence according to the first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm; acquiring the target number of times that target operation information occurs in the second behavior sequence within a preset time period; determining, according to the target operation information and the target number of times, a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence; and determining that the second behavior sequence is an abnormal behavior if the ratio of the first user number to the second user number is greater than or equal to a preset ratio threshold. By combining the order in which user behaviors occur with the number of times they occur within the preset time period, the disclosure can improve the accuracy of detecting abnormal users and effectively avoid misjudgment of normal users.

Description

Method, device and storage medium for determining abnormal behavior
Technical Field
The present disclosure relates to the field of network security, and in particular, to a method, an apparatus, and a storage medium for determining abnormal behavior.
Background
In the related art, with the continuous development and wide application of internet technology, people can conveniently and flexibly acquire and publish all kinds of information over a network. While the internet is convenient, its openness and decentralization also make it easy to attack, which inconveniences users and can even cause economic loss, so network security receives more and more attention. For the various attacks on the internet, abnormal behavior pattern mining can be used to analyze the abnormal behavior patterns of risky users so that those users can be identified, which improves network security and avoids losses to users. The security of the network therefore depends on how accurately abnormal behavior patterns are identified.
Disclosure of Invention
To overcome the problems in the related art, the present disclosure provides a method, apparatus, and storage medium for determining abnormal behavior.
According to a first aspect of embodiments of the present disclosure, there is provided a method of determining abnormal behavior, the method comprising:
determining an abnormal second behavior sequence in the first behavior sequence according to a first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises: operation information and operation time information corresponding to the operation information;
acquiring the target number of times that target operation information occurs in the second behavior sequence within a preset time period, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule;
determining a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence according to the target operation information and the target times;
and if the ratio of the first user number to the second user number is greater than or equal to a preset proportional threshold, determining that the second behavior sequence is an abnormal behavior.
Optionally, the determining, according to the target operation information and the target times, a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence includes:
determining at least one target sample user matched with the second behavior sequence in the sample user set according to the target operation information and the target times;
determining the first number of users of abnormal sample users and the second number of users of normal sample users included in the at least one target sample user.
Optionally, the determining, according to the target operation information and the target times, at least one target sample user in the sample user set that matches the second behavior sequence includes:
if the second behavior sequence is a subsequence of a third behavior sequence corresponding to any sample user in the sample user set, and the number of times of occurrence of the target operation information and the target number of times in the third behavior sequence in the preset time period meet a first preset condition, determining that the any sample user is the target sample user;
the first preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a first threshold.
Optionally, the method further comprises:
after a target behavior sequence corresponding to a target user is obtained, if the second behavior sequence is a subsequence of the target behavior sequence, and the number of times the target operation information occurs in the target behavior sequence within the preset time period and the target number of times meet a second preset condition, determining that the target user is an abnormal user;
the second preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a second threshold.
According to a second aspect of embodiments of the present disclosure, there is provided an apparatus for determining abnormal behavior, the apparatus comprising:
a first determining module configured to determine an abnormal second behavior sequence in a first behavior sequence according to the first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises: operation information and operation time information corresponding to the operation information;
an acquisition module configured to acquire the target number of times that target operation information occurs in the second behavior sequence within a preset time period, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule;
a second determining module configured to determine, according to the target operation information and the target times, a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence;
the judging module is configured to determine that the second behavior sequence is an abnormal behavior if the ratio of the first user number to the second user number is greater than or equal to a preset proportional threshold.
Optionally, the second determining module includes:
a first determining submodule configured to determine, according to the target operation information and the target times, at least one target sample user in a sample user set that matches the second behavior sequence;
a second determination submodule configured to determine the first user number of abnormal sample users and the second user number of normal sample users included in the at least one target sample user.
Optionally, the first determining sub-module is configured to determine that any sample user is the target sample user if the second behavior sequence is a sub-sequence of a third behavior sequence corresponding to any sample user in the sample user set, and in the third behavior sequence, the number of times that the target operation information appears and the target number of times meet a first preset condition within the preset time period;
the first preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a first threshold.
Optionally, the apparatus further comprises:
a third determining module, configured to, after a target behavior sequence corresponding to a target user is obtained, determine that the target user is an abnormal user if the second behavior sequence is a subsequence of the target behavior sequence and the number of times that the target operation information appears in the target behavior sequence within the preset time period and the target number of times meet a second preset condition;
the second preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a second threshold.
According to a third aspect of embodiments of the present disclosure, there is provided an apparatus for determining abnormal behavior, the apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
determining an abnormal second behavior sequence in the first behavior sequence according to a first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises: operation information and operation time information corresponding to the operation information;
acquiring the target number of times that target operation information occurs in the second behavior sequence within a preset time period, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule;
determining a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence according to the target operation information and the target times;
and if the ratio of the first user number to the second user number is greater than or equal to a preset proportional threshold, determining that the second behavior sequence is an abnormal behavior.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of determining abnormal behavior provided by the first aspect of the present disclosure.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:
First, an abnormal second behavior sequence is determined in a first behavior sequence according to the first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises operation information and operation time information corresponding to the operation information. Next, the target number of times that target operation information occurs in the second behavior sequence within a preset time period is acquired, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule. Then, according to the target operation information and the target number of times, the first user number of abnormal sample users matched with the second behavior sequence and the second user number of normal sample users matched with the second behavior sequence are determined, and if the ratio of the first user number to the second user number is greater than or equal to a preset ratio threshold, the second behavior sequence is determined to be an abnormal behavior. By combining the order in which user behaviors occur with the number of times they occur within the preset time period, the disclosure can improve the accuracy of detecting abnormal users and effectively avoid misjudgment of normal users.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart illustrating a method of determining abnormal behavior in accordance with an exemplary embodiment.
FIG. 2 is a flow chart illustrating step 103 of the embodiment shown in FIG. 1.
FIG. 3 is a flow chart illustrating another method of determining abnormal behavior in accordance with an exemplary embodiment.
FIG. 4 is a block diagram illustrating an apparatus to determine abnormal behavior in accordance with an exemplary embodiment.
FIG. 5 is a block diagram of a second determination module shown in the embodiment of FIG. 4.
FIG. 6 is a block diagram illustrating another apparatus for determining abnormal behavior in accordance with an exemplary embodiment.
FIG. 7 is a block diagram illustrating an apparatus to determine abnormal behavior in accordance with an exemplary embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Before introducing the method, apparatus, and storage medium for determining abnormal behavior provided by the present disclosure, an application scenario involved in various embodiments of the present disclosure is first introduced. The application scenario may include a server that provides data services for various service platforms or application programs (APPs); users can access the server through those service platforms or application programs. The server may be a local server or a cloud server.
FIG. 1 is a flow chart illustrating a method of determining abnormal behavior according to an exemplary embodiment. As shown in FIG. 1, the method includes the following steps:
in step 101, according to a first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, determining an abnormal second behavior sequence in the first behavior sequence, where the first behavior sequence includes: the operation information and operation time information corresponding to the operation information.
For example, the abnormal sample users pre-stored on the server and the first behavior sequence corresponding to each abnormal sample user may be obtained from a database on the server. Each abnormal sample user may include account data (for example, a user name or a user identification code), and the first behavior sequence may include operation information (the operations performed by the user, such as login, logout, or password change) and the operation time information corresponding to that operation information. For example, if abnormal sample user 1 performs operations α, β, and γ at times t1, t2, and t3 in chronological order, the first behavior sequence corresponding to abnormal sample user 1 is [α(t1), β(t2), γ(t3)]. An abnormal second behavior sequence is then determined in the first behavior sequence according to the preset frequent path mining algorithm. The frequent path mining algorithm can find subsequences shared by the first behavior sequences of multiple abnormal sample users; these are the second behavior sequences to be determined.
In step 102, the target number of times that target operation information occurs in the second behavior sequence within a preset time period is acquired, where the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule.
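The behavior sequence described above can be pictured as a chronologically ordered list of (operation, time) records. The following is only a minimal sketch under stated assumptions: the `Operation` class, the `build_behavior_sequence` helper, and the timestamps are hypothetical names and values chosen for illustration, not part of the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Operation:
    name: str       # operation information, e.g. "login"
    time: datetime  # operation time information for that operation


def build_behavior_sequence(events):
    """Sort raw (name, time) records chronologically into a behavior sequence."""
    return sorted((Operation(n, t) for n, t in events), key=lambda op: op.time)


# Hypothetical records for one abnormal sample user, deliberately out of order.
events = [
    ("beta", datetime(2019, 5, 27, 10, 5)),
    ("alpha", datetime(2019, 5, 27, 10, 0)),
    ("gamma", datetime(2019, 5, 27, 10, 9)),
]
sequence = build_behavior_sequence(events)
names = [op.name for op in sequence]
print(names)
```

Keeping the time alongside each operation is what later allows occurrences to be counted within a preset time period.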
For example, after the second behavior sequence is determined, the target number of times that the target operation information occurs in the second behavior sequence within the preset time period is acquired. The target operation information may be any one kind of the operation information included in the second behavior sequence. It may be chosen empirically, or selected by the server from the second behavior sequence according to a preset rule, for example the operation information that occurs most often in the second behavior sequence. For example, if the target operation information is login failure and the preset time period is the most recent 12 hours, the target number of times is the number of login failures that occur in the second behavior sequence within the most recent 12 hours.
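Step 102 can be sketched as two small functions: one applies the "most frequent operation" rule mentioned above, the other counts occurrences inside the preset time period. This is an illustrative sketch only; the function names, the reference time `now`, and the sample data are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def pick_target_operation(sequence):
    """One possible preset rule: the operation that occurs most often."""
    counts = Counter(name for name, _ in sequence)
    return counts.most_common(1)[0][0]


def count_in_window(sequence, target, now, window):
    """Count occurrences of `target` whose time lies in [now - window, now]."""
    start = now - window
    return sum(1 for name, t in sequence if name == target and start <= t <= now)


# Hypothetical second behavior sequence as (operation, time) pairs.
now = datetime(2019, 5, 27, 12, 0)
second_sequence = [
    ("login_failure", now - timedelta(hours=20)),
    ("login_failure", now - timedelta(hours=10)),
    ("login_failure", now - timedelta(hours=3)),
    ("password_change", now - timedelta(hours=1)),
]
target = pick_target_operation(second_sequence)
target_times = count_in_window(second_sequence, target, now, timedelta(hours=12))
print(target, target_times)
```

In this sample, the login failure from 20 hours ago falls outside the 12-hour period and is not counted.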
In step 103, a first user number of abnormal sample users matching the second behavior sequence and a second user number of normal sample users matching the second behavior sequence are determined according to the target operation information and the target times.
In step 104, if the ratio of the first user number to the second user number is greater than or equal to the preset proportional threshold, it is determined that the second behavior sequence is an abnormal behavior.
For example, after the target number of times is obtained, the first user number of abnormal sample users matched with the second behavior sequence and the second user number of normal sample users matched with the second behavior sequence may be determined according to the target operation information and the target number of times. Whether a given sample user (abnormal or normal) matches the second behavior sequence may be determined as follows: determine whether the second behavior sequence and the behavior sequence corresponding to the sample user meet a preset condition, and if so, determine that the sample user matches the second behavior sequence.
Whether the second behavior sequence is an abnormal behavior is then determined from the first user number and the second user number by comparing their ratio with a preset ratio threshold. If the ratio of the first user number to the second user number is greater than or equal to the preset ratio threshold, the second behavior sequence is determined to be an abnormal behavior; if the ratio is less than the preset ratio threshold, the second behavior sequence is determined not to be an abnormal behavior. For example, if the preset ratio threshold is 4, the first user number of abnormal sample users matched with the second behavior sequence is 100, and the second user number of normal sample users matched with the second behavior sequence is 20, then the ratio is 100/20 = 5, which is greater than 4, and the second behavior sequence is determined to be an abnormal behavior. It can be understood that when relatively many normal sample users match the second behavior sequence, the operation information in the second behavior sequence is not representative, because many normal users also perform those operations; the second behavior sequence is then not an abnormal behavior, which avoids misjudging normal users. When relatively many abnormal sample users match the second behavior sequence, the operation information in the second behavior sequence is representative, and the second behavior sequence is an abnormal behavior, which improves the accuracy of detection.
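The ratio test in step 104 reduces to a single comparison. The sketch below reproduces the 100-to-20 example from the text; the function name is hypothetical, and the handling of a zero second user number is an assumption, since the text does not address that case.

```python
def is_abnormal_sequence(first_user_number, second_user_number, ratio_threshold):
    """Return True if abnormal-to-normal matched users meet the ratio threshold."""
    if second_user_number == 0:
        # Assumption (not stated in the text): with no matching normal users,
        # treat the sequence as abnormal whenever any abnormal user matches.
        return first_user_number > 0
    return first_user_number / second_user_number >= ratio_threshold


print(is_abnormal_sequence(100, 20, 4))  # ratio 5 against threshold 4
print(is_abnormal_sequence(100, 50, 4))  # ratio 2 against threshold 4
```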
It should be noted that, in the prior art, if the second behavior sequence is determined only by the number of times the target operation information occurs, second behavior sequences in which the target operation information occurs the same number of times but the operation information is executed in a different order cannot be distinguished. In this scheme, the frequent path mining algorithm identifies the second behavior sequence according to the execution order of the operation information in the behavior sequence, and this is combined with the number of times the target operation information occurs in the second behavior sequence, which improves detection accuracy. For example, suppose the frequent path mining algorithm yields two second behavior sequences, (I1, I2, I2, I2) and (I2, I2, I2, I1), and the target operation information is I2. I2 occurs the same number of times in both sequences, but the operations are executed in a different order, so the two second behavior sequences are not the same.
In summary, an abnormal second behavior sequence is first determined in a first behavior sequence according to the first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises operation information and operation time information corresponding to the operation information. Next, the target number of times that target operation information occurs in the second behavior sequence within a preset time period is acquired, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule. Then, according to the target operation information and the target number of times, the first user number of abnormal sample users matched with the second behavior sequence and the second user number of normal sample users matched with the second behavior sequence are determined, and if the ratio of the first user number to the second user number is greater than or equal to a preset ratio threshold, the second behavior sequence is determined to be an abnormal behavior. By combining the order in which user behaviors occur with the number of times they occur within the preset time period, the disclosure can improve the accuracy of detecting abnormal users and effectively avoid misjudgment of normal users.
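The difference between count-only comparison and order-aware comparison can be checked directly on the two example sequences. This is a small illustrative sketch; the variable names are hypothetical.

```python
from collections import Counter

seq_a = ("I1", "I2", "I2", "I2")
seq_b = ("I2", "I2", "I2", "I1")

# A count-only view sees identical operation counts in both sequences.
same_counts = Counter(seq_a) == Counter(seq_b)
# An order-aware view (tuple equality) tells the two sequences apart.
same_order = seq_a == seq_b

print(same_counts, same_order)
```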
It should be noted that, the implementation manner of the frequent path mining algorithm in step 101 may include:
for example, the number of the abnormal sample users is multiple, and there are multiple first behavior sequences correspondingly. And taking the plurality of first behavior sequences and a preset support threshold value as the input of the frequent path mining algorithm, and acquiring at least one second behavior sequence output by the frequent path mining algorithm. Wherein the second behavior sequence appears in the first behavior sequence with a frequency greater than or equal to the support threshold.
The frequent path mining algorithm iterates layer by layer over the plurality of first behavior sequences according to the preset support threshold to obtain the frequent item sets that meet the threshold, and then selects, from the obtained frequent item sets, those that meet a preset item number (that is, the number of operation information items in the frequent item set is greater than or equal to the preset item number) as the at least one second behavior sequence. The frequent path mining algorithm may be, for example, the Apriori algorithm (an association rule mining algorithm), the GSP (Generalized Sequential Pattern) algorithm, or the FreeSpan algorithm.
The support threshold may be preset, or may be adjusted flexibly according to specific requirements. When the support threshold is too low, the frequent path mining algorithm outputs more second behavior sequences, which easily causes misjudgment; when the support threshold is too high, it outputs fewer second behavior sequences, which easily causes missed detection. The support threshold can therefore be set empirically at first and then adjusted according to the number of second behavior sequences the algorithm outputs. Take as an example 4 first behavior sequences (I1, I2, I3), (I1, I2), (I1), and (I2, I3), where I1, I2, and I3 correspond to three different kinds of operation information, with a preset support threshold of 2 and a preset item number of 2 (that is, a second behavior sequence contains at least 2 items of operation information). With the 4 first behavior sequences and the preset support threshold as the input of the frequent path mining algorithm, the frequent item sets obtained are (I1), (I2), (I3), (I1, I2), and (I2, I3). Selecting from these the frequent item sets that meet the preset item number of 2, the second behavior sequences output by the frequent path mining algorithm are (I1, I2) and (I2, I3).
FIG. 2 is a flow chart illustrating step 103 of the embodiment shown in FIG. 1. As shown in FIG. 2, step 103 includes the following steps:
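The worked example above can be reproduced with a small level-wise miner. This is a simplified sketch in the spirit of Apriori/GSP, not the algorithm of the disclosure: it assumes a pattern is supported by a sequence when it appears as an ordered (not necessarily contiguous) subsequence, and all function names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, sequences):
    """Number of sequences that contain `pattern` as a subsequence."""
    return sum(1 for s in sequences if is_subsequence(pattern, s))


def mine_frequent_paths(sequences, support_threshold, min_items):
    """Level-wise search: grow frequent patterns by one item per layer."""
    if support_threshold < 1:
        raise ValueError("support_threshold must be at least 1")
    items = sorted({x for s in sequences for x in s})
    level = [(x,) for x in items if support((x,), sequences) >= support_threshold]
    frequent = []
    while level:
        frequent.extend(level)
        # Every frequent pattern of length k+1 has a frequent prefix of length k,
        # so extending only the current layer cannot miss a frequent pattern.
        candidates = {p + (x,) for p in level for x in items}
        level = sorted(c for c in candidates
                       if support(c, sequences) >= support_threshold)
    return [p for p in frequent if len(p) >= min_items]


first_sequences = [("I1", "I2", "I3"), ("I1", "I2"), ("I1",), ("I2", "I3")]
second_sequences = mine_frequent_paths(first_sequences, support_threshold=2, min_items=2)
print(second_sequences)
```

Under these assumptions the miner returns (I1, I2) and (I2, I3), matching the example: (I1, I3) and (I1, I2, I3) are each supported by only one of the four sequences.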
in step 1031, at least one target sample user in the sample user set matched with the second behavior sequence is determined according to the target operation information and the target times.
In step 1032, a first user number of abnormal sample users and a second user number of normal sample users included in the at least one target sample user are determined.
Specifically, a sample user set (including a plurality of sample users) and the behavior sequence corresponding to each sample user in the set may be stored in advance in a database on the server, and the sample users may be classified into normal sample users and abnormal sample users. After the target number of times is obtained, at least one target sample user in the sample user set that matches the second behavior sequence can be determined according to the target operation information and the target number of times. Whether a sample user matches the second behavior sequence may be determined as follows: determine whether the behavior sequence corresponding to the sample user and the second behavior sequence meet a preset condition, and if so, determine that the sample user matches the second behavior sequence, that is, the sample user is a target sample user. The first user number and the second user number are then determined from the numbers of abnormal sample users and normal sample users included in the at least one target sample user.
Optionally, step 1031 may be implemented by:
and if the second behavior sequence is a subsequence of a third behavior sequence corresponding to any sample user in the sample user set, and the number of times of occurrence of the target operation information and the target number of times in the third behavior sequence in a preset time period meet a first preset condition, determining that any sample user is a target sample user.
The first preset condition is as follows: the number of times of occurrence of the target operation information in the preset time period is greater than or equal to the target number of times, or the ratio of the number of times of occurrence of the target operation information to the target number of times in the preset time period is greater than or equal to a first threshold.
For example, if the second behavior sequence is a subsequence of a third behavior sequence corresponding to any sample user in the sample user set (that is, the plurality of operation information included in the third behavior sequence includes not only each operation information in the second behavior sequence, but also possibly other operation information), and in the third behavior sequence, when the number of times of occurrence of the target operation information and the target number of times in the preset time period satisfy the first preset condition, it is determined that any sample user is the target sample user.
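Steps 1031 and 1032 can be sketched as a matching predicate plus a tally over labelled sample users. The sketch uses only the first variant of the first preset condition (count greater than or equal to the target number of times) and assumes an ordered, gap-tolerant subsequence test; the function names, labels, and sample data are hypothetical.

```python
from datetime import datetime, timedelta


def is_subsequence(pattern, names):
    """True if `pattern` occurs in `names` in order (gaps allowed)."""
    it = iter(names)
    return all(name in it for name in pattern)


def matches(pattern, target_op, target_times, user_sequence, now, window):
    """Subsequence test plus the count-based first preset condition."""
    names = [name for name, _ in user_sequence]
    if not is_subsequence(pattern, names):
        return False
    start = now - window
    count = sum(1 for name, t in user_sequence
                if name == target_op and start <= t <= now)
    return count >= target_times


def count_matching_users(pattern, target_op, target_times, samples, now, window):
    """Return (first user number, second user number) among matching users."""
    first = second = 0
    for label, seq in samples:
        if matches(pattern, target_op, target_times, seq, now, window):
            if label == "abnormal":
                first += 1
            else:
                second += 1
    return first, second


now = datetime(2019, 5, 27, 12, 0)
at = lambda hours: now - timedelta(hours=hours)  # helper for sample timestamps
pattern = ("login_failure", "login_failure", "password_change")
samples = [
    ("abnormal", [("login_failure", at(5)), ("login_failure", at(4)), ("password_change", at(3))]),
    ("abnormal", [("login_failure", at(30)), ("login_failure", at(2)), ("password_change", at(1))]),
    ("normal", [("login", at(6)), ("logout", at(5))]),
    ("normal", [("login_failure", at(6)), ("login", at(5)), ("login_failure", at(4)), ("password_change", at(2))]),
]
result = count_matching_users(pattern, "login_failure", 2, samples, now, timedelta(hours=12))
print(result)
```

The second abnormal user contains the pattern but has only one login failure inside the 12-hour period, so that user is not counted as a target sample user.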
FIG. 3 is a flow chart illustrating another method of determining abnormal behavior in accordance with an exemplary embodiment. As shown in fig. 3, the method further comprises:
in step 105, after the target behavior sequence corresponding to the target user is obtained, if the second behavior sequence is a subsequence of the target behavior sequence and, in the target behavior sequence, the number of times the target operation information occurs within the preset time period and the target number of times satisfy a second preset condition, it is determined that the target user is an abnormal user.
The second preset condition is as follows: the number of times of occurrence of the target operation information in the preset time period is greater than or equal to the target number of times, or the ratio of the number of times of occurrence of the target operation information to the target number of times in the preset time period is greater than or equal to a second threshold.
For example, when the second behavior sequence is determined to be abnormal behavior, the second behavior sequence, the target operation information and the target number of times may be stored in a database of the server as the basis for detecting abnormal users. When the target user accesses the database, the operation information of the target user and the operation time information corresponding to that operation information are recorded to obtain the target behavior sequence corresponding to the target user. After the target behavior sequence is obtained, it is first judged whether the second behavior sequence is a subsequence of the target behavior sequence. If it is, it is further judged whether, in the target behavior sequence, the number of times the target operation information occurs within the preset time period and the target number of times satisfy the second preset condition. If both conditions hold, the target user is determined to be an abnormal user.
For example, the second behavior sequence is (I1, I2, I1, I1, I3, I1, I1, I1), the target operation information is I1, and the number of times I1 occurs within 3 hours (that is, the target number of times) is 4. The target behavior sequence corresponding to the target user is (I1, I2, I1, I1, I3, I1, I1, I1, I1, I1), in which I1 occurs 6 times within 3 hours. The second preset condition is that the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times. The second behavior sequence is a subsequence of the target behavior sequence, and in the target behavior sequence the number of times I1 occurs within the preset time period of 3 hours (6) is greater than the target number of times (4), so the target user is an abnormal user.
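The worked example above can be reproduced with a short Python sketch. The timestamps are invented for illustration, since the source gives only the resulting counts. The preset time period is assumed here to be the 3 hours ending at the most recent event, which is one possible reading of "within 3 hours". Helper names are hypothetical.

```python
from datetime import datetime, timedelta


def is_subsequence(pattern, sequence):
    """Order-preserving subsequence test (gaps allowed; an assumption)."""
    it = iter(sequence)
    return all(op in it for op in pattern)


def count_in_window(events, target_op, window, now):
    """Count `target_op` events with a timestamp in the interval (now - window, now]."""
    return sum(1 for op, ts in events
               if op == target_op and now - window < ts <= now)


second_seq = ["I1", "I2", "I1", "I1", "I3", "I1", "I1", "I1"]
target_op, target_count = "I1", 4
window = timedelta(hours=3)

# Hypothetical timestamps, given as minutes before `now`.
now = datetime(2019, 5, 27, 12, 0)
ops = ["I1", "I2", "I1", "I1", "I3", "I1", "I1", "I1", "I1", "I1"]
minutes_ago = [300, 280, 240, 170, 150, 120, 90, 60, 30, 0]
target_events = [(op, now - timedelta(minutes=m))
                 for op, m in zip(ops, minutes_ago)]

count = count_in_window(target_events, target_op, window, now)
# Second preset condition, count form: occurrences >= target number of times.
is_abnormal_user = is_subsequence(second_seq, ops) and count >= target_count
```

With these timestamps the two earliest I1 events fall outside the 3-hour window, so six I1 events are counted and the target user is flagged as abnormal.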
Further, when the target user is determined to be an abnormal user, the server may apply authority control to the target user (for example, forcing the user to log out, requiring a verification code, or granting browsing permission without editing permission) to protect the security of the server and of other users accessing the server.
It should be noted that if only the number of times the target operation information occurs is used as the basis for detecting the target user, without regard to the order in which the operation information in the target behavior sequence occurs, normal users may be misjudged as abnormal. For example, target behavior sequence 1, corresponding to target user 1, is (I1, I2, I2, I2), and target behavior sequence 2, corresponding to target user 2, is (I2, I2, I2, I1), where operation information I1 indicates a successful login operation and I2 indicates a failed login operation. The operation time information corresponding to I1 and I2 all falls within the last 1 day, and the order in which operations appear in a target behavior sequence corresponds to the order in which they were executed; that is, target user 1 executed I1, I2, I2, I2 and target user 2 executed I2, I2, I2, I1. In the prior art, an abnormal user is detected solely by whether the number of occurrences of the target operation information satisfies a preset condition. If the target operation information is I2 and the preset condition is that I2 occurs 3 or more times within the last 1 day, both target user 1 and target user 2 are regarded as abnormal users. In a real-world scenario, however, target user 1 may be normal (for example, target user 1 logs in successfully and then forgets the password when logging in again), whereas target user 2 may be abnormal (for example, credential-stuffing behavior).
The present scheme determines whether user behavior is abnormal by combining the order in which user behaviors occur with the number of times the target operation information occurs. When second behavior sequences are determined, (I2, I2, I2, I1) and (I1, I2, I2, I2) can be treated as two different second behavior sequences. For each of them, the number of times the target operation information occurs within the preset time period is determined, and the sample users in the sample user set that match it are determined. According to the first user number and the second user number, (I2, I2, I2, I1) is then determined to be abnormal behavior, while (I1, I2, I2, I2) is not. Target user 1 can therefore be judged a normal user and target user 2 an abnormal user, which reduces the probability of misjudging normal users.
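The contrast between count-only detection and the order-aware check can be illustrated with a small Python sketch. It assumes that (I2, I2, I2, I1) has already been determined to be an abnormal second behavior sequence and that all events fall within the preset time period. Function names are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """Order-preserving subsequence test (gaps allowed; an assumption)."""
    it = iter(sequence)
    return all(op in it for op in pattern)


def count_only_flag(seq, target_op="I2", min_count=3):
    """Prior-art style check: only the number of occurrences matters."""
    return seq.count(target_op) >= min_count


def order_aware_flag(seq, abnormal_seq, target_op="I2", target_count=3):
    """Check in the spirit of the scheme: order and count must both match."""
    return (is_subsequence(abnormal_seq, seq)
            and seq.count(target_op) >= target_count)


user1 = ["I1", "I2", "I2", "I2"]         # login success, then three failures
user2 = ["I2", "I2", "I2", "I1"]         # three failures, then login success
abnormal_seq = ["I2", "I2", "I2", "I1"]  # assumed abnormal second behavior sequence

count_only = (count_only_flag(user1), count_only_flag(user2))
order_aware = (order_aware_flag(user1, abnormal_seq),
               order_aware_flag(user2, abnormal_seq))
```

Here `count_only` flags both users, while `order_aware` flags only target user 2, matching the outcome described in the text.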
In summary, the present disclosure first determines an abnormal second behavior sequence in a first behavior sequence according to the first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, where the first behavior sequence includes operation information and operation time information corresponding to the operation information. It then obtains the target number of times that target operation information occurs in the second behavior sequence within a preset time period, the target operation information being one kind of operation information selected from the second behavior sequence according to a preset rule. Next, according to the target operation information and the target number of times, it determines the first user number of abnormal sample users matched with the second behavior sequence and the second user number of normal sample users matched with the second behavior sequence. If the ratio of the first user number to the second user number is greater than or equal to a preset ratio threshold, the second behavior sequence is determined to be abnormal behavior. By combining the order in which user behaviors occur with their number of occurrences within the preset time period, the disclosure can improve the accuracy of detecting abnormal users and reduce misjudgment of normal users.
FIG. 4 is a block diagram illustrating an apparatus to determine abnormal behavior in accordance with an exemplary embodiment. As shown in fig. 4, the apparatus 200 includes:
a first determining module 201, configured to determine an abnormal second behavior sequence in a first behavior sequence according to the first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, where the first behavior sequence includes operation information and operation time information corresponding to the operation information.
The obtaining module 202 is configured to obtain a target number of times that target operation information appears in the second behavior sequence within a preset time period, where the target operation information is one type of operation information selected in the second behavior sequence according to a preset rule.
And the second determining module 203 is configured to determine a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence according to the target operation information and the target times.
The determining module 204 is configured to determine that the second behavior sequence is an abnormal behavior if a ratio of the first user number to the second user number is greater than or equal to a preset proportional threshold.
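The ratio test performed by the determining module can be sketched in Python as follows. The source does not say what happens when no normal sample user matches (a second user number of zero), so the handling of that case below is an assumption. The label strings and function names are likewise hypothetical.

```python
def count_matched_users(matched_labels):
    """Return (first_user_number, second_user_number) for matched sample users."""
    first = sum(1 for label in matched_labels if label == "abnormal")
    second = sum(1 for label in matched_labels if label == "normal")
    return first, second


def is_abnormal_behavior(first_user_number, second_user_number, ratio_threshold):
    """Ratio test: first / second >= preset ratio threshold."""
    if first_user_number == 0:
        return False  # no abnormal sample user matches the sequence
    if second_user_number == 0:
        # Assumption: matched only by abnormal users, so treat as abnormal.
        return True
    return first_user_number / second_user_number >= ratio_threshold


labels = ["abnormal"] * 8 + ["normal"] * 2
first, second = count_matched_users(labels)
verdict = is_abnormal_behavior(first, second, ratio_threshold=3.0)
```

With 8 matched abnormal sample users and 2 matched normal sample users the ratio is 4, which exceeds the illustrative threshold of 3.0.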
FIG. 5 is a block diagram of a second determination module shown in the embodiment of FIG. 4. As shown in fig. 5, the second determining module 203 includes:
the first determining sub-module 2031 is configured to determine, according to the target operation information and the target times, at least one target sample user in the sample user set that matches the second behavior sequence.
The second determining sub-module 2032 is configured to determine a first user number of abnormal sample users and a second user number of normal sample users, which are included in the at least one target sample user.
Optionally, the first determining sub-module 2031 is configured to determine that any sample user is the target sample user if the second behavior sequence is a sub-sequence of a third behavior sequence corresponding to any sample user in the sample user set, and in the third behavior sequence, the number of times of occurrence of the target operation information in a preset time period and the target number of times meet a first preset condition.
The first preset condition is as follows: the number of times of occurrence of the target operation information in the preset time period is greater than or equal to the target number of times, or the ratio of the number of times of occurrence of the target operation information to the target number of times in the preset time period is greater than or equal to a first threshold.
Fig. 6 is a block diagram illustrating another apparatus for determining abnormal behavior in accordance with an example embodiment. As shown in fig. 6, the apparatus 200 further includes:
the third determining module 205 is configured to, after the target behavior sequence corresponding to the target user is obtained, determine that the target user is an abnormal user if the second behavior sequence is a subsequence of the target behavior sequence and, in the target behavior sequence, the number of times the target operation information occurs within the preset time period and the target number of times satisfy a second preset condition.
The second preset condition is as follows: the number of times of occurrence of the target operation information in the preset time period is greater than or equal to the target number of times, or the ratio of the number of times of occurrence of the target operation information to the target number of times in the preset time period is greater than or equal to a second threshold.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
FIG. 7 is a block diagram illustrating an apparatus to determine abnormal behavior in accordance with an exemplary embodiment. For example, the apparatus 300 may be provided as a server. Referring to FIG. 7, the apparatus 300 includes a processing component 322, which further includes one or more processors, and memory resources, represented by memory 332, for storing instructions executable by the processing component 322, such as application programs. The application programs stored in memory 332 may include one or more modules, each corresponding to a set of instructions. Further, the processing component 322 is configured to execute the instructions to perform the above-described method of determining abnormal behavior.
The apparatus 300 may also include a power component 326 configured to perform power management of the apparatus 300, a wired or wireless network interface 350 configured to connect the apparatus 300 to a network, and an input/output (I/O) interface 358. The apparatus 300 may operate based on an operating system stored in the memory 332, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The present disclosure also provides a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of determining abnormal behavior provided by the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of determining abnormal behavior, the method comprising:
determining an abnormal second behavior sequence in the first behavior sequence according to a first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises: operation information and operation time information corresponding to the operation information;
acquiring a target number of times that target operation information occurs in the second behavior sequence within a preset time period, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule;
determining a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence according to the target operation information and the target times;
and if the ratio of the first user number to the second user number is greater than or equal to a preset proportional threshold, determining that the second behavior sequence is an abnormal behavior.
2. The method of claim 1, wherein determining a first number of users of abnormal sample users matching the second behavior sequence and a second number of users of normal sample users matching the second behavior sequence according to the target operation information and the target times comprises:
determining at least one target sample user matched with the second behavior sequence in the sample user set according to the target operation information and the target times;
determining the first number of users of abnormal sample users and the second number of users of normal sample users included in the at least one target sample user.
3. The method of claim 2, wherein the determining at least one target sample user in the sample user set matching the second behavior sequence according to the target operation information and the target times comprises:
if the second behavior sequence is a subsequence of a third behavior sequence corresponding to any sample user in the sample user set, and the number of times of occurrence of the target operation information and the target number of times in the third behavior sequence in the preset time period meet a first preset condition, determining that the any sample user is the target sample user;
the first preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a first threshold.
4. The method of claim 1, further comprising:
in the case that the second behavior sequence is determined to be an abnormal behavior, after a target behavior sequence corresponding to a target user is obtained, if the second behavior sequence is a subsequence of the target behavior sequence and, in the target behavior sequence, the number of times the target operation information occurs within the preset time period and the target number of times satisfy a second preset condition, determining that the target user is an abnormal user;
the second preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a second threshold.
5. An apparatus for determining abnormal behavior, the apparatus comprising:
a first determining module configured to determine an abnormal second behavior sequence in a first behavior sequence according to the first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises: operation information and operation time information corresponding to the operation information;
an acquisition module configured to acquire a target number of times that target operation information occurs in the second behavior sequence within a preset time period, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule;
a second determining module configured to determine, according to the target operation information and the target times, a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence;
the judging module is configured to determine that the second behavior sequence is an abnormal behavior if the ratio of the first user number to the second user number is greater than or equal to a preset proportional threshold.
6. The apparatus of claim 5, wherein the second determining module comprises:
a first determining submodule configured to determine, according to the target operation information and the target times, at least one target sample user in a sample user set that matches the second behavior sequence;
a second determination submodule configured to determine the first user number of abnormal sample users and the second user number of normal sample users included in the at least one target sample user.
7. The apparatus according to claim 6, wherein the first determining sub-module is configured to determine that any sample user is the target sample user if the second behavior sequence is a sub-sequence of a third behavior sequence corresponding to any sample user in the sample user set, and in the third behavior sequence, the number of times of occurrence of the target operation information and the target number of times within the preset time period satisfy a first preset condition;
the first preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a first threshold.
8. The apparatus of claim 5, further comprising:
a third determining module configured to, in the case that the second behavior sequence is determined to be an abnormal behavior, after a target behavior sequence corresponding to a target user is obtained, determine that the target user is an abnormal user if the second behavior sequence is a subsequence of the target behavior sequence and, in the target behavior sequence, the number of times the target operation information occurs within the preset time period and the target number of times satisfy a second preset condition;
the second preset condition is as follows: the number of times the target operation information occurs within the preset time period is greater than or equal to the target number of times; or, the ratio of the number of times the target operation information occurs within the preset time period to the target number of times is greater than or equal to a second threshold.
9. An apparatus for determining abnormal behavior, the apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to:
determining an abnormal second behavior sequence in the first behavior sequence according to a first behavior sequence corresponding to an abnormal sample user and a preset frequent path mining algorithm, wherein the first behavior sequence comprises: operation information and operation time information corresponding to the operation information;
acquiring a target number of times that target operation information occurs in the second behavior sequence within a preset time period, wherein the target operation information is one kind of operation information selected from the second behavior sequence according to a preset rule;
determining a first user number of abnormal sample users matched with the second behavior sequence and a second user number of normal sample users matched with the second behavior sequence according to the target operation information and the target times;
and if the ratio of the first user number to the second user number is greater than or equal to a preset proportional threshold, determining that the second behavior sequence is an abnormal behavior.
10. A computer-readable storage medium, on which computer program instructions are stored, which program instructions, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 4.
CN201910447366.6A 2019-05-27 2019-05-27 Method, device and storage medium for determining abnormal behavior Active CN110222243B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910447366.6A CN110222243B (en) 2019-05-27 2019-05-27 Method, device and storage medium for determining abnormal behavior

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910447366.6A CN110222243B (en) 2019-05-27 2019-05-27 Method, device and storage medium for determining abnormal behavior

Publications (2)

Publication Number Publication Date
CN110222243A CN110222243A (en) 2019-09-10
CN110222243B true CN110222243B (en) 2021-08-31

Family

ID=67818428

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910447366.6A Active CN110222243B (en) 2019-05-27 2019-05-27 Method, device and storage medium for determining abnormal behavior

Country Status (1)

Country Link
CN (1) CN110222243B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110609783B (en) * 2019-09-24 2023-08-04 京东科技控股股份有限公司 Method and device for identifying abnormal behavior user
CN111459797B (en) * 2020-02-27 2023-04-28 上海交通大学 Abnormality detection method, system and medium for developer behavior in open source community
CN113726814B (en) * 2021-09-09 2022-09-02 中国电信股份有限公司 User abnormal behavior identification method, device, equipment and storage medium
CN117614724A (en) * 2023-12-06 2024-02-27 北京东方通科技股份有限公司 Industrial Internet access control method based on system fine granularity processing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615936A (en) * 2015-03-04 2015-05-13 哈尔滨工业大学 Behavior monitoring method for VMM (virtual machine monitor) layer of cloud platform
CN105187242A (en) * 2015-08-20 2015-12-23 中国人民解放军国防科学技术大学 Method for detecting abnormal user behaviours mined on the basis of variable-length sequence mode
CN108021932A (en) * 2017-11-22 2018-05-11 北京奇虎科技有限公司 Data detection method, device and electronic equipment
CN108055281A (en) * 2017-12-27 2018-05-18 百度在线网络技术(北京)有限公司 Account method for detecting abnormality, device, server and storage medium
CN108156166A (en) * 2017-12-29 2018-06-12 百度在线网络技术(北京)有限公司 Abnormal access identification and connection control method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2689816C2 (en) * 2017-11-21 2019-05-29 ООО "Группа АйБи" Method for classifying sequence of user actions (embodiments)

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种基于隐马尔可夫模型的IDS异常检测新方法;田新广;《信号处理》;20031031;全文 *

Also Published As

Publication number Publication date
CN110222243A (en) 2019-09-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant