CN112861891B - User behavior abnormality detection method and device - Google Patents
User behavior abnormality detection method and device Download PDFInfo
- Publication number
- CN112861891B CN112861891B CN201911178056.5A CN201911178056A CN112861891B CN 112861891 B CN112861891 B CN 112861891B CN 201911178056 A CN201911178056 A CN 201911178056A CN 112861891 B CN112861891 B CN 112861891B
- Authority
- CN
- China
- Prior art keywords
- behavior
- value
- behavior feature
- feature vectors
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 33
- 230000005856 abnormality Effects 0.000 title claims description 13
- 230000006399 behavior Effects 0.000 claims abstract description 208
- 239000013598 vector Substances 0.000 claims abstract description 159
- 206010000117 Abnormal behaviour Diseases 0.000 claims abstract description 37
- 238000004422 calculation algorithm Methods 0.000 claims abstract description 34
- 238000000034 method Methods 0.000 claims abstract description 21
- 238000012163 sequencing technique Methods 0.000 claims abstract description 12
- 238000004590 computer program Methods 0.000 claims description 10
- 238000003860 storage Methods 0.000 claims description 9
- 238000004364 calculation method Methods 0.000 claims description 7
- 238000010586 diagram Methods 0.000 abstract description 17
- 230000002159 abnormal effect Effects 0.000 abstract description 10
- 230000003542 behavioural effect Effects 0.000 description 6
- 230000006870 function Effects 0.000 description 4
- 238000012545 processing Methods 0.000 description 4
- 230000000737 periodic effect Effects 0.000 description 2
- 230000016571 aggressive behavior Effects 0.000 description 1
- 230000002547 anomalous effect Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000004140 cleaning Methods 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000014509 gene expression Effects 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 230000002085 persistent effect Effects 0.000 description 1
- 238000007781 pre-processing Methods 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9024—Graphs; Linked lists
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0635—Risk analysis of enterprise or organisation activities
Abstract
The invention discloses a method and a device for detecting abnormal user behaviors, and relates to the field of safety. The method comprises the following steps: analyzing the user behavior log to obtain behavior feature vectors corresponding to the plurality of samples; calculating a similarity value between the behavior feature vectors of every two samples; each calculated similarity value is used as a weight value between corresponding behavior feature vectors of the PR algorithm; performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same; and determining abnormal behaviors according to PR value sequencing results. The abnormal value detection is converted into the link relation importance ordering problem among the feature vectors, and the link diagrams among users or among the behaviors of the users are constructed, so that the users with abnormal behaviors are found, and the accuracy of abnormal behavior detection is improved.
Description
Technical Field
The disclosure relates to the field of security, and in particular relates to a method and a device for detecting abnormal user behaviors.
Background
According to Verizon 2018 statistics, 28% of data leakage is caused by internal threats, and how to analyze operation behavior logs of internal users of enterprises to find anomalies and threats becomes a research hotspot in recent years.
Disclosure of Invention
The technical problem to be solved by the present disclosure is to provide a method and an apparatus for detecting abnormal behavior of a user, which can improve the accuracy of abnormal behavior detection.
According to an aspect of the present disclosure, a method for detecting abnormal user behavior is provided, including: analyzing the user behavior log to obtain behavior feature vectors corresponding to the plurality of samples; calculating a similarity value between the behavior feature vectors of every two samples; each calculated similarity value is used as a weight value between corresponding behavior feature vectors of a webpage ranking PR algorithm; performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same; and determining abnormal behaviors according to PR value sequencing results.
In some embodiments, the PR algorithm formula isWherein PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of sample i t For PR value corresponding to the t-th time behavior feature vector of the sample j, S_ij is the similarity value between the behavior feature vector of the sample i and the behavior feature vector of the sample j, m is the number of samples, i is not equal to j, and t, i and j are positive integers greater than or equal to 1.
In some embodiments, the behavior feature vectors corresponding to the plurality of samples include behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
In some embodiments, wherein determining the abnormal behavior based on the PR value ordering result comprises: and taking the user behavior corresponding to the behavior feature vector with the PR value smaller than the first threshold value in the PR value sequencing result as the abnormal behavior.
In some embodiments, determining the anomalous behavior from the PR value ordering result comprises: acquiring a behavior feature vector with PR value larger than a second threshold value in the PR value sequencing result; and if the duration time of the behavior feature vector with the PR value larger than the second threshold value is larger than the time threshold value, taking the user behavior corresponding to the behavior feature vector with the PR value larger than the second threshold value as the abnormal behavior.
According to another aspect of the present disclosure, there is also provided a user behavior abnormality detection apparatus including: the behavior feature vector acquisition unit is configured to analyze the user behavior log and acquire behavior feature vectors corresponding to the plurality of samples; a similarity value calculation unit configured to calculate a similarity value between behavior feature vectors of each two samples; a PR value ordering unit configured to use each calculated similarity value as a weight value between corresponding behavior feature vectors of a webpage ranking PR algorithm; performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same; and an abnormal behavior recognition unit configured to determine an abnormal behavior according to the PR value ordering result.
In some embodiments, the PR algorithm formula isWherein PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of sample i t For PR value corresponding to the t-th time behavior feature vector of the sample j, S_ij is the similarity value between the behavior feature vector of the sample i and the behavior feature vector of the sample j, m is the number of samples, i is not equal to j, and t, i and j are positive integers greater than or equal to 1.
In some embodiments, the behavior feature vectors corresponding to the plurality of samples include behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
According to another aspect of the present disclosure, there is also provided a user behavior abnormality detection apparatus including: a memory; and a processor coupled to the memory, the processor configured to perform the user behavior anomaly detection method as described above based on instructions stored in the memory.
According to another aspect of the disclosure, there is also provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described user behavior anomaly detection method.
Compared with the prior art, the method and the device for detecting the abnormal behavior of the mobile terminal have the advantages that based on the PR algorithm principle, abnormal value detection is converted into the problem of ordering the importance of the link relation between feature vectors, and the link diagram between users or between behaviors of the users is constructed, so that users with abnormal behaviors are found, and the accuracy of abnormal behavior detection is improved.
Other features of the present disclosure and its advantages will become apparent from the following detailed description of exemplary embodiments of the disclosure, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The disclosure may be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings in which:
fig. 1 is a flow diagram of some embodiments of a user behavior anomaly detection method of the present disclosure.
Fig. 2 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
Fig. 3 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
Fig. 4 is a schematic structural diagram of some embodiments of a user behavior anomaly detection device of the present disclosure.
Fig. 5 is a schematic structural diagram of other embodiments of a user behavior abnormality detection device of the present disclosure.
Fig. 6 is a schematic structural diagram of other embodiments of a user behavior abnormality detection device of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but should be considered part of the specification where appropriate.
In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
Fig. 1 is a flow diagram of some embodiments of a user behavior anomaly detection method of the present disclosure.
In step 110, the user behavior log is analyzed to obtain behavior feature vectors corresponding to the plurality of samples. The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
For example, user behavior data is collected from a log, feature engineering is performed, and different user behaviors or periodic behaviors of the same user are constructed into feature vectors of m×n-dimensional feature space, where n is a feature number and m is a sample number.
In step 120, a similarity value between the behavioral characteristic vectors of each two samples is calculated.
For example, a cosine similarity value between every two behavior feature vectors, or a distance similarity, or the like is calculated.
At step 130, each calculated similarity value is used as a weight value between corresponding behavior feature vectors of PR (PageRank) algorithm.
At step 140, an iterative calculation of the PR algorithm is performed, generating a PR value ordering result for the behavioral feature vector. The initial value of the PR value of the behavioral characteristic vector of each sample in the PR algorithm is the same, for example, the initial value of the PR value corresponding to the behavioral characteristic vector of each sample is 1/m.
Steps 130 and 140 rank the importance of the link relationships between feature vectors.
In some embodiments, the PR algorithm formula isWherein PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of sample i t For PR value corresponding to the t-th time behavior feature vector of the sample j, S_ij is a similarity value between the behavior feature vector of the sample i and the behavior feature vector of the sample j, m is the number of samples, i is not equal to j, and t, i and j are positive integers greater than or equal to 1 respectively.
In step 150, abnormal behavior is determined based on the PR value ordering result.
In some embodiments, the user behavior corresponding to the behavior feature vector with the PR value less than the first threshold is taken as the abnormal behavior.
In some embodiments, a behavior feature vector with a PR value greater than a second threshold may also be obtained from the PR value ordering result; and if the duration time of the behavior feature vector with the PR value larger than the second threshold value is larger than the time threshold value, taking the user behavior corresponding to the behavior feature vector with the PR value larger than the second threshold value as the abnormal behavior.
For example, a user may have previously used browsing and downloading information normally, a month account was stolen or the user is ready to leave, a small number of queries and downloads sensitive data are continually performed daily for the month, the month user behavior may be more regular than before, and the similarity is higher, resulting in the user behavior being ranked more front in the PR value ordering result, and therefore, it is necessary to intercept behavior feature vectors with PR values greater than a second threshold and durations greater than a time threshold. This process can detect low frequency persistent aggression that appears to be free of anomalies.
In the embodiment, based on the PR algorithm principle, the abnormal value detection is converted into the link relation importance ordering problem among feature vectors, and the link diagram among users or among user behaviors is constructed, so that the users with abnormal behaviors are found, and the accuracy of abnormal behavior detection is improved.
Fig. 2 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
At step 210, user behavior log data is collected. The user behavior log data includes authentication system logs, database logs, web access logs, and the like. The behavior log data may be subjected to preprocessing such as cleaning, conversion, association, missing values, and the like.
In step 220, the user behavior log is analyzed to generate behavior feature vectors for a plurality of users.
In step 230, a cosine similarity value between each two behavior feature vectors is calculated.
In some embodiments, cosine similarity values are normalized to [0,1]Wherein 0 represents dissimilarity and 1 represents similarity. For example, for behavior feature vectors A and B, then according to the formulaAnd calculating cosine similarity between the behavior feature vectors A and B.
At step 240, the formula is followedIteratively calculating PR values of the behavioral characteristic vector, wherein PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of user i t For PR value corresponding to the t-th time behavior feature vector of user j, S_ij is similarity value between the behavior feature vector of user i and the behavior feature vector of user j, m is user number, i is not equal to j, t, i and j are positive integers greater than or equal to 1, and PR value corresponding to the behavior feature vector of each user is initialized to be 1/m. Since the PR algorithm in this embodiment is a full-communication network, the iteration result eventually converges.
At step 250, the behavior feature vectors are ordered from large to small by PR value. The PR value of each user behavior feature vector is a similarity importance value.
In some embodiments, the behavior feature vectors may also be ordered from small to large in PR value.
In step 260, the user corresponding to the behavior feature vector with the PR value smaller than the first threshold value in the PR value sorting result is used as the user with abnormal behavior.
Namely, K values in the sequence are intercepted from back to front as abnormal values, and the user corresponding to the abnormal values is the user with abnormal behaviors.
In the above embodiment, after the behavior feature vectors of a plurality of users are extracted, cosine similarity between the feature vectors is calculated, and the similarity value is used as a weight value in a PR algorithm, PR value ranking calculation is performed on the characteristics, and K feature vectors are intercepted from the ranking according to a threshold value and used as anomalies, so that the method can be suitable for abnormal individual behavior detection in group users.
Fig. 3 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
At step 310, user behavior log data is collected.
In step 320, a user behavior log is analyzed to generate behavior feature vectors for multiple points in time for the same user.
In step 330, the cosine similarity between the behavior feature vectors is calculated to obtain the similarity value of behaviors between different time points of the same user.
In step 340, the formula is followedIteratively calculating PR values of the behavioral characteristic vector, wherein PR_i t+1 PR value corresponding to t+1st time behavior feature vector of time point i, PR_j t For PR value corresponding to the t-th time behavior feature vector of the time point j, S_ij is the similarity value between the behavior feature vector of the time point i and the behavior feature vector of the time point j, m is the number of the acquired time points, i is not less than j, t, i and j are positive integers greater than or equal to 1, and PR value of the behavior feature vector corresponding to each time point is initialized to be 1/m.
At step 350, the behavior feature vectors are ordered from large to small by PR value. The PR value of each user behavior feature vector is a similarity importance value.
In step 360, a time point corresponding to the behavior feature vector with the PR value smaller than the first threshold in the PR value sorting result is used as a time point of the abnormal behavior of the user.
In the above embodiment, after the behavior feature vectors of multiple time points of the same user are extracted, cosine similarity between the feature vectors is calculated, and the similarity value is used as a weight value in a PR algorithm, PR value ranking calculation is performed on the characteristics, and K feature vectors are intercepted from the ranking according to a threshold value and used as anomalies, so that the method is applicable to anomaly behavior detection of the same user in a certain time period.
Fig. 4 is a schematic structural diagram of some embodiments of a user behavior anomaly detection device of the present disclosure. The apparatus includes a behavior feature vector acquisition unit 410, a similarity value calculation unit 420, a PR value ordering unit 430, and an abnormal behavior recognition unit 440.
The behavior feature vector acquisition unit 410 is configured to analyze the user behavior log and acquire behavior feature vectors corresponding to the plurality of samples.
The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
For example, user behavior data is collected from a log, feature engineering is performed, and different user behaviors or periodic behaviors of the same user are constructed into feature vectors of m×n-dimensional feature space, where n is a feature number and m is a sample number.
The similarity value calculation unit 420 is configured to calculate a similarity value between the behavior feature vectors of each two samples.
For example, a similarity value between behavior feature vectors of every two samples is calculated based on a cosine similarity algorithm.
The PR value ordering unit 430 is configured to use each calculated similarity value as a weight value between corresponding behavior feature vectors of the web page ranking PR algorithm; and performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same.
In some embodiments, the PR algorithm formula isWherein PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of sample i t For PR value corresponding to the t-th time behavior feature vector of the sample j, S_ij is a similarity value between the behavior feature vector of the sample i and the behavior feature vector of the sample j, m is the number of samples, i is not equal to j, and t, i and j are positive integers greater than or equal to 1 respectively.
If the behavior feature vector corresponding to the plurality of samples is the behavior feature vector corresponding to the plurality of users, PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of user i t For PR value corresponding to the t-th time behavior feature vector of user j, S_ij is similarity value between the behavior feature vector of user i and the behavior feature vector of user j, m is user number, i is not equal to j, t, i and j are positive integers greater than or equal to 1, and PR value corresponding to the behavior feature vector of each user is initialized to be 1/m.
If the behavior feature vector corresponding to the plurality of samples is the behavior feature vector corresponding to the plurality of time points of the same user, PR_i t+1 PR value corresponding to t+1st time behavior feature vector of time point i, PR_j t For PR value corresponding to the t-th time behavior feature vector of the time point j, S_ij is the similarity value between the behavior feature vector of the time point i and the behavior feature vector of the time point j, m is the number of the acquired time points, i is not less than j, t, i and j are positive integers greater than or equal to 1, and PR value of the behavior feature vector corresponding to each time point is initialized to be 1/m.
The abnormal behavior recognition unit 440 is configured to determine an abnormal behavior according to the PR value ordering result.
In some embodiments, the user behavior corresponding to the behavior feature vector with the PR value less than the first threshold is taken as the abnormal behavior.
In some embodiments, a behavior feature vector with a PR value greater than a second threshold may also be obtained from the PR value ordering result; and if the duration time of the behavior feature vector with the PR value larger than the second threshold value is larger than the time threshold value, taking the user behavior corresponding to the behavior feature vector with the PR value larger than the second threshold value as the abnormal behavior.
In the embodiment, based on the PR algorithm principle, the abnormal value detection is converted into the link relation importance ordering problem among feature vectors, and the link diagram among users or among user behaviors is constructed, so that the users with abnormal behaviors are found, and the accuracy of abnormal behavior detection is improved.
Fig. 5 is a schematic structural diagram of other embodiments of a user behavior abnormality detection device of the present disclosure. The device comprises: memory 510 and processor 520, wherein: memory 510 may be a magnetic disk, flash memory, or any other non-volatile storage medium. The memory is used to store instructions in the embodiments corresponding to figures 1-3. Processor 520 is coupled to memory 510 and may be implemented as one or more integrated circuits, such as a microprocessor or microcontroller. The processor 520 is configured to execute instructions stored in the memory.
In some embodiments, as also shown in FIG. 6, the apparatus 600 includes a memory 610 and a processor 620. Processor 620 is coupled to memory 610 through BUS 630. The device 600 may also be coupled to external storage 650 via a storage interface 640 for invoking external data, and may also be coupled to a network or another computer system (not shown) via a network interface 660, not described in detail herein.
In this embodiment, the data instruction is stored by the memory, and then the processor processes the instruction, so that the accuracy of abnormal behavior detection is improved.
In other embodiments, a computer readable storage medium has stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of the corresponding embodiments of fig. 1-3. It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Thus far, the present disclosure has been described in detail. In order to avoid obscuring the concepts of the present disclosure, some details known in the art are not described. How to implement the solutions disclosed herein will be fully apparent to those skilled in the art from the above description.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the disclosure. The scope of the present disclosure is defined by the appended claims.
Claims (8)
1. A method for detecting user behavior anomalies, comprising:
analyzing the user behavior log to obtain behavior feature vectors corresponding to the plurality of samples;
calculating a similarity value between the behavior feature vectors of every two samples;
each calculated similarity value is used as a weight value between corresponding behavior feature vectors of a webpage ranking PR algorithm;
performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same, and the PR algorithm formula is as followsWherein PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of sample i t For PR value corresponding to t-th time behavior feature vector of sample j, S_ij is similarity value between behavior feature vector of sample i and behavior feature vector of sample j, m is sample number, i is not equal to j, t, i and j are positive integers greater than or equal to 1;
and determining abnormal behaviors according to the PR value sequencing result.
2. The user behavior abnormality detection method according to claim 1, wherein,
the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of users;
or alternatively
The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
3. The user behavior abnormality detection method according to claim 1 or 2, wherein determining an abnormal behavior according to the PR value ordering result includes:
and taking the user behaviors corresponding to the behavior feature vectors with PR values smaller than the first threshold in the PR value sequencing result as abnormal behaviors.
4. The user behavior abnormality detection method according to claim 1 or 2, wherein determining an abnormal behavior according to the PR value ordering result includes:
acquiring a behavior feature vector with PR value larger than a second threshold value in the PR value sorting result;
and if the duration time of the behavior feature vector with the PR value larger than the second threshold value is larger than the time threshold value, taking the user behavior corresponding to the behavior feature vector with the PR value larger than the second threshold value as the abnormal behavior.
5. A user behavior abnormality detection apparatus comprising:
the behavior feature vector acquisition unit is configured to analyze the user behavior log and acquire behavior feature vectors corresponding to the plurality of samples;
a similarity value calculation unit configured to calculate a similarity value between behavior feature vectors of each two samples;
a PR value ordering unit configured to use each calculated similarity value as a weight value between corresponding behavior feature vectors of a webpage ranking PR algorithm; performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same, and the PR algorithm formula is as followsWherein PR_i t+1 PR_j is PR value corresponding to t+1st time behavior feature vector of sample i t For PR value corresponding to the t-th time behavior feature vector of sample j, S_ij is the similarity value between the behavior feature vector of sample i and the behavior feature vector of sample j, m is the number of samples, i is not equal to j, t, i, j isA positive integer of 1 or more;
and an abnormal behavior recognition unit configured to determine an abnormal behavior according to the PR value ordering result.
6. The user behavior abnormality detection device according to claim 5, wherein,
the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of users;
or alternatively
The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
7. A user behavior abnormality detection apparatus comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the user behavior anomaly detection method of any one of claims 1 to 4 based on instructions stored in the memory.
8. A computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the user behaviour anomaly detection method of any one of claims 1 to 4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911178056.5A CN112861891B (en) | 2019-11-27 | 2019-11-27 | User behavior abnormality detection method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911178056.5A CN112861891B (en) | 2019-11-27 | 2019-11-27 | User behavior abnormality detection method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112861891A CN112861891A (en) | 2021-05-28 |
CN112861891B true CN112861891B (en) | 2023-11-28 |
Family
ID=75985477
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911178056.5A Active CN112861891B (en) | 2019-11-27 | 2019-11-27 | User behavior abnormality detection method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112861891B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117744076B (en) * | 2024-02-06 | 2024-04-16 | 江苏开博科技有限公司 | Bank database system intrusion detection method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104731914A (en) * | 2015-03-24 | 2015-06-24 | 浪潮集团有限公司 | Method for detecting user abnormal behavior based on behavior similarity |
CN105183784A (en) * | 2015-08-14 | 2015-12-23 | 天津大学 | Content based junk webpage detecting method and detecting apparatus thereof |
CN105847302A (en) * | 2016-05-31 | 2016-08-10 | 北京奇艺世纪科技有限公司 | Abnormity detection method and device |
CN107992738A (en) * | 2017-11-16 | 2018-05-04 | 北京奇艺世纪科技有限公司 | A kind of account logs in method for detecting abnormality, device and electronic equipment |
CN108595655A (en) * | 2018-04-27 | 2018-09-28 | 福建师范大学 | A kind of abnormal user detection method of dialogue-based characteristic similarity fuzzy clustering |
CN110297714A (en) * | 2019-06-19 | 2019-10-01 | 上海冰鉴信息科技有限公司 | The method and device of PageRank is obtained based on large-scale graph data collection |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10129274B2 (en) * | 2016-09-22 | 2018-11-13 | Adobe Systems Incorporated | Identifying significant anomalous segments of a metrics dataset |
EP3477906B1 (en) * | 2017-10-26 | 2021-03-31 | Accenture Global Solutions Limited | Systems and methods for identifying and mitigating outlier network activity |
-
2019
- 2019-11-27 CN CN201911178056.5A patent/CN112861891B/en active Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104731914A (en) * | 2015-03-24 | 2015-06-24 | 浪潮集团有限公司 | Method for detecting user abnormal behavior based on behavior similarity |
CN105183784A (en) * | 2015-08-14 | 2015-12-23 | 天津大学 | Content based junk webpage detecting method and detecting apparatus thereof |
CN105847302A (en) * | 2016-05-31 | 2016-08-10 | 北京奇艺世纪科技有限公司 | Abnormity detection method and device |
CN107992738A (en) * | 2017-11-16 | 2018-05-04 | 北京奇艺世纪科技有限公司 | A kind of account logs in method for detecting abnormality, device and electronic equipment |
CN108595655A (en) * | 2018-04-27 | 2018-09-28 | 福建师范大学 | A kind of abnormal user detection method of dialogue-based characteristic similarity fuzzy clustering |
CN110297714A (en) * | 2019-06-19 | 2019-10-01 | 上海冰鉴信息科技有限公司 | The method and device of PageRank is obtained based on large-scale graph data collection |
Also Published As
Publication number | Publication date |
---|---|
CN112861891A (en) | 2021-05-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10452725B2 (en) | Web page recognizing method and apparatus | |
CN113255370B (en) | Industry type recommendation method, device, equipment and medium based on semantic similarity | |
CN103577756A (en) | Virus detection method and device based on script type judgment | |
CN112328805A (en) | Entity mapping method of vulnerability description information and database table based on NLP | |
US20190050672A1 (en) | INCREMENTAL AUTOMATIC UPDATE OF RANKED NEIGHBOR LISTS BASED ON k-th NEAREST NEIGHBORS | |
CN112861891B (en) | User behavior abnormality detection method and device | |
WO2017095439A1 (en) | Incremental clustering of a data stream via an orthogonal transform based indexing | |
CN110083731B (en) | Image retrieval method, device, computer equipment and storage medium | |
CN111371757A (en) | Malicious communication detection method and device, computer equipment and storage medium | |
CN107920067B (en) | Intrusion detection method on active object storage system | |
US20180285693A1 (en) | Incremental update of a neighbor graph via an orthogonal transform based indexing | |
Almousa et al. | Characterizing coding style of phishing websites using machine learning techniques | |
US20230273924A1 (en) | Trimming blackhole clusters | |
US20160239264A1 (en) | Re-streaming time series data for historical data analysis | |
CN113810338B (en) | Abnormal service address detection method and device, and computer readable storage medium | |
CN109213972B (en) | Method, device, equipment and computer storage medium for determining document similarity | |
CN113159211B (en) | Method, computing device and computer storage medium for similar image retrieval | |
US20220066988A1 (en) | Hash suppression | |
Sahoo et al. | On the study of GRBF and polynomial kernel based support vector machine in web logs | |
CN110555182A (en) | User portrait determination method and device and computer readable storage medium | |
Koujaku et al. | Structual change point detection for evolutional networks | |
CN117435640A (en) | Method and device for locating similar examples and electronic equipment | |
CN113807087B (en) | Method and device for detecting similarity of website domain names | |
CN108052554A (en) | The method and apparatus that various dimensions expand keyword | |
CN116305297B (en) | Data analysis method and system for distributed database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |