CN112861891B - User behavior abnormality detection method and device

User behavior abnormality detection method and device

Info

Publication number
CN112861891B
Authority
CN
China
Prior art keywords
behavior
value
behavior feature
feature vectors
user
Prior art date
Legal status
Active
Application number
CN201911178056.5A
Other languages
Chinese (zh)
Other versions
CN112861891A (en)
Inventor
赵钧
周文红
房硕
张涛
陈盈
Current Assignee
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Ltd
Priority date
Filing date
Publication date
Application filed by China Telecom Corp Ltd
Priority to CN201911178056.5A
Publication of CN112861891A
Application granted
Publication of CN112861891B
Active
Anticipated expiration


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities

Abstract

The invention discloses a method and a device for detecting abnormal user behaviors, and relates to the field of security. The method comprises the following steps: analyzing a user behavior log to obtain behavior feature vectors corresponding to a plurality of samples; calculating a similarity value between the behavior feature vectors of every two samples; using each calculated similarity value as a weight value between the corresponding behavior feature vectors in the PR algorithm; performing PR algorithm iterative computation to generate a PR value ranking of the behavior feature vectors, wherein the initial PR value of the behavior feature vector of each sample is the same; and determining abnormal behaviors according to the PR value ranking. Outlier detection is thereby converted into the problem of ranking the importance of link relationships between feature vectors, and a link graph between users or between the behaviors of a single user is constructed, so that users with abnormal behaviors are found and the accuracy of abnormal behavior detection is improved.

Description

User behavior abnormality detection method and device
Technical Field
The disclosure relates to the field of security, and in particular relates to a method and a device for detecting abnormal user behaviors.
Background
According to Verizon's 2018 statistics, 28% of data breaches are caused by insider threats; how to analyze the operation behavior logs of internal enterprise users in order to find anomalies and threats has therefore become a research hotspot in recent years.
Disclosure of Invention
The technical problem to be solved by the present disclosure is to provide a method and an apparatus for detecting abnormal behavior of a user, which can improve the accuracy of abnormal behavior detection.
According to an aspect of the present disclosure, a method for detecting abnormal user behavior is provided, including: analyzing a user behavior log to obtain behavior feature vectors corresponding to a plurality of samples; calculating a similarity value between the behavior feature vectors of every two samples; using each calculated similarity value as a weight value between the corresponding behavior feature vectors in the webpage ranking (PageRank, PR) algorithm; performing PR algorithm iterative computation to generate a PR value ranking of the behavior feature vectors, wherein the initial PR value of the behavior feature vector of each sample in the PR algorithm is the same; and determining abnormal behaviors according to the PR value ranking result.
In some embodiments, the PR algorithm iteratively updates the PR values: PR_i^{t+1}, the PR value corresponding to the behavior feature vector of sample i at iteration t+1, is computed from the values PR_j^t, the PR value corresponding to the behavior feature vector of sample j at iteration t, weighted by S_ij, the similarity value between the behavior feature vector of sample i and the behavior feature vector of sample j, where m is the number of samples, i ≠ j, and t, i and j are positive integers greater than or equal to 1.
In some embodiments, the behavior feature vectors corresponding to the plurality of samples include behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
In some embodiments, determining the abnormal behavior according to the PR value ranking result comprises: taking the user behavior corresponding to a behavior feature vector whose PR value is smaller than a first threshold in the PR value ranking result as the abnormal behavior.
In some embodiments, determining the abnormal behavior according to the PR value ranking result comprises: acquiring behavior feature vectors whose PR values are larger than a second threshold in the PR value ranking result; and if the duration of a behavior feature vector whose PR value is larger than the second threshold exceeds a time threshold, taking the corresponding user behavior as the abnormal behavior.
According to another aspect of the present disclosure, there is also provided a user behavior abnormality detection apparatus including: the behavior feature vector acquisition unit is configured to analyze the user behavior log and acquire behavior feature vectors corresponding to the plurality of samples; a similarity value calculation unit configured to calculate a similarity value between behavior feature vectors of each two samples; a PR value ordering unit configured to use each calculated similarity value as a weight value between corresponding behavior feature vectors of a webpage ranking PR algorithm; performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same; and an abnormal behavior recognition unit configured to determine an abnormal behavior according to the PR value ordering result.
In some embodiments, the PR algorithm iteratively updates the PR values: PR_i^{t+1}, the PR value corresponding to the behavior feature vector of sample i at iteration t+1, is computed from the values PR_j^t, the PR value corresponding to the behavior feature vector of sample j at iteration t, weighted by S_ij, the similarity value between the behavior feature vector of sample i and the behavior feature vector of sample j, where m is the number of samples, i ≠ j, and t, i and j are positive integers greater than or equal to 1.
In some embodiments, the behavior feature vectors corresponding to the plurality of samples include behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
According to another aspect of the present disclosure, there is also provided a user behavior abnormality detection apparatus including: a memory; and a processor coupled to the memory, the processor configured to perform the user behavior anomaly detection method as described above based on instructions stored in the memory.
According to another aspect of the disclosure, there is also provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described user behavior anomaly detection method.
Compared with the prior art, the disclosed method and apparatus, based on the PR algorithm principle, convert outlier detection into the problem of ranking the importance of link relationships between feature vectors and construct a link graph between users or between the behaviors of a single user, so that users with abnormal behaviors are found and the accuracy of abnormal behavior detection is improved.
Other features of the present disclosure and its advantages will become apparent from the following detailed description of exemplary embodiments of the disclosure, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The disclosure may be more clearly understood from the following detailed description taken in conjunction with the accompanying drawings in which:
fig. 1 is a flow diagram of some embodiments of a user behavior anomaly detection method of the present disclosure.
Fig. 2 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
Fig. 3 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
Fig. 4 is a schematic structural diagram of some embodiments of a user behavior anomaly detection device of the present disclosure.
Fig. 5 is a schematic structural diagram of other embodiments of a user behavior abnormality detection device of the present disclosure.
Fig. 6 is a schematic structural diagram of other embodiments of a user behavior abnormality detection device of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, numerical expressions and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless it is specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective parts shown in the drawings are not drawn in actual scale for convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to one of ordinary skill in the relevant art may not be discussed in detail, but should be considered part of the specification where appropriate.
In all examples shown and discussed herein, any specific values should be construed as merely illustrative, and not a limitation. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further discussion thereof is necessary in subsequent figures.
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
Fig. 1 is a flow diagram of some embodiments of a user behavior anomaly detection method of the present disclosure.
In step 110, the user behavior log is analyzed to obtain behavior feature vectors corresponding to the plurality of samples. The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
For example, user behavior data is collected from the logs, feature engineering is performed, and the behaviors of different users, or the periodic behaviors of the same user, are constructed into feature vectors forming an m×n feature space, where n is the number of features and m is the number of samples.
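As a minimal sketch of this step (the column names and counts below are illustrative assumptions, not features prescribed by this disclosure), the m×n feature matrix could be assembled as follows:

```python
import numpy as np
import pandas as pd

# Hypothetical per-sample aggregates parsed from the user behavior log.
# One row per sample (a user, or one time point of the same user).
log_features = pd.DataFrame({
    "sample":         ["u1", "u2", "u3", "u4"],
    "login_count":    [12, 10, 11, 3],
    "download_count": [5, 6, 4, 40],
    "query_count":    [30, 28, 33, 200],
})

# m x n feature matrix: m samples, n features.
X = log_features.drop(columns="sample").to_numpy(dtype=float)
m, n = X.shape
print(m, n)  # -> 4 3
```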
In step 120, a similarity value between the behavioral characteristic vectors of each two samples is calculated.
For example, a cosine similarity value, a distance-based similarity value, or the like is calculated between every two behavior feature vectors.
At step 130, each calculated similarity value is used as a weight value between the corresponding behavior feature vectors in the PR (PageRank) algorithm.
At step 140, iterative calculation of the PR algorithm is performed, generating a PR value ranking result for the behavior feature vectors. The initial PR value of the behavior feature vector of each sample in the PR algorithm is the same; for example, the initial PR value corresponding to the behavior feature vector of each sample is 1/m.
Steps 130 and 140 rank the importance of the link relationships between feature vectors.
In some embodiments, the PR algorithm iteratively updates the PR values: PR_i^{t+1}, the PR value corresponding to the behavior feature vector of sample i at iteration t+1, is computed from the values PR_j^t, the PR value corresponding to the behavior feature vector of sample j at iteration t, weighted by S_ij, the similarity value between the behavior feature vector of sample i and the behavior feature vector of sample j, where m is the number of samples, i ≠ j, and t, i and j are positive integers greater than or equal to 1.
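The update equation itself is not reproduced in this text, so the sketch below assumes a plain weighted-PageRank iteration over the fully connected similarity graph (each sample j redistributes its PR value to the other samples in proportion to its similarity weights, with no damping factor); the function name and signature are illustrative:

```python
import numpy as np

def pagerank_by_similarity(S: np.ndarray, n_iter: int = 100, tol: float = 1e-9) -> np.ndarray:
    """Iteratively compute PR values over a similarity-weighted, fully connected graph.

    S[i, j] is the similarity between samples i and j, with S[i, i] = 0.
    Assumed update rule (consistent with, but not copied from, this disclosure):
        PR_i^{t+1} = sum_{j != i} (S_ij / sum_{k != j} S_jk) * PR_j^t
    Every PR value is initialized to 1/m; on a fully connected graph the
    iteration converges.
    """
    m = S.shape[0]
    pr = np.full(m, 1.0 / m)                 # initial PR value of each sample is 1/m
    W = S / S.sum(axis=1, keepdims=True)     # W[j, i] = S_ji / sum_k S_jk
    for _ in range(n_iter):
        new_pr = W.T @ pr                    # PR_i = sum_j W[j, i] * PR_j
        converged = np.abs(new_pr - pr).sum() < tol
        pr = new_pr
        if converged:
            break
    return pr
```

Under this assumed update each row of W sums to 1, so the total PR mass stays at 1 across iterations, which is what lets the values be compared as a ranking.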
In step 150, abnormal behavior is determined based on the PR value ordering result.
In some embodiments, the user behavior corresponding to the behavior feature vector with the PR value less than the first threshold is taken as the abnormal behavior.
In some embodiments, a behavior feature vector with a PR value greater than a second threshold may also be obtained from the PR value ordering result; and if the duration time of the behavior feature vector with the PR value larger than the second threshold value is larger than the time threshold value, taking the user behavior corresponding to the behavior feature vector with the PR value larger than the second threshold value as the abnormal behavior.
For example, a user may previously have browsed and downloaded information normally; then, during one month, the account is stolen or the user is preparing to leave the company, and a small number of queries and downloads of sensitive data are performed continually every day of that month. During that month the user's behavior may be even more regular than before, with higher similarity, so the behavior ranks nearer the front of the PR value ranking; it is therefore necessary to intercept behavior feature vectors whose PR values are greater than the second threshold and whose duration exceeds the time threshold. This process can detect low-frequency, persistent attack behavior that on the surface shows no anomalies.
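A minimal sketch of both checks described above; the threshold parameters and the high_pr_periods input are hypothetical placeholders, since this disclosure does not fix concrete values:

```python
import numpy as np

def detect_abnormal_samples(pr: np.ndarray,
                            high_pr_periods: np.ndarray,
                            first_threshold: float,
                            second_threshold: float,
                            time_threshold: int):
    """Flag abnormal samples from the PR value ranking.

    pr[i]              -- converged PR value of sample i
    high_pr_periods[i] -- number of consecutive periods in which sample i's
                          PR value has stayed above second_threshold
    """
    # Check 1: a low PR value means the behavior is dissimilar to the rest.
    low_pr = np.where(pr < first_threshold)[0]
    # Check 2: a persistently high PR value can hide a low-frequency,
    # continuous attack (e.g. small daily downloads of sensitive data).
    persistent_high = np.where((pr > second_threshold) &
                               (high_pr_periods > time_threshold))[0]
    return low_pr.tolist(), persistent_high.tolist()
```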
In this embodiment, based on the PR algorithm principle, outlier detection is converted into the problem of ranking the importance of link relationships between feature vectors, and a link graph between users or between user behaviors is constructed, so that users with abnormal behaviors are found and the accuracy of abnormal behavior detection is improved.
Fig. 2 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
At step 210, user behavior log data is collected. The user behavior log data includes authentication system logs, database logs, web access logs, and the like. The behavior log data may be subjected to preprocessing such as cleaning, conversion, association, and missing-value handling.
In step 220, the user behavior log is analyzed to generate behavior feature vectors for a plurality of users.
In step 230, a cosine similarity value between each two behavior feature vectors is calculated.
In some embodiments, the cosine similarity values are normalized to [0, 1], where 0 represents dissimilarity and 1 represents similarity. For example, for behavior feature vectors A and B, the cosine similarity is calculated as cos(A, B) = (A · B) / (‖A‖ ‖B‖).
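A minimal sketch of this step; mapping the raw cosine value from [-1, 1] to [0, 1] with (cos + 1) / 2 is an assumption, since the exact normalization is not spelled out here:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """cos(A, B) = (A . B) / (||A|| * ||B||)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def normalized_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Map cosine similarity to [0, 1]: 0 means dissimilar, 1 means similar.
    The (cos + 1) / 2 mapping is an assumption, not taken from this text."""
    return (cosine_similarity(a, b) + 1.0) / 2.0

def similarity_matrix(X: np.ndarray) -> np.ndarray:
    """Pairwise similarity matrix S for an m x n feature matrix X (S[i, i] = 0)."""
    m = X.shape[0]
    S = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            S[i, j] = S[j, i] = normalized_similarity(X[i], X[j])
    return S
```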
At step 240, the PR values of the behavior feature vectors are calculated iteratively, where PR_i^{t+1} is the PR value corresponding to the behavior feature vector of user i at iteration t+1, PR_j^t is the PR value corresponding to the behavior feature vector of user j at iteration t, S_ij is the similarity value between the behavior feature vector of user i and the behavior feature vector of user j, m is the number of users, i ≠ j, and t, i and j are positive integers greater than or equal to 1; the PR value corresponding to the behavior feature vector of each user is initialized to 1/m. Since the PR algorithm in this embodiment operates on a fully connected network, the iteration eventually converges.
At step 250, the behavior feature vectors are ordered from large to small by PR value. The PR value of each user behavior feature vector is a similarity importance value.
In some embodiments, the behavior feature vectors may also be ordered from small to large in PR value.
In step 260, the user corresponding to the behavior feature vector with the PR value smaller than the first threshold value in the PR value sorting result is used as the user with abnormal behavior.
That is, K values are intercepted from the tail of the ranking as abnormal values, and the users corresponding to these abnormal values are the users with abnormal behaviors.
In the above embodiment, after the behavior feature vectors of a plurality of users are extracted, the cosine similarity between the feature vectors is calculated and used as the weight value in the PR algorithm, PR value ranking is performed over the feature vectors, and K feature vectors are intercepted from the ranking according to a threshold and treated as anomalies; the method is therefore suitable for detecting abnormal individual behavior within a group of users.
Fig. 3 is a flow chart illustrating other embodiments of a method for detecting user behavior anomalies according to the present disclosure.
At step 310, user behavior log data is collected.
In step 320, a user behavior log is analyzed to generate behavior feature vectors for multiple points in time for the same user.
In step 330, the cosine similarity between the behavior feature vectors is calculated to obtain the similarity value of behaviors between different time points of the same user.
In step 340, the PR values of the behavior feature vectors are calculated iteratively, where PR_i^{t+1} is the PR value corresponding to the behavior feature vector of time point i at iteration t+1, PR_j^t is the PR value corresponding to the behavior feature vector of time point j at iteration t, S_ij is the similarity value between the behavior feature vector of time point i and the behavior feature vector of time point j, m is the number of collected time points, i ≠ j, and t, i and j are positive integers greater than or equal to 1; the PR value of the behavior feature vector corresponding to each time point is initialized to 1/m.
At step 350, the behavior feature vectors are ordered from large to small by PR value. The PR value of each user behavior feature vector is a similarity importance value.
In step 360, a time point corresponding to the behavior feature vector with the PR value smaller than the first threshold in the PR value sorting result is used as a time point of the abnormal behavior of the user.
In the above embodiment, after the behavior feature vectors of multiple time points of the same user are extracted, the cosine similarity between the feature vectors is calculated and used as the weight value in the PR algorithm, PR value ranking is performed over the feature vectors, and K feature vectors are intercepted from the ranking according to a threshold and treated as anomalies; the method is therefore suitable for detecting the abnormal behavior of a single user within a given time period.
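As a usage sketch for this single-user case, reusing the hypothetical similarity_matrix and pagerank_by_similarity helpers sketched for the earlier steps (the feature rows and the choice K = 1 are illustrative):

```python
import numpy as np

# Hypothetical daily behavior features for one user over six time points,
# e.g. [login_count, download_count, query_count] per day.
X_t = np.array([
    [10.0,  3.0,  25.0],
    [11.0,  4.0,  27.0],
    [ 9.0,  3.0,  24.0],
    [10.0,  5.0,  26.0],
    [ 2.0, 30.0, 150.0],   # an unusual day
    [11.0,  4.0,  26.0],
])

S = similarity_matrix(X_t)          # step 330: pairwise normalized cosine similarity
pr = pagerank_by_similarity(S)      # step 340: iterative PR values, initialized to 1/m
ranking = np.argsort(pr)[::-1]      # step 350: time points ordered from large to small PR

K = 1                               # step 360: intercept the K smallest PR values
abnormal_time_points = np.argsort(pr)[:K]
print(abnormal_time_points)         # expected to single out the unusual day
```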
Fig. 4 is a schematic structural diagram of some embodiments of a user behavior anomaly detection device of the present disclosure. The apparatus includes a behavior feature vector acquisition unit 410, a similarity value calculation unit 420, a PR value ordering unit 430, and an abnormal behavior recognition unit 440.
The behavior feature vector acquisition unit 410 is configured to analyze the user behavior log and acquire behavior feature vectors corresponding to the plurality of samples.
The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to the plurality of users; or the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
For example, user behavior data is collected from the logs, feature engineering is performed, and the behaviors of different users, or the periodic behaviors of the same user, are constructed into feature vectors forming an m×n feature space, where n is the number of features and m is the number of samples.
The similarity value calculation unit 420 is configured to calculate a similarity value between the behavior feature vectors of each two samples.
For example, a similarity value between behavior feature vectors of every two samples is calculated based on a cosine similarity algorithm.
The PR value ordering unit 430 is configured to use each calculated similarity value as a weight value between corresponding behavior feature vectors of the web page ranking PR algorithm; and performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial values of PR values of the behavior feature vectors of each sample in the PR algorithm are the same.
In some embodiments, the PR algorithm iteratively updates the PR values: PR_i^{t+1}, the PR value corresponding to the behavior feature vector of sample i at iteration t+1, is computed from the values PR_j^t, the PR value corresponding to the behavior feature vector of sample j at iteration t, weighted by S_ij, the similarity value between the behavior feature vector of sample i and the behavior feature vector of sample j, where m is the number of samples, i ≠ j, and t, i and j are positive integers greater than or equal to 1.
If the behavior feature vectors corresponding to the plurality of samples are the behavior feature vectors corresponding to a plurality of users, then PR_i^{t+1} is the PR value corresponding to the behavior feature vector of user i at iteration t+1, PR_j^t is the PR value corresponding to the behavior feature vector of user j at iteration t, S_ij is the similarity value between the behavior feature vector of user i and the behavior feature vector of user j, m is the number of users, i ≠ j, t, i and j are positive integers greater than or equal to 1, and the PR value corresponding to the behavior feature vector of each user is initialized to 1/m.
If the behavior feature vectors corresponding to the plurality of samples are the behavior feature vectors corresponding to a plurality of time points of the same user, then PR_i^{t+1} is the PR value corresponding to the behavior feature vector of time point i at iteration t+1, PR_j^t is the PR value corresponding to the behavior feature vector of time point j at iteration t, S_ij is the similarity value between the behavior feature vector of time point i and the behavior feature vector of time point j, m is the number of collected time points, i ≠ j, t, i and j are positive integers greater than or equal to 1, and the PR value of the behavior feature vector corresponding to each time point is initialized to 1/m.
The abnormal behavior recognition unit 440 is configured to determine an abnormal behavior according to the PR value ordering result.
In some embodiments, the user behavior corresponding to the behavior feature vector with the PR value less than the first threshold is taken as the abnormal behavior.
In some embodiments, a behavior feature vector with a PR value greater than a second threshold may also be obtained from the PR value ordering result; and if the duration time of the behavior feature vector with the PR value larger than the second threshold value is larger than the time threshold value, taking the user behavior corresponding to the behavior feature vector with the PR value larger than the second threshold value as the abnormal behavior.
In this embodiment, based on the PR algorithm principle, outlier detection is converted into the problem of ranking the importance of link relationships between feature vectors, and a link graph between users or between user behaviors is constructed, so that users with abnormal behaviors are found and the accuracy of abnormal behavior detection is improved.
Fig. 5 is a schematic structural diagram of other embodiments of a user behavior abnormality detection device of the present disclosure. The device comprises: memory 510 and processor 520, wherein: memory 510 may be a magnetic disk, flash memory, or any other non-volatile storage medium. The memory is used to store instructions in the embodiments corresponding to figures 1-3. Processor 520 is coupled to memory 510 and may be implemented as one or more integrated circuits, such as a microprocessor or microcontroller. The processor 520 is configured to execute instructions stored in the memory.
In some embodiments, as also shown in FIG. 6, the apparatus 600 includes a memory 610 and a processor 620. Processor 620 is coupled to memory 610 through BUS 630. The device 600 may also be coupled to external storage 650 via a storage interface 640 for invoking external data, and may also be coupled to a network or another computer system (not shown) via a network interface 660, not described in detail herein.
In this embodiment, the instructions are stored in the memory and then processed by the processor, so that the accuracy of abnormal behavior detection is improved.
In other embodiments, a computer readable storage medium has stored thereon computer program instructions which, when executed by a processor, implement the steps of the method of the corresponding embodiments of fig. 1-3. It will be apparent to those skilled in the art that embodiments of the present disclosure may be provided as a method, apparatus, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present disclosure is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Thus far, the present disclosure has been described in detail. In order to avoid obscuring the concepts of the present disclosure, some details known in the art are not described. How to implement the solutions disclosed herein will be fully apparent to those skilled in the art from the above description.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (8)

1. A method for detecting user behavior anomalies, comprising:
analyzing the user behavior log to obtain behavior feature vectors corresponding to the plurality of samples;
calculating a similarity value between the behavior feature vectors of every two samples;
each calculated similarity value is used as a weight value between corresponding behavior feature vectors of a webpage ranking PR algorithm;
performing PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial PR values of the behavior feature vectors of all samples in the PR algorithm are the same, and in the PR algorithm formula PR_i^{t+1} is the PR value corresponding to the behavior feature vector of sample i at iteration t+1, PR_j^t is the PR value corresponding to the behavior feature vector of sample j at iteration t, S_ij is the similarity value between the behavior feature vector of sample i and the behavior feature vector of sample j, m is the number of samples, i ≠ j, and t, i and j are positive integers greater than or equal to 1;
and determining abnormal behaviors according to the PR value sequencing result.
2. The user behavior abnormality detection method according to claim 1, wherein,
the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of users;
or alternatively
The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
3. The user behavior abnormality detection method according to claim 1 or 2, wherein determining an abnormal behavior according to the PR value ordering result includes:
and taking the user behaviors corresponding to the behavior feature vectors with PR values smaller than the first threshold in the PR value sequencing result as abnormal behaviors.
4. The user behavior abnormality detection method according to claim 1 or 2, wherein determining an abnormal behavior according to the PR value ordering result includes:
acquiring a behavior feature vector with PR value larger than a second threshold value in the PR value sorting result;
and if the duration time of the behavior feature vector with the PR value larger than the second threshold value is larger than the time threshold value, taking the user behavior corresponding to the behavior feature vector with the PR value larger than the second threshold value as the abnormal behavior.
5. A user behavior abnormality detection apparatus comprising:
the behavior feature vector acquisition unit is configured to analyze the user behavior log and acquire behavior feature vectors corresponding to the plurality of samples;
a similarity value calculation unit configured to calculate a similarity value between behavior feature vectors of each two samples;
a PR value ordering unit configured to use each calculated similarity value as a weight value between corresponding behavior feature vectors of a webpage ranking PR algorithm, and to perform PR algorithm iterative computation to generate PR value sequencing results of the behavior feature vectors, wherein the initial PR values of the behavior feature vectors of all samples in the PR algorithm are the same, and in the PR algorithm formula PR_i^{t+1} is the PR value corresponding to the behavior feature vector of sample i at iteration t+1, PR_j^t is the PR value corresponding to the behavior feature vector of sample j at iteration t, S_ij is the similarity value between the behavior feature vector of sample i and the behavior feature vector of sample j, m is the number of samples, i ≠ j, and t, i and j are positive integers greater than or equal to 1;
and an abnormal behavior recognition unit configured to determine an abnormal behavior according to the PR value ordering result.
6. The user behavior abnormality detection device according to claim 5, wherein,
the behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of users;
or alternatively
The behavior feature vectors corresponding to the plurality of samples comprise behavior feature vectors corresponding to a plurality of time points of the same user.
7. A user behavior abnormality detection apparatus comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the user behavior anomaly detection method of any one of claims 1 to 4 based on instructions stored in the memory.
8. A computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the user behaviour anomaly detection method of any one of claims 1 to 4.
CN201911178056.5A 2019-11-27 2019-11-27 User behavior abnormality detection method and device Active CN112861891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911178056.5A CN112861891B (en) 2019-11-27 2019-11-27 User behavior abnormality detection method and device


Publications (2)

Publication Number Publication Date
CN112861891A CN112861891A (en) 2021-05-28
CN112861891B (en) 2023-11-28

Family

ID=75985477

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911178056.5A Active CN112861891B (en) 2019-11-27 2019-11-27 User behavior abnormality detection method and device

Country Status (1)

Country Link
CN (1) CN112861891B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117744076B (en) * 2024-02-06 2024-04-16 江苏开博科技有限公司 Bank database system intrusion detection method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104731914A (en) * 2015-03-24 2015-06-24 浪潮集团有限公司 Method for detecting user abnormal behavior based on behavior similarity
CN105183784A (en) * 2015-08-14 2015-12-23 天津大学 Content based junk webpage detecting method and detecting apparatus thereof
CN105847302A (en) * 2016-05-31 2016-08-10 北京奇艺世纪科技有限公司 Abnormity detection method and device
CN107992738A (en) * 2017-11-16 2018-05-04 北京奇艺世纪科技有限公司 A kind of account logs in method for detecting abnormality, device and electronic equipment
CN108595655A (en) * 2018-04-27 2018-09-28 福建师范大学 A kind of abnormal user detection method of dialogue-based characteristic similarity fuzzy clustering
CN110297714A (en) * 2019-06-19 2019-10-01 上海冰鉴信息科技有限公司 The method and device of PageRank is obtained based on large-scale graph data collection

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10129274B2 (en) * 2016-09-22 2018-11-13 Adobe Systems Incorporated Identifying significant anomalous segments of a metrics dataset
EP3477906B1 (en) * 2017-10-26 2021-03-31 Accenture Global Solutions Limited Systems and methods for identifying and mitigating outlier network activity


Also Published As

Publication number Publication date
CN112861891A (en) 2021-05-28


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant