CN111491300B - Risk detection method, apparatus, device and storage medium - Google Patents

Risk detection method, apparatus, device and storage medium

Publication number
CN111491300B
Authority
CN
China
Prior art keywords
user
login
label
user authentication
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010165127.4A
Other languages
Chinese (zh)
Other versions
CN111491300A (en)
Inventor
钱湖海
鲁银冰
徐悦
常嘉岳
周旭莹
钱成
韩凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Hangzhou Information Technology Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN202010165127.4A priority Critical patent/CN111491300B/en
Publication of CN111491300A publication Critical patent/CN111491300A/en
Application granted granted Critical
Publication of CN111491300B publication Critical patent/CN111491300B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W12/00Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/12Detection or prevention of fraud
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic

Abstract

The embodiment of the invention relates to the technical field of communication and discloses a risk detection method, which comprises the following steps: constructing a user authentication graph according to the user login data set; classifying the users in the user authentication graph by adopting a matrix clustering algorithm; determining the labels of the users in the classified user authentication graphs by adopting an adaptive label propagation algorithm; judging whether the user has risk according to the label. The embodiment of the invention also provides a risk detection device, equipment and a storage medium. The risk detection method, the risk detection device, the risk detection equipment and the risk detection storage medium provided by the embodiment of the invention are independent of updating the behavior rule base, can improve the risk detection rate and ensure the service safety.

Description

Risk detection method, apparatus, device and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a risk detection method, apparatus, device, and storage medium.
Background
The mobile internet brings much convenience to people's work and life. Because mobile internet services have many complex application scenarios, the risks they face have become diverse. Abnormal behaviors such as fake-order brushing, zombie sock-puppet accounts, modem-pool numbers, credential stuffing, botnets, account theft and interface attacks must be effectively controlled to ensure the security of the mobile internet industry.
However, the inventors found that the prior art has at least the following problems: at present, most of the existing methods for managing and controlling abnormal behaviors rely on a behavior rule base to detect risks, and if the behavior rule base is not updated in time, the risks are possibly missed, so that risks are brought to businesses.
Disclosure of Invention
The embodiment of the invention aims to provide a risk detection method, a risk detection device and a risk detection storage medium, so that detection does not depend on a behavior rule base, the risk detection rate is improved, and the safety of a service is ensured.
In order to solve the above technical problems, an embodiment of the present invention provides a risk detection method, including: constructing a user authentication graph according to the user login data set; classifying the users in the user authentication graph by adopting a matrix clustering algorithm; determining the labels of the users in the classified user authentication graphs by adopting a self-adaptive label propagation algorithm; judging whether the user has risk according to the label.
The embodiment of the invention also provides a risk detection device, which comprises: the authentication diagram construction module is used for constructing a user authentication diagram according to the user login data set; the authentication graph classification module is used for classifying users in the user authentication graph by adopting a matrix clustering algorithm; the label determining module is used for determining labels of users in the classified user authentication graphs by adopting a self-adaptive label propagation algorithm; and the risk judging module is used for judging whether the user has risk according to the label.
The embodiment of the invention also provides a network device, which comprises: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the risk detection method described above.
The embodiment of the invention also provides a computer readable storage medium which stores a computer program, and the computer program realizes the risk detection method when being executed by a processor.
Compared with the prior art, the embodiment of the invention constructs the user authentication graph according to the user login data set, classifies the users in the user authentication graph by adopting a matrix clustering algorithm, determines the labels of the users in the classified user authentication graph by adopting an adaptive label propagation algorithm, and judges whether the users have risks according to the labels. Classifying the users in the user authentication graph with the matrix clustering algorithm partitions the data into blocks and reduces the complexity of the algorithm; determining the labels of the users in the classified user authentication graph with the adaptive label propagation algorithm gives more accurate calculation results, so that users with risks can be determined rapidly. In addition, the method does not depend on updating a behavior rule base, so the possibility of missed detection is reduced, the risk detection rate is improved, and the service security is ensured.
In addition, the user login data set includes login IDs, login device IDs, and login IPs of a plurality of users. Since the login ID, login device ID, and login IP can be acquired through the application interface, they are easily acquired to form a user login data set; the user login data set consists of login ID, login equipment ID and login IP, so that the algorithm effect can be ensured, the information quantity is not excessive, the problem of excessive calculation quantity is not caused even if more users exist, and the efficiency of the risk detection method is improved.
In addition, constructing a user authentication graph from the user login data set includes: deduplicating and sorting the user login data set to form user nodes, wherein each user node comprises an index and an attribute, and the attribute is a login ID, a login device ID or a login IP; and determining the direction of the user nodes according to the index and the attribute of each user node to form the user authentication graph. Deduplication removes repeated data from the user login data set, and indexes and attributes are established according to the sort order, which makes it convenient to classify the user authentication graph later by matrix clustering. Determining the direction of the user nodes from the index and the attribute yields a directed user authentication graph, so that the users in it can be classified effectively.
In addition, classifying the users in the user authentication graph by adopting a matrix clustering algorithm specifically comprises: classifying the users in the user authentication graph into different categories according to the matrix clustering algorithm, wherein the matrix clustering algorithm places in the same category the users corresponding to rows of the calculated value D whose non-0 values lie in the same columns, where $D = (I - \alpha A)^{-1}$, I is the identity matrix, A is the adjacency matrix of the user authentication graph, and $\alpha$ is a preset coefficient. Classifying the users by the calculated value D reduces the complexity of the matrix calculation and simplifies the classification of the users in the user authentication graph.
In addition, determining the labels of the users in the classified user authentication graph by adopting the adaptive label propagation algorithm comprises the following steps: determining positive and negative attributes of first user nodes in the classified user authentication graph, and initializing the label weight of the first user nodes, wherein the first user nodes are a plurality of user nodes that form part of the user nodes in the user authentication graph; iteratively updating the labels of second user nodes according to the positive and negative attributes and the label weights, wherein the second user nodes are the user nodes in the user authentication graph other than the first user nodes; and obtaining the labels of the users in the user authentication graph according to the iteration result. Because the positive and negative attributes of the first user nodes and the initialized label weights are determined first, the labels of the second user nodes can be iteratively updated by the adaptive label propagation algorithm, which is a semi-supervised algorithm, so the labels of the users in the user authentication graph can be acquired rapidly and accurately.
In addition, iteratively updating the labels of the second user nodes according to the positive and negative attributes and the label weights includes: updating the label weight by using the formula given in Figure BDA0002407167070000031 (formula image not reproduced in this text), where $\lambda_i$ is the label weight at the i-th iteration and i is a positive integer greater than 1; and updating the labels of the second user nodes by adopting the updated label weight and the positive and negative attributes.
In addition, before iteratively updating the labels of the second user nodes according to the positive and negative attributes and the label weights, the method further comprises: randomly dropping part of the second user nodes in the user authentication graph according to a preset proportion. Iteratively updating the labels of the second user nodes according to the positive and negative attributes and the label weights is then specifically: iteratively updating the second user nodes that remain after the dropping, according to the positive and negative attributes and the label weights; and if the updated result converges, stopping the iteration, otherwise updating the preset proportion according to a loss formula and returning to the step of randomly dropping part of the second user nodes in the user authentication graph according to the preset proportion.
Drawings
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings.
Fig. 1 is a schematic flow chart of a risk detection method according to a first embodiment of the present invention;
Fig. 2 is a schematic flow chart of a risk detection method according to a second embodiment of the present invention;
Fig. 3 is a schematic flow chart of a risk detection method according to a third embodiment of the present invention;
Fig. 4 is a schematic block diagram of a risk detection apparatus according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a network device according to a fifth embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. Those of ordinary skill in the art will understand that numerous technical details are set forth in the various embodiments in order to provide a better understanding of the present application, and that the technical solutions claimed in the present application can be implemented even without these technical details and with various changes and modifications based on the following embodiments.
The first embodiment of the invention relates to a risk detection method, which constructs a user authentication graph according to a user login data set; classifies the users in the user authentication graph by adopting a matrix clustering algorithm; determines the labels of the users in the classified user authentication graph by adopting an adaptive label propagation algorithm; and judges whether a user has risk according to the label of the user. At present, most risk detection methods for users depend on updating a behavior rule base, and if the behavior rule base is not updated in time, risks are easily missed. The risk detection method provided by the embodiment of the invention does not depend on updating the behavior rule base, so the possibility of missed detection is reduced, the risk detection rate is improved, and the security of the service is ensured.
It should be noted that, the execution body of the risk detection method provided by the embodiment of the present invention is a server, where the server may be implemented by an independent server or a server cluster formed by a plurality of servers. The following description will take a server side as an example.
The specific flow of the risk detection method provided by the embodiment of the invention is shown in fig. 1, and the method comprises the following steps:
S101: Constructing a user authentication graph according to the user login data set.
The user login data set refers to the set of data that the server can obtain when users log in to it, and may include, for example, a user login ID, a login IP, a login time, and the like. Optionally, in order to construct the user authentication graph, the user login data should comprise at least two kinds of data. It should be noted that the user login data set contains the login data of a plurality of users and may be acquired through an application interface.
Specifically, the server side obtains login data of a plurality of users as a user login data set according to the application interface, takes each login data as a node, determines the direction of the node according to a predefined rule, and forms a user authentication graph. The predefined rule may be, for example, that the user login ID points to the login IP, that the user login ID points to the login time, etc., and may be specifically set according to the actual situation, which is not specifically limited herein.
S102: Classifying the users in the user authentication graph by adopting a matrix clustering algorithm.
Since the user authentication graph consists of individual nodes and the connections between them have directions, the graph can be represented as a matrix, so a matrix clustering algorithm can be used to classify the users in the user authentication graph.
The user authentication graph has an adjacency matrix, that is, a matrix that records the adjacency relations among the vertices. The total number of paths between every two nodes can be obtained by repeatedly multiplying the adjacency matrix by itself. If the union C is used to represent the paths of all lengths in the user authentication graph, it can be expressed as:
$C = I \cup A^{1} \cup A^{2} \cup A^{3} \cup A^{4} \cup \cdots$
In the above equation, A represents the adjacency matrix and I represents the identity matrix. If the value at position (i, j) is not 0 in both C and $C^{T}$, node i and node j belong to a strongly associated class, so the users in the user authentication graph can be divided into a plurality of classes according to the matrix clustering algorithm.
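As an illustration of the union C and the strong-association test, the following is a minimal Python sketch, not the patented implementation. It assumes one reading of the text: node i and node j fall in the same class when each can reach the other, that is, when position (i, j) is non-0 in both C and its transpose. The function names and the 4-node example graph are hypothetical.

```python
import numpy as np

def reachability(adj):
    """Boolean C = I ∪ A ∪ A^2 ∪ ... computed with Warshall's closure."""
    n = adj.shape[0]
    c = (np.asarray(adj) != 0) | np.eye(n, dtype=bool)
    for k in range(n):
        # i reaches j if it already does, or if i reaches k and k reaches j
        c |= c[:, k:k + 1] & c[k:k + 1, :]
    return c

def strongly_associated_classes(adj):
    """Group nodes that can reach each other (assumed reading of the text)."""
    c = reachability(adj)
    mutual = c & c.T                      # non-0 in both C and C^T
    classes = {}
    for i in range(adj.shape[0]):
        key = tuple(np.flatnonzero(mutual[i]))
        classes.setdefault(key, []).append(i)
    return list(classes.values())

# Hypothetical graph: 0 <-> 1 form a cycle, 2 -> 3 is a one-way edge.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 0, 0]])
classes = strongly_associated_classes(adj)
print(classes)
```

In this example nodes 0 and 1 end up together because each reaches the other, while nodes 2 and 3 stay separate because the edge between them runs in one direction only.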
It can be understood that users placed in the same class by the matrix clustering algorithm have higher similarity to one another.
In a specific example, since the calculation of C is relatively complex, another calculated value can be used in place of C. That is, S102 may specifically be: classifying the users in the user authentication graph into different categories according to a matrix clustering algorithm, wherein the matrix clustering algorithm places in the same category the users corresponding to rows of the calculated value D whose non-0 values lie in the same columns.
Specifically, $C = I \cup A^{1} \cup A^{2} \cup A^{3} \cup A^{4} \cup \cdots$. Let $D = I + A^{1} + A^{2} + A^{3} + A^{4} + \cdots$. Wherever C is 0, D is also 0, and wherever C is not 0, D is also not 0, so D can be used instead of C for the calculation.
Since the series for D above does not necessarily converge, it needs to be modified. Let $E = D - AD$, i.e. $E = (I + A^{1} + A^{2} + A^{3} + A^{4} + \cdots) - (A^{1} + A^{2} + A^{3} + A^{4} + \cdots) = I = (I - A)D$, and thus $D = (I - A)^{-1}$. Here E is an auxiliary value with no specific meaning, introduced only for the derivation of the formula.
However, $D = (I - A)^{-1}$ does not always converge either, so a preset coefficient $\alpha$ can be introduced and A replaced by $\alpha A$. Then $D = I + (\alpha A)^{1} + (\alpha A)^{2} + (\alpha A)^{3} + (\alpha A)^{4} + \cdots$ and $E = D - \alpha A D$, i.e. $E = (I + (\alpha A)^{1} + (\alpha A)^{2} + (\alpha A)^{3} + (\alpha A)^{4} + \cdots) - ((\alpha A)^{1} + (\alpha A)^{2} + (\alpha A)^{3} + (\alpha A)^{4} + \cdots) = I = (I - \alpha A)D$, which yields $D = (I - \alpha A)^{-1}$. Here I is the identity matrix and A is the adjacency matrix of the user authentication graph.
By calculating the value D, the server can place in the same class the users corresponding to rows of D whose non-0 values lie in the same columns. Two rows of the matrix qualify when their non-0 values occupy the same columns. For example, if the non-0 value in the first row is in the third column and the non-0 value in the third row is also in the third column, then the first row and the third row have their non-0 values in the same column, so the two rows have the same structure and are placed in the same class.
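The calculation of D and the row grouping can be sketched in Python with NumPy as follows. This is a minimal sketch under stated assumptions rather than the patented implementation: the function name `cluster_rows`, the tolerance `tol` used to decide what counts as non-0, and the example graph are all hypothetical. The check on the spectral radius reflects the standard condition under which the series $I + \alpha A + (\alpha A)^2 + \cdots$ converges to $(I - \alpha A)^{-1}$; the text itself only says that $\alpha$ is a preset coefficient.

```python
import numpy as np

def cluster_rows(adj, alpha, tol=1e-12):
    """Compute D = (I - alpha*A)^-1 and group rows by their non-0 columns."""
    a = np.asarray(adj, dtype=float)
    n = a.shape[0]
    # Assumption: alpha is chosen so that the power series converges,
    # which requires alpha * spectral_radius(A) < 1.
    rho = np.max(np.abs(np.linalg.eigvals(a)))
    if alpha * rho >= 1.0:
        raise ValueError("alpha too large: series for D would not converge")
    d = np.linalg.inv(np.eye(n) - alpha * a)
    groups = {}
    for row in range(n):
        # The set of columns holding non-0 values is the row's "structure"
        pattern = tuple(np.flatnonzero(np.abs(d[row]) > tol))
        groups.setdefault(pattern, []).append(row)
    return d, list(groups.values())

# Hypothetical graph: 0 <-> 1 form a cycle, 2 -> 3 is a one-way edge.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 0, 0],
                [0, 0, 0, 1],
                [0, 0, 0, 0]])
d, groups = cluster_rows(adj, alpha=0.5)
print(groups)
```

Rows 0 and 1 share the non-0 columns {0, 1} and are grouped together; row 2 has non-0 columns {2, 3} and row 3 only {3}, so each forms its own group.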
S103: Determining the labels of the users in the classified user authentication graph by adopting an adaptive label propagation algorithm.
The basic label propagation algorithm is a graph-based semi-supervised learning method. Its basic idea is to predict the label information of unidentified nodes from the label information of identified nodes, using the relations among samples to build a complete graph model. The label of each node is propagated to its adjacent nodes according to similarity. At each propagation step, every node updates its label according to the labels of its adjacent nodes: the more similar an adjacent node is to the node, the greater the weight of that adjacent node's influence on the node's label, so the labels of similar nodes tend to agree and propagate more easily between them. During label propagation the labels of the identified data are kept unchanged, so that they are passed on to the unlabeled data. When the iteration finishes, the probability distributions of similar nodes tend to be similar, and these nodes can be grouped into one class.
Since the basic tag propagation algorithm may not have the same results when executed multiple times, the algorithm results are not stable enough, and thus improvements to the basic tag propagation algorithm are needed.
The self-adaptive label propagation algorithm is an improvement on a basic label propagation algorithm, wherein the self-adaptive label propagation algorithm updates the weight by using a preset formula when iterating, so that the weight has a certain loss when iterating each time, and a dropout strategy is adopted when iterating each time, so that the result of the algorithm tends to be stable, and the same result can be obtained even if the self-adaptive label propagation algorithm is executed for a plurality of times. The preset formula and the dropout policy may be set according to actual situations, which is not limited herein.
Specifically, the server may use the adaptive label propagation algorithm to iteratively update the unidentified labels in the classified user authentication graph according to the pre-identified labels and positive and negative samples, and then determine the labels of the users in the classified user authentication graph from the final iteration result. The pre-identified labels and positive and negative samples may be determined by manual marking; a positive sample is a normal user without risk, a negative sample is an abnormal user with risk, and the information corresponding to a label may be, for example, "normal" or "abnormal".
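The propagation step can be illustrated with the following minimal Python sketch. It is not the patented algorithm: the text leaves the weight-update formula and the dropout strategy open, so the geometric decay `lam *= decay`, the damped update rule, the fixed dropout proportion and all names here are assumptions chosen only for illustration. Pre-identified nodes carry a score of +1 (normal) or -1 (abnormal) and are never changed; every other node moves toward the weighted average of its neighbours' scores.

```python
import numpy as np

def propagate_labels(weights, seeds, decay=0.9, dropout=0.2,
                     tol=1e-6, max_iter=200, seed=0):
    """Sketch of label propagation with a decaying weight and node dropout."""
    w = np.asarray(weights, dtype=float)
    n = w.shape[0]
    scores = np.zeros(n)
    is_seed = np.zeros(n, dtype=bool)
    for node, sign in seeds.items():      # pre-identified nodes: +1 or -1
        scores[node] = sign
        is_seed[node] = True
    degree = w.sum(axis=1)
    degree[degree == 0] = 1.0             # isolated nodes keep score 0
    rng = np.random.default_rng(seed)     # fixed seed for repeatable runs
    lam = 1.0                             # initial label weight
    for _ in range(max_iter):
        target = (w @ scores) / degree    # neighbour average per node
        residual = np.max(np.abs(target - scores)[~is_seed], initial=0.0)
        if residual < tol:
            break
        # Hypothetical dropout: skip a random share of unlabeled nodes
        active = ~is_seed & (rng.random(n) >= dropout)
        scores[active] = (1.0 - lam) * scores[active] + lam * target[active]
        lam *= decay                      # hypothetical weight decay
    labels = np.where(scores >= 0, "normal", "abnormal")
    return scores, labels

# Hypothetical graph: two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
w_example = np.zeros((6, 6))
for u, v in [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]:
    w_example[u, v] = w_example[v, u] = 1.0
seeds = {0: 1, 5: -1}                     # node 0 normal, node 5 abnormal
scores, labels = propagate_labels(w_example, seeds)
print(labels)
```

Fixing the random seed is one simple way to make repeated runs return the same result; whether this matches the stabilising effect described for the adaptive algorithm is not something the text confirms.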
S104: Judging whether the user has risk according to the label.
It can be understood that, because the information corresponding to the tag can reflect whether the user is an abnormal user, the server can judge whether the user has risk according to the information corresponding to the tag, so that further protective measures are carried out according to the judging result, and the service safety is ensured. For example, when the information corresponding to the tag is abnormal, it is determined that the user corresponding to the tag is at risk.
Compared with the prior art, the risk detection method provided by the embodiment of the invention constructs the user authentication graph according to the user login data set, classifies the users in the user authentication graph by adopting a matrix clustering algorithm, determines the labels of the users in the classified user authentication graph by adopting an adaptive label propagation algorithm, and judges whether the users have risks according to the labels. Classifying the users in the user authentication graph with the matrix clustering algorithm partitions the data into blocks and reduces the complexity of the algorithm; determining the labels of the users in the classified user authentication graph with the adaptive label propagation algorithm gives more accurate calculation results, so that users with risks can be determined rapidly. In addition, the method does not depend on updating a behavior rule base, so the possibility of missed detection is reduced, the risk detection rate is improved, and the service security is ensured.
A second embodiment of the present invention relates to a risk detection method. The second embodiment is substantially the same as the first embodiment and differs mainly in the following: in the second embodiment, the user login data set includes the login IDs, login device IDs and login IPs of a plurality of users, and S101 of the first embodiment, in which a user authentication graph is constructed from the user login data set, may specifically include: deduplicating and sorting the user login data set to form user nodes, wherein each user node comprises an index and an attribute, and the attribute is a login ID, a login device ID or a login IP; and determining the direction of the user nodes according to the index and the attribute of each user node to form the user authentication graph.
The specific flow of the risk detection method provided by the embodiment of the invention is shown in fig. 2, and specifically comprises the following steps:
s201: the user login data set is reordered to form a user node, wherein the user node comprises an index and an attribute, and the attribute is login ID, login equipment ID or login IP.
S202: and determining the direction of the user node according to the index and the attribute in each user node to form a user authentication graph.
S203: and classifying the users in the user authentication graph by adopting a matrix clustering algorithm.
S204: and determining the labels of the users in the classified user authentication graphs by adopting an adaptive label algorithm.
S205: judging whether the user has risk according to the label.
The S203 to S205 are the same as S102 to S104 in the first embodiment, and specific reference may be made to the first embodiment, and details thereof are not repeated here.
For S201-S202, specifically, the user login data set that the server acquires from the application interface of the service may contain duplicate users, so duplicate user login data needs to be removed. The deduplicated user login data set is then sorted to form user nodes, and each user node is given an index according to the sort order, so that each user node comprises an index and an attribute. The attribute is the login ID, login device ID or login IP used when the user logs in; that is, the user login data set consists of three parts, namely the users' login IDs, login device IDs and login IPs, as shown for example in Table 1 below:
TABLE 1
| User login data item | Field name | Sample value |
| --- | --- | --- |
| User ID | login_id | 18800000000 |
| Login device ID | phone_id | B0047AB874DC30C8CF0CF46D55D3BB8D |
| Login IP | ip | 192.168.0.1 |
Because the information in Table 1 is collected through the application interface, it is easy to obtain and its loss rate is extremely low, so the integrity of the data can be ensured; the information is also accurate and free of noise, and all collected data conforms to the format above.
Optionally, the indexes of the user nodes may be as follows:
1, $\mathrm{login\_id}_{1}$
i, $\mathrm{login\_id}_{i}$
i+j-1, $\mathrm{phone\_id}_{i+j-1}$
i+j+k-1, $\mathrm{phone\_id}_{i+j+k-1}$
i+j+k, $\mathrm{ip}_{i+j+k}$
N, $\mathrm{ip}_{k}$
where 1 ≤ i ≤ N-k+j, 2 ≤ j ≤ k-j, k-j ≤ k ≤ N, and N is the number of rows; in each row the left value is the index and the right value is the attribute.
It should be noted that the above indexes are merely examples, and may be specifically set according to actual needs, which are not limited herein.
The server may determine a piece of user login data according to the indexes of the user nodes, and then determine the direction between the items of that piece of user login data according to the predefined direction of each attribute. For example, suppose the predefined direction is that the login ID points to the login device ID and to the login IP respectively. If the login ID is $\mathrm{login\_id}_{i}$, the login device ID is $\mathrm{phone\_id}_{i+j-1}$ and the login IP is $\mathrm{ip}_{i+k}$, then the directions are $\mathrm{login\_id}_{i} \rightarrow \mathrm{phone\_id}_{i+j-1}$ and $\mathrm{login\_id}_{i} \rightarrow \mathrm{ip}_{i+k}$. The server traverses all the user nodes in this way to form the user authentication graph.
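The construction of the directed user authentication graph can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function name `build_auth_graph`, the 0-based indexes (the listing above counts from 1), the ordering of nodes by attribute type and then by value, and the second sample record are all hypothetical. Edges follow the example direction given in the text, from each login ID to its login device ID and login IP.

```python
def build_auth_graph(records):
    """records: iterable of (login_id, phone_id, ip) tuples."""
    unique = sorted(set(records))                 # deduplicate, then sort
    values = {
        "login_id": sorted({r[0] for r in unique}),
        "phone_id": sorted({r[1] for r in unique}),
        "ip": sorted({r[2] for r in unique}),
    }
    index = {}                                    # (attribute, value) -> index
    nodes = []                                    # index -> (attribute, value)
    for attr in ("login_id", "phone_id", "ip"):
        for value in values[attr]:
            index[(attr, value)] = len(nodes)
            nodes.append((attr, value))
    n = len(nodes)
    adj = [[0] * n for _ in range(n)]             # directed adjacency matrix
    for login_id, phone_id, ip in unique:
        src = index[("login_id", login_id)]
        adj[src][index[("phone_id", phone_id)]] = 1   # login_id -> phone_id
        adj[src][index[("ip", ip)]] = 1               # login_id -> ip
    return nodes, adj

device = "B0047AB874DC30C8CF0CF46D55D3BB8D"
records = [
    ("18800000000", device, "192.168.0.1"),
    ("18800000001", device, "192.168.0.2"),      # made-up second user
    ("18800000000", device, "192.168.0.1"),      # duplicate, removed
]
nodes, adj = build_auth_graph(records)
for row in adj:
    print(row)
```

The resulting adjacency matrix is the A that the matrix clustering step operates on; here the two login IDs share one device node, which is the kind of link the later classification can pick up.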
Compared with the prior art, the risk detection method provided by the embodiment of the invention has the advantages that the login ID, the login equipment ID and the login IP of the user are collected as the user login data set, a large amount of information is not required to be collected, a user authentication graph is constructed according to the user login data to calculate, and even if the number of the included users is large, the problem of excessive calculation amount is not caused; and the user login data set is easy to obtain, the loss rate is low, and the data integrity can be ensured.
A third embodiment of the present invention relates to a risk detection method. The third embodiment is substantially the same as the first embodiment, and differs mainly in that: in S103 in the first embodiment, that is, determining the label of the user in the classified user authentication graph by adopting the adaptive label propagation algorithm, the embodiment of the present invention may specifically include: determining positive and negative attributes of a first user node in the classified user authentication graph, and initializing a tag weight of the first user node, wherein the first user node is part of user nodes in the user authentication graph; iteratively updating labels of second user nodes according to the positive and negative attributes and the label weights of the first user nodes, wherein the second user nodes are other user nodes except the first user nodes in the user authentication graph; and obtaining the labels of the users in the user authentication graph according to the iterative result.
The specific flow of the risk detection method provided by the embodiment of the invention is shown in fig. 3, and specifically comprises the following steps:
S301: Constructing a user authentication graph according to the user login data set.
S302: Classifying the users in the user authentication graph by adopting a matrix clustering algorithm.
S303: Determining positive and negative attributes of first user nodes in the classified user authentication graph, and initializing the label weight of the first user nodes, wherein the first user nodes are a plurality of user nodes that form part of the user nodes in the user authentication graph.
S304: Iteratively updating the labels of second user nodes according to the positive and negative attributes and the label weights, wherein the second user nodes are the user nodes in the user authentication graph other than the first user nodes.
S305: Obtaining the labels of the users in the user authentication graph according to the iteration result.
S306: Judging whether the user has risk according to the label.
Wherein S301, S302, and S306 are the same as S101, S102, and S104 in the first embodiment, and specific reference may be made to the description of the first embodiment, which is not repeated herein.
For S303 to S305, specifically, the positive and negative attributes of a first user node indicate whether the user node is a positive or a negative sample; that is, the positive attribute corresponds to a positive sample (a normal user) and the negative attribute corresponds to a negative sample (an abnormal user). The specific representation may be set according to actual needs and is not limited herein. Optionally, the positive and negative attributes of the first user nodes may be derived from manual pre-identification, and the server determines the positive and negative attributes of the first user nodes in the user authentication graph according to that identification. Optionally, the number of first user nodes may be set according to actual needs and is not limited herein. It should be understood that the positive and negative attributes of a first user node are the label of that first user node. Because the labels of the second user nodes need to be updated according to the label weights of the first user nodes, the server needs to initialize the label weights of the first user nodes; optionally, the label weight of a first user node may be initialized to λ = 1.0.
After the positive and negative attributes of the first user nodes are determined and the label weights of the first user nodes are initialized, the labels of the second user nodes can be calculated by the adaptive label propagation algorithm. Specifically, the labels of the second user nodes are iteratively updated according to the positive and negative attributes and the label weights of the first user nodes, and when the iteration result converges, the labels of all users in the user authentication graph can be determined.
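The propagation from pre-labelled first user nodes to the remaining second user nodes can be illustrated with a minimal sketch. The update rule used here (a weighted vote over neighbouring labels, with +1 for a normal user and -1 for an abnormal user) is an assumption made for illustration; the text does not specify the rule, and the names `propagate_labels`, `neighbors` and `seeds` are hypothetical.

```python
def propagate_labels(neighbors, seeds, seed_weight=1.0, max_iters=100):
    """Spread seed labels over a graph.

    neighbors: dict mapping each node to a list of adjacent nodes.
    seeds: dict mapping first user nodes to +1 (normal) or -1 (abnormal).
    Returns a dict mapping every node to +1, -1, or 0 (still unlabelled).
    """
    labels = {node: seeds.get(node, 0) for node in neighbors}
    for _ in range(max_iters):
        updated = dict(labels)
        for node in neighbors:
            if node in seeds:
                continue  # first user nodes keep their pre-identified label
            score = 0.0
            for nb in neighbors[node]:
                # Assumed rule: seed neighbours vote with the label weight,
                # all other neighbours vote with weight 1.0.
                weight = seed_weight if nb in seeds else 1.0
                score += weight * labels[nb]
            if score > 0:
                updated[node] = 1
            elif score < 0:
                updated[node] = -1
        if updated == labels:  # nothing changed in this pass: converged
            break
        labels = updated
    return labels


# Two separate groups: a - b - c (seed a is normal) and d - e (seed e is abnormal).
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}
seeds = {"a": 1, "e": -1}
labels = propagate_labels(neighbors, seeds, seed_weight=1.0)
```

In this toy graph the normal label reaches `b` and then `c`, while `d` inherits the abnormal label from `e`.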
Optionally, when the adaptive label propagation algorithm is employed, the label weight may be updated by using the formula
[Formula image: Figure BDA0002407167070000111]
so that a portion of the label weight is lost at each iteration, where λ_i is the label weight at the i-th iteration. During iteration, the label of the second user node is updated by using the updated label weight and the positive and negative attributes.
Optionally, before the labels of the second user nodes are iteratively updated according to the positive and negative attributes and the label weights of the first user nodes, a part of the second user nodes in the user authentication graph may be randomly removed according to a preset proportion. In this case, iteratively updating the labels of the second user nodes according to the positive and negative attributes and the label weights of the first user nodes is specifically: iteratively updating the second user nodes that remain after the removal according to the positive and negative attributes and the label weights of the first user nodes; if the updated result converges, stopping the iteration; otherwise, updating the preset proportion according to a loss formula and returning to the step of randomly removing a part of the second user nodes in the user authentication graph according to the preset proportion. Because the preset proportion is updated according to the loss formula during iteration, fewer second user nodes are removed in each successive iteration. Preferably, the preset proportion is 20%, and the loss formula is d_i = δ·d_{i-1}, where, preferably, δ = 0.9.
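As a worked illustration of the loss formula with the preferred values, assume the iterations are indexed so that d_0 is the initial proportion of 20%. This indexing is an assumption; the text gives only the recurrence and the preferred values.

```latex
\begin{aligned}
d_i &= \delta\, d_{i-1} = \delta^{i} d_0, \qquad d_0 = 0.2,\ \delta = 0.9,\\
d_1 &= 0.9 \times 0.2 = 0.18,\\
d_2 &= 0.9 \times 0.18 = 0.162,\\
d_3 &= 0.9 \times 0.162 = 0.1458 .
\end{aligned}
```

Under this reading the removed fraction shrinks geometrically, so later iterations update almost all of the second user nodes.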
Specifically, after determining the positive and negative attributes of the first user nodes and initializing the label weights, the server removes a part of the second user nodes in the user authentication graph according to the preset proportion, and then iteratively updates the labels of the second user nodes that remain after the removal. If the updated result converges, the iteration stops. If the updated result does not converge, the server repeats the following steps until the updated result converges: updating the preset proportion according to the loss formula, randomly removing a part of the second user nodes in the user authentication graph according to the preset proportion, and iteratively updating the remaining second user nodes according to the positive and negative attributes and the label weights of the first user nodes.
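The loop described above (remove, update, test for convergence, shrink the proportion, repeat) can be sketched as follows. This is a simplified illustration under several assumptions that are not taken from the text: labels are +1 or -1 with 0 for "not yet labelled", a node takes the sign of the sum of its neighbours' labels, all label weights are fixed at 1, and convergence means a round in which no label changes. The names `chain` and `propagate_with_removal` are hypothetical.

```python
import random


def chain(prefix, length):
    """Build an undirected path graph prefix0 - prefix1 - ... as an adjacency dict."""
    names = [f"{prefix}{i}" for i in range(length)]
    graph = {name: [] for name in names}
    for left, right in zip(names, names[1:]):
        graph[left].append(right)
        graph[right].append(left)
    return graph


def propagate_with_removal(neighbors, seeds, ratio=0.2, delta=0.9,
                           max_rounds=100, seed=0):
    """Label propagation in which a shrinking random share of nodes skips each round."""
    rng = random.Random(seed)
    labels = {node: seeds.get(node, 0) for node in neighbors}
    second_nodes = [node for node in neighbors if node not in seeds]
    for _ in range(max_rounds):
        # Randomly remove a share of the second user nodes for this round.
        removed = set(rng.sample(second_nodes, int(len(second_nodes) * ratio)))
        updated = dict(labels)
        for node in second_nodes:
            if node in removed:
                continue  # removed nodes are not updated in this round
            score = sum(labels[nb] for nb in neighbors[node])
            if score != 0:
                updated[node] = 1 if score > 0 else -1
        if updated == labels:  # assumed convergence test: no label changed
            break
        labels = updated
        ratio *= delta  # loss formula d_i = delta * d_(i-1)
    return labels


# Two disconnected chains, one seeded as normal and one as abnormal.
neighbors = {**chain("p", 6), **chain("n", 6)}
seeds = {"p0": 1, "n0": -1}
labels = propagate_with_removal(neighbors, seeds)
```

Because the removal is random, the number of rounds needed varies with the random seed, and a round in which every node that would have changed happens to be removed can end the loop early; a looser stopping rule may be substituted for the exact-equality test.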
During iteration, the server can judge convergence according to the calculation result: if a prescribed proportion of the labels in the result of the new iteration are the same as in the result of the previous iteration, the iteration can be stopped. Optionally, the prescribed proportion is 95%.
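A minimal sketch of this stopping test, assuming the labels of each iteration are held in a dictionary keyed by user node (the function name `has_converged` and the data layout are illustrative assumptions):

```python
def has_converged(previous, current, required_agreement=0.95):
    """Return True when enough labels are unchanged between two iterations.

    previous, current: dicts mapping each user node to its label.
    required_agreement: prescribed proportion of identical labels (95% by default).
    """
    if not current:
        return True  # nothing to compare, treat as converged
    same = sum(1 for node, label in current.items() if previous.get(node) == label)
    return same / len(current) >= required_agreement


before = {f"u{i}": 1 for i in range(20)}
one_changed = dict(before, u0=-1)          # 19 of 20 labels agree
two_changed = dict(before, u0=-1, u1=-1)   # 18 of 20 labels agree
```

With 20 nodes, one changed label gives exactly 95% agreement and stops the iteration, while two changed labels give 90% and do not.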
Compared with the prior art, the risk detection method provided by this embodiment of the invention uses the adaptive label propagation algorithm to determine the labels of the users in the classified user authentication graph, so that the labels of the users can be determined quickly, the result of the algorithm tends to be stable, and the accuracy of risk detection is improved.
The steps of the above methods are divided only for clarity of description. When implemented, they may be combined into one step, or a step may be split into multiple steps; as long as the same logical relationship is included, they are all within the protection scope of this patent. Adding insignificant modifications to the algorithm or flow, or introducing insignificant designs, without altering the core design of the algorithm and flow, is also within the protection scope of this patent.
A fourth embodiment of the present invention relates to a risk detection apparatus 400, as shown in fig. 4, including: an authentication graph construction module 401, an authentication graph classification module 402, a tag determination module 403, and a risk determination module 404.
an authentication graph construction module 401, configured to construct a user authentication graph according to a user login data set;
an authentication graph classification module 402, configured to classify the users in the user authentication graph by using a matrix clustering algorithm;
a tag determination module 403, configured to determine the labels of the users in the classified user authentication graph by using an adaptive label propagation algorithm;
and a risk determination module 404, configured to judge whether a user is at risk according to the label.
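The way the four modules hand results to one another can be sketched as a small pipeline. This is only an architectural illustration: the class name `RiskDetectionApparatus`, the callable-based wiring, and the stub functions in the example are all assumptions, not the implementation of the apparatus.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class RiskDetectionApparatus:
    build_graph: Callable[[Any], Any]                      # role of module 401
    classify: Callable[[Any], Any]                         # role of module 402
    determine_labels: Callable[[Any], Dict[str, int]]      # role of module 403
    judge_risk: Callable[[Dict[str, int]], Dict[str, bool]]  # role of module 404

    def run(self, login_data):
        """Pass the login data through the four modules in order."""
        graph = self.build_graph(login_data)
        classified = self.classify(graph)
        labels = self.determine_labels(classified)
        return self.judge_risk(labels)


# Stub modules, for illustration only: "mallory" is labelled abnormal (-1).
apparatus = RiskDetectionApparatus(
    build_graph=lambda data: {"nodes": sorted(set(data))},
    classify=lambda graph: graph,
    determine_labels=lambda graph: {
        node: (-1 if node == "mallory" else 1) for node in graph["nodes"]
    },
    judge_risk=lambda labels: {node: label < 0 for node, label in labels.items()},
)
result = apparatus.run(["alice", "mallory", "alice"])
```

Keeping each module behind a plain callable mirrors the statement that the modules are logical units, which may be mapped onto physical units in different ways.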
Further, the user login data set includes login IDs, login device IDs, and login IPs of a plurality of users.
Further, the authentication graph construction module 401 is further configured to:
deduplicating and sorting the user login data set to form user nodes, wherein each user node comprises an index and an attribute, and the attribute is the login ID, the login device ID or the login IP;
and determining the direction of each user node according to the index and the attribute in the user node to form the user authentication graph.
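The construction can be sketched as follows. The record layout, the `(kind, value)` node keys, and in particular the edge direction (from a login ID to the device and IP it logged in with) are assumptions made for illustration; the text states only that the direction is determined from the index and the attribute.

```python
from typing import NamedTuple


class LoginRecord(NamedTuple):
    login_id: str
    device_id: str
    ip: str


def build_auth_graph(records):
    """Return (nodes, edges).

    nodes maps an attribute, stored as a (kind, value) pair, to its index.
    edges is a set of (source index, target index) pairs.
    """
    # Deduplicate and sort the attribute values so that indices are stable.
    values = sorted(
        {("id", r.login_id) for r in records}
        | {("device", r.device_id) for r in records}
        | {("ip", r.ip) for r in records}
    )
    nodes = {value: index for index, value in enumerate(values)}
    edges = set()
    for r in records:
        src = nodes[("id", r.login_id)]
        # Assumed direction: login ID -> device and login ID -> IP.
        edges.add((src, nodes[("device", r.device_id)]))
        edges.add((src, nodes[("ip", r.ip)]))
    return nodes, edges


records = [
    LoginRecord("alice", "dev-1", "10.0.0.1"),
    LoginRecord("bob", "dev-1", "10.0.0.2"),
    LoginRecord("alice", "dev-1", "10.0.0.1"),  # duplicate record
]
nodes, edges = build_auth_graph(records)
```

Here the duplicate record adds nothing, and the two users become linked through the shared device node.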
Further, the authentication graph classification module 402 is further configured to:
classifying the users in the user authentication graph according to a matrix clustering algorithm, wherein the matrix clustering algorithm classifies into the same class the users corresponding to the rows in which the non-zero values of a same column of the calculated matrix D are located, where D = (I - αA)^(-1), I is the identity matrix, A is the adjacency matrix of the user authentication graph, and α is a preset coefficient.
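A minimal sketch of this classification, under one reading of the rule: rows of D that share the same set of non-zero columns are placed in the same class. The choice α = 0.1, the tolerance, the symmetric toy adjacency matrix, and the helper names `invert` and `matrix_cluster` are assumptions. For a non-negative A and a small enough α, D equals the series I + αA + α²A² + ..., so D[i][j] is non-zero exactly when node j is reachable from node i.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matrix_cluster(adjacency, alpha=0.1, tol=1e-9):
    """Group node indices whose rows of D = (I - alpha*A)^(-1) share non-zero columns."""
    n = len(adjacency)
    m = [[(1.0 if i == j else 0.0) - alpha * adjacency[i][j] for j in range(n)]
         for i in range(n)]
    d = invert(m)
    clusters = {}
    for i, row in enumerate(d):
        pattern = frozenset(j for j, v in enumerate(row) if abs(v) > tol)
        clusters.setdefault(pattern, []).append(i)
    return sorted(clusters.values())


# Nodes 0-1-2 form one connected group, nodes 3-4 another.
adjacency = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
groups = matrix_cluster(adjacency)
```

For this symmetric example the classes coincide with the connected components of the graph. For large graphs a dense inverse is costly, so a sparse solver or a truncated series would be a natural substitute.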
Further, the tag determination module 403 is further configured to:
determining positive and negative attributes of first user nodes in the classified user authentication graph, and initializing label weights of the first user nodes, wherein the first user nodes are a part of the user nodes in the user authentication graph and there are a plurality of first user nodes;
iteratively updating labels of second user nodes according to the positive and negative attributes and the label weights, wherein the second user nodes are the user nodes in the user authentication graph other than the first user nodes;
and obtaining the labels of the users in the user authentication graph according to the iteration result.
Further, the tag determination module 403 is further configured to:
updating the label weight by using the formula
[Formula image: Figure BDA0002407167070000141]
where λ_i is the label weight at the i-th iteration and i is a positive integer greater than 1;
and updating the label of the second user node by using the updated label weight and the positive and negative attributes.
Further, the tag determination module 403 is further configured to:
randomly removing a part of the second user nodes in the user authentication graph according to a preset proportion;
wherein iteratively updating the labels of the second user nodes according to the positive and negative attributes and the label weights is specifically:
iteratively updating the second user nodes that remain after the removal according to the positive and negative attributes and the label weights;
and if the updated result converges, stopping the iteration; otherwise, updating the preset proportion according to a loss formula, and returning to the step of randomly removing a part of the second user nodes in the user authentication graph according to the preset proportion.
It is to be noted that this embodiment is an example of a device corresponding to the first embodiment, and can be implemented in cooperation with the first embodiment. The related technical details mentioned in the first embodiment are still valid in this embodiment, and in order to reduce repetition, a detailed description is omitted here. Accordingly, the related art details mentioned in the present embodiment can also be applied to the first embodiment.
It should be noted that each module in this embodiment is a logic module. In practical applications, a logic unit may be one physical unit, a part of one physical unit, or a combination of multiple physical units. In addition, in order to highlight the innovative part of the present invention, units that are less closely related to solving the technical problem addressed by the present invention are not introduced in this embodiment, but this does not mean that no other units exist in this embodiment.
A fourth embodiment of the invention relates to a network device, as shown in fig. 5, comprising at least one processor 501; and a memory 502 communicatively coupled to the at least one processor 501; the memory 502 stores instructions executable by the at least one processor 501, and the instructions are executed by the at least one processor 501 to enable the at least one processor 501 to perform the risk detection method described above.
Where the memory 502 and the processor 501 are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting the various circuits of the one or more processors 501 and the memory 502. The bus may also connect various other circuits such as peripherals, voltage regulators, and power management circuits, which are well known in the art, and therefore, will not be described any further herein. The bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or may be a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor 501 is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor 501.
The processor 501 is responsible for managing the bus and general processing, and may also provide various functions including timing, peripheral interfaces, voltage regulation, power management, and other control functions. The memory 502 may be used to store data used by the processor 501 in performing operations.
A fifth embodiment of the present invention relates to a computer-readable storage medium storing a computer program. The computer program implements the above-described method embodiments when executed by a processor.
That is, those skilled in the art will understand that all or part of the steps of the methods in the above embodiments may be implemented by a program stored in a storage medium, the program including several instructions for causing a device (which may be a single-chip microcomputer, a chip or the like) or a processor to perform all or part of the steps of the methods described in the embodiments herein. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
It will be understood by those of ordinary skill in the art that the foregoing embodiments are specific examples of carrying out the invention and that various changes in form and details may be made therein without departing from the spirit and scope of the invention.

Claims (8)

1. A risk detection method, comprising:
constructing a user authentication graph according to the user login data set;
classifying the users in the user authentication graph by adopting a matrix clustering algorithm;
determining the labels of the users in the classified user authentication graphs by adopting an adaptive label propagation algorithm;
judging whether the user has risk according to the label;
wherein the user login data set comprises login IDs, login equipment IDs and login IPs of a plurality of users; the construction of the user authentication graph according to the user login data set comprises the following steps:
deduplicating and sorting the user login data set to form user nodes, wherein each user node comprises an index and an attribute, and the attribute is the login ID, the login device ID or the login IP;
and determining the direction of each user node according to the index and the attribute in the user node to form the user authentication graph.
2. The risk detection method according to claim 1, wherein the classifying the users in the user authentication graph by using a matrix clustering algorithm is specifically:
classifying the users in the user authentication graph into different classes according to a matrix clustering algorithm, wherein the matrix clustering algorithm classifies into the same class the users corresponding to the rows in which the non-zero values of a same column of the calculated matrix D are located, where D = (I - αA)^(-1), I is the identity matrix, A is the adjacency matrix of the user authentication graph, and α is a preset coefficient.
3. The risk detection method according to claim 2, wherein the determining the label of the user in the classified user authentication graph by using an adaptive label propagation algorithm includes:
determining positive and negative attributes of first user nodes in the classified user authentication graph, and initializing label weights of the first user nodes, wherein the first user nodes are a part of the user nodes in the user authentication graph and there are a plurality of first user nodes;
iteratively updating labels of second user nodes according to the positive and negative attributes and the label weights, wherein the second user nodes are the user nodes in the user authentication graph other than the first user nodes;
and obtaining the labels of the users in the user authentication graph according to the iteration result.
4. A risk detection method according to claim 3, wherein said iteratively updating the labels of the second user nodes according to the positive and negative attributes and the label weights comprises:
updating the label weight by using the formula
[Formula image: Figure FDA0004053621130000021]
where λ_i is the label weight at the i-th iteration and i is a positive integer greater than 1;
and updating the label of the second user node by using the updated label weight and the positive and negative attributes.
5. The risk detection method of claim 4, further comprising, prior to said iteratively updating labels of second user nodes based on said positive and negative attributes and said label weights:
randomly removing a part of the second user nodes in the user authentication graph according to a preset proportion;
wherein iteratively updating the labels of the second user nodes according to the positive and negative attributes and the label weights is specifically:
iteratively updating the second user nodes that remain after the removal according to the positive and negative attributes and the label weights;
and if the updated result converges, stopping the iteration; otherwise, updating the preset proportion according to a loss formula, and returning to the step of randomly removing a part of the second user nodes in the user authentication graph according to the preset proportion.
6. A risk detection apparatus, comprising:
an authentication graph construction module, configured to construct a user authentication graph according to a user login data set;
an authentication graph classification module, configured to classify the users in the user authentication graph by using a matrix clustering algorithm;
a label determination module, configured to determine the labels of the users in the classified user authentication graph by using an adaptive label propagation algorithm;
a risk judgment module, configured to judge whether a user is at risk according to the label;
wherein the user login data set comprises login IDs, login device IDs and login IPs of a plurality of users; and constructing the user authentication graph according to the user login data set comprises:
deduplicating and sorting the user login data set to form user nodes, wherein each user node comprises an index and an attribute, and the attribute is the login ID, the login device ID or the login IP;
and determining the direction of each user node according to the index and the attribute in the user node to form the user authentication graph.
7. A network device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the risk detection method of any one of claims 1 to 5.
8. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the risk detection method according to any one of claims 1 to 5.
CN202010165127.4A 2020-03-11 2020-03-11 Risk detection method, apparatus, device and storage medium Active CN111491300B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010165127.4A CN111491300B (en) 2020-03-11 2020-03-11 Risk detection method, apparatus, device and storage medium

Publications (2)

Publication Number Publication Date
CN111491300A CN111491300A (en) 2020-08-04
CN111491300B true CN111491300B (en) 2023-04-28

Family

ID=71812443

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010165127.4A Active CN111491300B (en) 2020-03-11 2020-03-11 Risk detection method, apparatus, device and storage medium

Country Status (1)

Country Link
CN (1) CN111491300B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113536288B (en) * 2021-06-23 2023-10-27 上海派拉软件股份有限公司 Data authentication method, device, authentication equipment and storage medium
CN113420941A (en) * 2021-07-16 2021-09-21 湖南快乐阳光互动娱乐传媒有限公司 Risk prediction method and device for user behavior

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104301286A (en) * 2013-07-15 2015-01-21 中国移动通信集团黑龙江有限公司 User login authentication method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108062360A (en) * 2017-11-29 2018-05-22 广东技术师范学院 A kind of method, system and device of large-scale complex community structure detection
CN109784636A (en) * 2018-12-13 2019-05-21 中国平安财产保险股份有限公司 Fraudulent user recognition methods, device, computer equipment and storage medium
CN110428139A (en) * 2019-07-05 2019-11-08 阿里巴巴集团控股有限公司 The information forecasting method and device propagated based on label

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant