CN110689084B - Abnormal user identification method and device - Google Patents

Abnormal user identification method and device

Info

Publication number
CN110689084B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910943381.XA
Other languages
Chinese (zh)
Other versions
CN110689084A (en)
Inventor
李犇
张�杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Application filed by Beijing Mininglamp Software System Co ltd
Priority to CN201910943381.XA
Publication of CN110689084A
Application granted
Publication of CN110689084B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The application provides an abnormal user identification method and device. The method comprises: acquiring service information of a target user and a plurality of historical users; determining the associated community network to which the target user belongs according to the service information submitted by the target user and the plurality of historical users; determining, based on the associated community network, a network structure characteristic vector and an abnormal information characteristic vector of that network; determining the probability value that the associated community network is an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and a trained abnormal network identification model; and, if the probability value is greater than or equal to a preset probability value, determining that the target user is an abnormal user. Compared with the prior art, the method and device can identify an abnormal user before the associated users exhibit abnormal behavior, which makes the identification more predictive and can improve its accuracy.

Description

Abnormal user identification method and device
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for identifying an abnormal user.
Background
At present, many services in work and daily life must be applied for, and a service provider decides whether to serve an applicant according to the service information the applicant provides. In this process the user is often required to provide information, such as contacts, that is associated with other users, so a relationship network forms among the users. In general, if the relationship network to which an applicant belongs contains many abnormal users, the applicant is also likely to be an abnormal user.
Existing abnormal user identification methods rely only on the behavior information of associated users to determine whether those users are abnormal, and only then judge whether the applicant is abnormal. As a result, an applicant cannot be flagged until the associated users have already exhibited abnormal behavior.
Disclosure of Invention
In view of the above, an object of the present application is to provide an abnormal user identification method and apparatus that judge whether the associated community network to which a target user belongs is an abnormal network according to the network structure feature vector and the abnormal information feature vector of that network, and thereby judge whether the target user is an abnormal user. An abnormal user can therefore be identified before the associated users exhibit abnormal behavior, which makes the identification more predictive and can improve its accuracy.
The embodiment of the application provides an abnormal user identification method, which comprises the following steps:
acquiring service information of a target user and a plurality of historical users;
determining an associated community network to which the target user belongs according to service information submitted by the target user and a plurality of historical users, wherein the associated community network comprises a plurality of user nodes and a plurality of service nodes, each service node comprises one piece of service information, each user node is connected with the service node associated with the user node through a connecting line, and the connecting line stores the associated information between the corresponding user node and the service node;
determining a network structure characteristic vector and an abnormal information characteristic vector of an associated community network to which a target user belongs based on the associated community network;
determining the probability value of the associated community network as an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and a trained abnormal network identification model;
and if the probability value is greater than or equal to a preset probability value, determining that the target user is an abnormal user.
In a possible implementation manner, the determining, according to the service information submitted by the target user and the plurality of historical users, the associated community network to which the target user belongs includes:
generating a user heterogeneous information network based on the service information submitted by the target user and the plurality of historical users and the relationship between each service information and the target user and the plurality of historical users, wherein the user heterogeneous information network comprises a plurality of user nodes and a plurality of service nodes, each user node is connected with the service node associated with the user node through a connecting line, and the user heterogeneous information network is divided into a plurality of sub-information networks according to the connection relationship between the user nodes;
according to the distance between every two adjacent user nodes in the user heterogeneous information network and the number of the user nodes in each sub information network, dividing the sub information network with the number of the user nodes larger than a preset threshold value into a plurality of sub information networks with the number of the user nodes smaller than or equal to the preset threshold value;
and determining that the sub information network to which the user node corresponding to the target user belongs is the associated community network to which the target user belongs.
In a possible implementation manner, the determining, based on the associated community network, a network structure feature vector and an abnormal information feature vector of the associated community network to which the target user belongs includes:
determining abnormal service nodes in the associated community network based on each user node, the service information in each service node and the associated information in each connecting line, and determining the abnormal information characteristic vector according to the service information in the abnormal service nodes;
and embedding the associated community network to obtain the network structure characteristic vector.
In one possible embodiment, the method further comprises the step of training the abnormal network identification model:
generating a plurality of associated community network samples based on the service information of a plurality of historical users, and determining a network structure characteristic vector sample and an abnormal information characteristic vector sample of each associated community network sample;
for each associated community network sample, determining a state label of the associated community network sample based on the service state of the user corresponding to each user node in the associated community network sample;
and training the abnormal network identification model by using the state label, the network structure characteristic vector sample and the abnormal information characteristic vector sample of each associated community network sample.
In a possible implementation manner, the determining the status label of the associated community network sample based on the service status of the user corresponding to each user node in the associated community network sample includes:
determining the average state weight of the associated community network sample based on the state weight corresponding to each service state;
and determining the state label of the associated community network sample based on the average state weight.
An embodiment of the present application further provides an abnormal user identification apparatus, where the apparatus includes:
the acquisition module is used for acquiring the service information of a target user and a plurality of historical users;
the first determining module is used for determining an associated community network to which the target user belongs according to service information submitted by the target user and a plurality of historical users, wherein the associated community network comprises a plurality of user nodes and a plurality of service nodes, each service node comprises one item of service information, each user node is connected with the service node associated with the user node through a connecting line, and the connecting line stores the associated information between the corresponding user node and the service node;
the second determination module is used for determining a network structure characteristic vector and an abnormal information characteristic vector of the associated community network to which the target user belongs based on the associated community network;
the third determining module is used for determining the probability value of the associated community network as an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and the trained abnormal network identification model;
and the fourth determining module is used for determining that the target user is an abnormal user when the probability value is greater than or equal to a preset probability value.
In one possible implementation, the first determining module includes:
a generating unit, configured to generate a user heterogeneous information network based on service information submitted by the target user and the multiple historical users and a relationship between each piece of service information and the target user and the multiple historical users, where the user heterogeneous information network includes multiple user nodes and multiple service nodes, each user node is connected to a service node associated with the user node through a connection line, and the user heterogeneous information network is divided into multiple sub-information networks according to the connection relationship between the user nodes;
the segmentation unit is used for segmenting the sub-information network with the user node number larger than a preset threshold into a plurality of sub-information networks with the user node number smaller than or equal to the preset threshold according to the distance between every two adjacent user nodes in the user heterogeneous information network and the user node number in each sub-information network;
and the determining unit is used for determining that the sub information network to which the user node corresponding to the target user belongs is the associated community network to which the target user belongs.
In a possible implementation, the apparatus further includes a model training module, and the model training module is configured to:
generating a plurality of associated community network samples based on the service information of a plurality of historical users, and determining a network structure characteristic vector sample and an abnormal information characteristic vector sample of each associated community network sample;
for each associated community network sample, determining a state label of the associated community network sample based on the service state of the user corresponding to each user node in the associated community network sample;
and training the abnormal network identification model by using the state label, the network structure characteristic vector sample and the abnormal information characteristic vector sample of each associated community network sample.
In a possible implementation manner, the second determining module is specifically configured to:
determining abnormal service nodes in the associated community network based on each user node, the service information in each service node and the associated information in each connecting line, and determining the abnormal information characteristic vector according to the service information in the abnormal service nodes;
and embedding the associated community network to obtain the network structure characteristic vector.
In a possible implementation manner, when determining the state label of the associated community network sample based on the service state of the user corresponding to each user node in the associated community network sample, the model training module is specifically configured to:
determining the average state weight of the associated community network sample based on the state weight corresponding to each service state;
and determining the state label of the associated community network sample based on the average state weight.
An embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is operating, the machine-readable instructions when executed by the processor performing the steps of the method for identifying an abnormal user as described above.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the abnormal user identification method as described above are performed.
The abnormal user identification method and device provided by the embodiments of the application acquire service information of a target user and a plurality of historical users; determine the associated community network to which the target user belongs according to the service information submitted by the target user and the plurality of historical users; determine, based on the associated community network, a network structure characteristic vector and an abnormal information characteristic vector of that network; determine the probability value that the associated community network is an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and a trained abnormal network identification model; and, if the probability value is greater than or equal to a preset probability value, determine that the target user is an abnormal user. Compared with the prior art, an abnormal user can be identified before the associated users exhibit abnormal behavior, which makes the identification more predictive and can improve its accuracy.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. It should be understood that the following drawings show only some embodiments of the present application and should therefore not be regarded as limiting its scope; those skilled in the art can derive other related drawings from them without inventive effort.
Fig. 1 is a flowchart of an abnormal user identification method provided in an embodiment of the present application;
Fig. 2 is a flowchart of another abnormal user identification method provided in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an abnormal user identification apparatus provided in an embodiment of the present application;
Fig. 4 is a second schematic structural diagram of an abnormal user identification apparatus provided in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. Every other embodiment that can be obtained by a person skilled in the art without making creative efforts based on the embodiments of the present application falls within the protection scope of the present application.
Research shows that existing abnormal user identification methods rely only on the behavior information of associated users to determine whether those users are abnormal, and only then judge whether the applicant is abnormal. As a result, an applicant cannot be flagged until the associated users have already exhibited abnormal behavior.
Based on this, the embodiments of the application provide an abnormal user identification method that judges whether the associated community network to which the target user belongs is an abnormal network according to the network structure feature vector and the abnormal information feature vector of that network, and thereby judges whether the target user is an abnormal user. An abnormal user can therefore be identified before the associated users exhibit abnormal behavior, which makes the identification more predictive and can improve its accuracy.
Referring to Fig. 1, which is a flowchart of an abnormal user identification method according to an embodiment of the present application, the method includes:
S101, acquiring service information of a target user and a plurality of historical users.
In this step, a user must submit service information when transacting a service so that the service provider can judge from it whether the user is an abnormal user. The service information may include the user's personal credit information, contact information, demographic information, employment information, device information, service status information, and the like.
S102, determining the associated community network to which the target user belongs according to the service information submitted by the target user and the plurality of historical users.
The associated community network comprises a plurality of user nodes and a plurality of service nodes, each service node comprises service information, each user node is connected with the service node associated with the user node through a connecting line, and the connecting line stores associated information between the corresponding user node and the corresponding service node.
In this step, a user heterogeneous information network can be established from the service information submitted by the target user and the plurality of historical users. The user heterogeneous information network comprises a plurality of user nodes and a plurality of service nodes. Each user node represents one user. The service nodes may include address nodes, telephone nodes, company nodes, device nodes and the like, and each service node stores service information that a user submitted when transacting a service or that was generated while the service was transacted. User nodes and service nodes are connected by connecting lines, and each connecting line stores the attributes relating the user to the service information. For example, suppose user A, when transacting a service, names B as a contact and gives a as the contact telephone. In the user heterogeneous information network, the user node representing user A is then connected by a connecting line L to the service node storing a; the relation type stored in L is "contact", and its association attributes include the contact name. The service node storing a can also be connected by a connecting line M to the user node representing user B, and the information stored in M can be "user telephone".
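The construction described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the record layout `(user, node_type, value, relation)`, the function name `build_network` and the attribute keys are hypothetical and do not come from the application.

```python
from collections import defaultdict

def build_network(records):
    """Build a user/service-node graph from (user, node_type, value, relation) tuples.

    Returns (adjacency, edges): adjacency maps each node to its neighbours,
    and edges maps (user_node, service_node) to the attributes stored on
    the connecting line. The record layout is an assumption for illustration.
    """
    adjacency = defaultdict(set)
    edges = {}
    for user, node_type, value, relation in records:
        user_node = ("user", user)
        service_node = (node_type, value)
        adjacency[user_node].add(service_node)
        adjacency[service_node].add(user_node)
        edges[(user_node, service_node)] = relation
    return adjacency, edges

# User A names B as a contact with telephone "a" (line L);
# telephone "a" is also user B's own telephone (line M).
records = [
    ("A", "phone", "a", {"type": "contact", "contact_name": "B"}),
    ("B", "phone", "a", {"type": "user_telephone"}),
]
adjacency, edges = build_network(records)
```

Users A and B never connect directly; they become related only through the shared telephone node, which is what makes the network heterogeneous.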
The generated user heterogeneous information network comprises connected networks of both large and small scale. Within a connected network, user nodes are linked to one another through service nodes and connecting lines. So that the connections inside each connected network are more genuine and tight, a large-scale connected network may be split into a plurality of small-scale connected networks. The connected network containing the user node corresponding to the target user is the associated community network.
S103, determining a network structure characteristic vector and an abnormal information characteristic vector of the associated community network to which the target user belongs based on the associated community network.
In this step, abnormal service information stored in the associated community network can be identified, for example service information submitted by different users in which the contact and company details match but the contact names differ. Representing this abnormal service information as a vector yields the abnormal information feature vector, and representing the structural features of the associated community network as a vector yields the network structure feature vector.
It should be noted that the associated community networks to which abnormal users belong generally have similar structural features. By considering the structural features of the associated community network in addition to the service behavior of the users in it, an abnormal user can be identified before any abnormal service behavior occurs, which gives the method better predictive power.
S104, determining the probability value of the associated community network as an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and the trained abnormal network identification model.
Specifically, the network structure feature vector and the abnormal information feature vector may be input into an abnormal network identification model, and the abnormal network identification model outputs a probability value that the associated community network is an abnormal network after calculation.
S105, if the probability value is greater than or equal to a preset probability value, determining that the target user is an abnormal user.
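Steps S104 and S105 can be sketched as follows. The application does not specify the model family, so this example assumes a simple logistic scoring function; the weights, the bias, the vector values and the preset probability of 0.5 are all hypothetical.

```python
import math

def abnormal_probability(structure_vec, anomaly_vec, weights, bias):
    """Score a community with a logistic function (assumed model family)."""
    # The two feature vectors are concatenated into one model input.
    features = list(structure_vec) + list(anomaly_vec)
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def is_abnormal_user(probability, preset_probability=0.5):
    """S105: the target user is abnormal when the probability reaches the preset value."""
    return probability >= preset_probability

# Hypothetical vectors and parameters, for illustration only.
p = abnormal_probability([0.2, 0.7], [1.0, 0.5],
                         weights=[0.5, 0.5, 1.0, 1.0], bias=-1.0)
flagged = is_abnormal_user(p)
```

Note that the comparison is inclusive: a probability exactly equal to the preset value also marks the target user as abnormal.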
Referring to Fig. 2, which is a flowchart of an abnormal user identification method according to another embodiment of the present application, the method includes:
S201, acquiring service information of a target user and a plurality of historical users.
S202, determining the associated community network to which the target user belongs according to the service information submitted by the target user and the plurality of historical users.
S203, determining abnormal service nodes in the associated community network based on each user node, the service information in each service node and the associated information in each connecting line, and determining the abnormal information characteristic vector according to the service information in the abnormal service nodes.
In this step, a service node whose service information logically conflicts with other service information may be treated as an abnormal service node, and the abnormal information feature vector is obtained by representing the service information of the abnormal service nodes as a vector.
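One way to picture this step is the sketch below. The conflict rule used here (a service node is abnormal when its connecting lines carry more than one distinct contact name) and the two-element vector layout are assumptions chosen for illustration; the application does not fix either.

```python
from collections import defaultdict

def abnormal_info_vector(edges):
    """edges maps (user_node, service_node) to the attributes on the connecting line.

    Hypothetical rule: a service node is abnormal when different users
    attach different contact names to it.
    """
    names_per_node = defaultdict(set)
    for (_user, service_node), attrs in edges.items():
        name = attrs.get("contact_name")
        if name is not None:
            names_per_node[service_node].add(name)
    abnormal = [node for node, names in names_per_node.items() if len(names) > 1]
    total = len({service for _user, service in edges})
    ratio = len(abnormal) / total if total else 0.0
    # Assumed layout: [number of abnormal service nodes, share of abnormal service nodes]
    return abnormal, [float(len(abnormal)), ratio]

edges = {
    (("user", "A"), ("phone", "a")): {"contact_name": "B"},
    (("user", "C"), ("phone", "a")): {"contact_name": "D"},
    (("user", "C"), ("company", "x")): {},
}
abnormal_nodes, vector = abnormal_info_vector(edges)
```

In a real system the vector would likely encode the conflicting service information itself rather than simple counts; counts are used here only to keep the example small.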
S204, embedding the associated community network to obtain the network structure feature vector.
Specifically, a Graph2vec algorithm may be used to perform embedding processing on the associated community network to generate a network structure feature vector, where the network structure feature vector may include attribute features, device features, time features, behavior features, and the like of user nodes in the associated community network.
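Graph2vec, as generally described, treats Weisfeiler-Lehman (WL) rooted-subgraph labels as the "words" of a graph and learns a dense embedding from them. The sketch below covers only that WL label-counting stage and is not Graph2vec itself; the function name, the use of node types as initial labels and the hashing scheme are assumptions for illustration.

```python
import hashlib
from collections import Counter

def wl_subtree_features(adjacency, labels, iterations=2):
    """Count Weisfeiler-Lehman subtree labels for one graph.

    adjacency maps node -> set of neighbours; labels maps node -> initial label
    (here, the node type). The result is a sparse structural fingerprint.
    """
    counts = Counter(labels.values())
    current = dict(labels)
    for _ in range(iterations):
        updated = {}
        for node, neighbours in adjacency.items():
            # A node's new label summarises its own label and its neighbours' labels.
            signature = current[node] + "|" + ",".join(sorted(current[n] for n in neighbours))
            updated[node] = hashlib.md5(signature.encode()).hexdigest()[:8]
        current = updated
        counts.update(current.values())
    return counts

# Two users sharing one phone, twice with different node names, and a smaller graph.
star = {"u1": {"p"}, "u2": {"p"}, "p": {"u1", "u2"}}
star_labels = {"u1": "user", "u2": "user", "p": "phone"}
same_shape = {"x": {"q"}, "y": {"q"}, "q": {"x", "y"}}
same_labels = {"x": "user", "y": "user", "q": "phone"}
pair = {"u1": {"p"}, "p": {"u1"}}
pair_labels = {"u1": "user", "p": "phone"}

f_star = wl_subtree_features(star, star_labels)
f_same = wl_subtree_features(same_shape, same_labels)
f_pair = wl_subtree_features(pair, pair_labels)
```

Graphs with the same shape and node types produce identical counts regardless of node names, which is why communities with similar structure end up with similar structural features.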
S205, determining the probability value of the associated community network as the abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and the trained abnormal network identification model.
S206, if the probability value is larger than or equal to a preset probability value, determining that the target user is an abnormal user.
The descriptions of S201, S202, S205, and S206 may refer to the descriptions of S101, S102, S104, and S105, and the same technical effect can be achieved, which is not described in detail herein.
In a possible implementation manner, the determining, according to the service information submitted by the target user and the plurality of historical users, the associated community network to which the target user belongs includes:
generating a user heterogeneous information network based on the service information submitted by the target user and the plurality of historical users and the relationship between each service information and the target user and the plurality of historical users, wherein the user heterogeneous information network comprises a plurality of user nodes and a plurality of service nodes, each user node is connected with the service node associated with the user node through a connecting line, and the user heterogeneous information network is divided into a plurality of sub-information networks according to the connection relationship between the user nodes;
according to the distance between every two adjacent user nodes in the user heterogeneous information network and the number of the user nodes in each sub information network, dividing the sub information network with the number of the user nodes larger than a preset threshold value into a plurality of sub information networks with the number of the user nodes smaller than or equal to the preset threshold value;
and determining that the sub information network to which the user node corresponding to the target user belongs is the associated community network to which the target user belongs.
Specifically, the sub information network with the number of user nodes greater than the preset threshold may be divided into a plurality of sub information networks with the number of user nodes less than or equal to the preset threshold by using a community discovery algorithm, such as K-Clique, Louvain, and the like.
The sub information network is the connected network described above. Specifically, a large connected network may include more than 50 user nodes, and a small connected network may include 2 to 50 user nodes.
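The partition into sub information networks and the size check can be sketched as below. The example finds connected components with a breadth-first search and marks those with more than 50 user nodes as needing a split; the split itself (for instance with a community discovery algorithm such as Louvain) is left out. The function names and the `("user", id)` node convention are hypothetical.

```python
from collections import deque

def connected_components(adjacency):
    """Return the connected networks of a graph given as node -> set of neighbours."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components

def user_count(component):
    """Count only user nodes; service nodes do not count toward the threshold."""
    return sum(1 for kind, _ in component if kind == "user")

def needs_split(component, max_users=50):
    """A network is large when it holds more than max_users user nodes."""
    return user_count(component) > max_users

adjacency = {
    ("user", "A"): {("phone", "a")},
    ("user", "B"): {("phone", "a")},
    ("phone", "a"): {("user", "A"), ("user", "B")},
    ("user", "C"): {("company", "x")},
    ("company", "x"): {("user", "C")},
}
components = connected_components(adjacency)
# The associated community network is the component containing the target user.
target_community = next(c for c in components if ("user", "A") in c)
```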
In one possible embodiment, the method further comprises the step of training the anomaly network recognition model:
generating a plurality of associated community network samples based on the service information of a plurality of historical users, and determining a network structure characteristic vector sample and an abnormal information characteristic vector sample of each associated community network sample;
for each associated community network sample, determining a state label of the associated community network sample based on the service state of the user corresponding to each user node in the associated community network sample;
and training the abnormal network recognition model by using the state label, the network structure characteristic vector sample and the abnormal information characteristic vector sample of each associated community network sample.
In this step, the step of generating the associated community network samples and determining the network structure feature vector sample and the abnormal information feature vector sample of each associated community network sample may refer to the step of generating the associated community network and determining the network structure feature vector and the abnormal information feature vector of each associated community network, which is not described herein again.
Further, the state label of an associated community network sample is the ground-truth value indicating whether that sample is an abnormal associated community network; specifically, the state label may be determined according to the service state of each user.
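By way of a hedged example only, the training step can be pictured as fitting a binary classifier on rows that concatenate the two feature vector samples, with the state label as the target. This section does not name the model type, so the logistic regression below is a stand-in chosen for brevity. All vectors, labels and hyperparameters are made-up illustrative values.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, epochs=2000, lr=0.5):
    """Fit weights and a bias with plain batch gradient descent."""
    n_features = len(samples[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def predict_proba(model, x):
    """Probability that the community network described by x is abnormal."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# One entry per associated community network sample (illustrative values).
structure_vectors = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]]
abnormal_vectors = [[0.0], [3.0], [1.0], [4.0]]
state_labels = [0, 1, 0, 1]  # 1 = abnormal network, 0 = normal network

# Each training row is the structure vector followed by the abnormal vector.
rows = [s + a for s, a in zip(structure_vectors, abnormal_vectors)]
model = train_logistic(rows, state_labels)
```

Concatenation is only one way to combine the two vectors. Any model that outputs a probability could take the place of the stand-in classifier.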
In a possible implementation manner, the determining the status label of the associated community network sample based on the service status of the user corresponding to each user node in the associated community network sample includes:
determining the average state weight of the associated community network sample based on the state weight corresponding to each service state;
and determining the state label of the associated community network sample based on the average state weight.
In this step, different service states correspond to different state weights, and a service state closer to an abnormal state may correspond to a larger state weight. Specifically, the average of the state weights corresponding to all user nodes in the associated community network sample may be calculated. When the average is greater than or equal to a preset threshold value, the associated community network sample is determined to be an abnormal network; otherwise, it is determined to be a normal network. The corresponding state label is then added to the sample.
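The labelling rule can be sketched in a few lines of Python. The state names, the weight values and the threshold of 0.5 are hypothetical; the text only requires that states closer to abnormal may carry larger weights and that the average be compared with a preset threshold value.

```python
# Hypothetical service states and weights, chosen for illustration only.
STATE_WEIGHTS = {"normal": 0.0, "overdue": 0.5, "fraud": 1.0}


def state_label(user_states, weights=STATE_WEIGHTS, threshold=0.5):
    """Return 1 (abnormal network) or 0 (normal network) for one sample.

    `user_states` holds the service state of the user behind each user node.
    """
    if not user_states:
        raise ValueError("a community network sample needs at least one user node")
    average = sum(weights[s] for s in user_states) / len(user_states)
    # "Greater than or equal to" follows the wording of the text.
    return 1 if average >= threshold else 0


label_a = state_label(["normal", "fraud", "fraud"])     # average is 2/3
label_b = state_label(["normal", "normal", "overdue"])  # average is 1/6
```

With these assumed weights the first sample is labelled abnormal and the second normal. A sample whose average equals the threshold exactly is also labelled abnormal.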
The abnormal user identification method provided by the embodiment of the application acquires the service information of a target user and a plurality of historical users; determines an associated community network to which the target user belongs according to the service information submitted by the target user and the plurality of historical users; determines a network structure characteristic vector and an abnormal information characteristic vector of the associated community network to which the target user belongs; determines the probability value that the associated community network is an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and a trained abnormal network identification model; and, if the probability value is greater than or equal to a preset probability value, determines that the target user is an abnormal user. Compared with the prior art, an abnormal user can be identified before the associated user exhibits abnormal behavior, which gives the method strong predictive capability and can improve the accuracy of abnormal user identification.
Referring to fig. 3 and 4, fig. 3 is a first schematic structural diagram of an abnormal user identification device according to an embodiment of the present application, and fig. 4 is a second schematic structural diagram of an abnormal user identification device according to an embodiment of the present application. As shown in fig. 3, the abnormal user identifying apparatus 300 includes:
an obtaining module 310, configured to obtain service information of a target user and multiple historical users;
a first determining module 320, configured to determine, according to service information submitted by the target user and multiple historical users, an associated community network to which the target user belongs, where the associated community network includes multiple user nodes and multiple service nodes, each service node includes one piece of service information, each user node is connected to a service node associated with the user node through a connection line, and the connection line stores the associated information between the user node and the service node corresponding to the user node;
a second determining module 330, configured to determine, based on the associated community network, a network structure feature vector and an abnormal information feature vector of the associated community network to which the target user belongs;
a third determining module 340, configured to determine a probability value that the associated community network is an abnormal network, using the network structure feature vector, the abnormal information feature vector, and the trained abnormal network identification model;
a fourth determining module 350, configured to determine that the target user is an abnormal user when the probability value is greater than or equal to a preset probability value.
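One possible way to picture how modules 310 to 350 cooperate is the following Python sketch, in which each module is a pluggable callable. The class name, the field names, the callable signatures and the default preset probability are all assumptions introduced for illustration. They are not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Vector = Sequence[float]


@dataclass
class AbnormalUserIdentifier:
    """Wires the modules together; every module is a replaceable callable."""
    obtain: Callable[[str], list]                  # obtaining module 310
    find_community: Callable[[str, list], object]  # first determining module 320
    extract_features: Callable[[object], tuple]    # second determining module 330
    score: Callable[[Vector, Vector], float]       # third determining module 340
    preset_probability: float = 0.5                # hypothetical default value

    def is_abnormal(self, target_user: str) -> bool:
        records = self.obtain(target_user)
        community = self.find_community(target_user, records)
        structure_vec, abnormal_vec = self.extract_features(community)
        probability = self.score(structure_vec, abnormal_vec)
        # Fourth determining module 350: compare with the preset value.
        return probability >= self.preset_probability


# Stub modules that stand in for the real ones in this demonstration.
identifier = AbnormalUserIdentifier(
    obtain=lambda user: [(user, "phone:111"), ("u2", "phone:111")],
    find_community=lambda user, records: {u for u, _ in records},
    extract_features=lambda community: ([float(len(community))], [1.0]),
    score=lambda structure_vec, abnormal_vec: 0.8,
    preset_probability=0.6,
)
result = identifier.is_abnormal("u1")
```

Keeping the modules as separate callables mirrors the statement that the division into units is only a logical one, so any module can be replaced without touching the others.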
Further, as shown in fig. 4, the first determining module 320 includes:
a generating unit 321, configured to generate a user heterogeneous information network based on service information submitted by the target user and the multiple historical users and a relationship between each piece of service information and the target user and the multiple historical users, where the user heterogeneous information network includes multiple user nodes and multiple service nodes, each user node is connected to a service node associated with the user node through a connection line, and the user heterogeneous information network is divided into multiple sub-information networks according to the connection relationship between the user nodes;
a segmenting unit 322, configured to segment, according to a distance between every two adjacent user nodes in the user heterogeneous information network and a number of user nodes in each sub information network, the sub information network in which the number of user nodes is greater than a preset threshold into a plurality of sub information networks in which the number of user nodes is less than or equal to the preset threshold;
a determining unit 323, configured to determine that the sub information network to which the user node corresponding to the target user belongs is an associated community network to which the target user belongs.
In a possible implementation, the abnormal user recognition apparatus 300 further includes a model training module 360, and the model training module 360 is configured to:
generating a plurality of associated community network samples based on the service information of a plurality of historical users, and determining a network structure characteristic vector sample and an abnormal information characteristic vector sample of each associated community network sample;
for each associated community network sample, determining a state label of the associated community network sample based on the service state of the user corresponding to each user node in the associated community network sample;
and training the abnormal network recognition model by using the state label, the network structure characteristic vector sample and the abnormal information characteristic vector sample of each associated community network sample.
In a possible implementation manner, the second determining module 330 is specifically configured to:
determining abnormal service nodes in the associated community network based on each user node, the service information in each service node and the associated information in each connecting line, and determining the abnormal information characteristic vector according to the service information in the abnormal service nodes;
and embedding the associated community network to obtain the network structure characteristic vector.
In a possible implementation manner, when determining the state label of the associated community network sample based on the service state of the user corresponding to each user node in the associated community network sample, the model training module 360 is specifically configured to:
determining the average state weight of the associated community network sample based on the state weight corresponding to each service state;
and determining the state label of the associated community network sample based on the average state weight.
The abnormal user identification device provided by the embodiment of the application acquires the service information of a target user and a plurality of historical users; determines an associated community network to which the target user belongs according to the service information submitted by the target user and the plurality of historical users; determines a network structure characteristic vector and an abnormal information characteristic vector of the associated community network to which the target user belongs; determines the probability value that the associated community network is an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and a trained abnormal network identification model; and, if the probability value is greater than or equal to a preset probability value, determines that the target user is an abnormal user. Compared with the prior art, an abnormal user can be identified before the associated user exhibits abnormal behavior, which gives the device strong predictive capability and can improve the accuracy of abnormal user identification.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 5, the electronic device 500 includes a processor 510, a memory 520, and a bus 530.
The memory 520 stores machine-readable instructions executable by the processor 510. When the electronic device 500 runs, the processor 510 communicates with the memory 520 through the bus 530, and when the machine-readable instructions are executed by the processor 510, the steps of the abnormal user identification method in the method embodiments shown in fig. 1 and fig. 2 may be performed.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the abnormal user identification method in the method embodiments shown in fig. 1 and fig. 2 may be executed.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division of the units is only one logical division, and other divisions are possible in actual implementation; as a further example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above-mentioned embodiments are only specific embodiments of the present application, used to illustrate its technical solutions rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the technical scope disclosed in the present application, any person skilled in the art can still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present application and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. An abnormal user identification method, characterized in that the method comprises:
acquiring service information of a target user and a plurality of historical users;
determining an associated community network to which the target user belongs according to service information submitted by the target user and a plurality of historical users, wherein the associated community network comprises a plurality of user nodes and a plurality of service nodes, each service node comprises one piece of service information, each user node is connected with the service node associated with the user node through a connecting line, and the connecting line stores the associated information between the corresponding user node and the service node;
determining a network structure characteristic vector and an abnormal information characteristic vector of an associated community network to which a target user belongs based on the associated community network;
determining the probability value of the associated community network as an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and a trained abnormal network identification model;
if the probability value is larger than or equal to a preset probability value, determining that the target user is an abnormal user;
the determining the associated community network to which the target user belongs according to the service information submitted by the target user and the plurality of historical users comprises the following steps:
generating a user heterogeneous information network based on the service information submitted by the target user and the plurality of historical users and the relationship between each service information and the target user and the plurality of historical users, wherein the user heterogeneous information network comprises a plurality of user nodes and a plurality of service nodes, each user node is connected with the service node associated with the user node through a connecting line, and the user heterogeneous information network is divided into a plurality of sub-information networks according to the connection relationship between the user nodes;
according to the distance between every two adjacent user nodes in the user heterogeneous information network and the number of the user nodes in each sub information network, dividing the sub information network with the number of the user nodes larger than a preset threshold value into a plurality of sub information networks with the number of the user nodes smaller than or equal to the preset threshold value; and determining that the sub information network to which the user node corresponding to the target user belongs is the associated community network to which the target user belongs.
2. The method according to claim 1, wherein the determining, based on the associated community network, a network structure feature vector and an abnormal information feature vector of the associated community network to which the target user belongs comprises:
determining abnormal service nodes in the associated community network based on each user node, the service information in each service node and the associated information in each connecting line, and determining the abnormal information characteristic vector according to the service information in the abnormal service nodes;
and embedding the associated community network to obtain the network structure characteristic vector.
3. The method of claim 1, further comprising the step of training the abnormal network recognition model:
generating a plurality of associated community network samples based on the service information of a plurality of historical users, and determining a network structure characteristic vector sample and an abnormal information characteristic vector sample of each associated community network sample;
for each associated community network sample, determining a state label of the associated community network sample based on the service state of a user corresponding to each user node in the associated community network sample;
and training the abnormal network recognition model by using the state label, the network structure characteristic vector sample and the abnormal information characteristic vector sample of each associated community network.
4. The method according to claim 3, wherein determining the status label of the associated community network sample based on the service status of the user corresponding to each user node in the associated community network sample comprises:
determining the average state weight of the associated community network sample based on the state weight corresponding to each service state;
and determining the state label of the associated community network sample based on the average state weight.
5. An abnormal user identification apparatus, comprising:
the acquisition module is used for acquiring the service information of a target user and a plurality of historical users;
the first determining module is used for determining an associated community network to which the target user belongs according to service information submitted by the target user and a plurality of historical users, wherein the associated community network comprises a plurality of user nodes and a plurality of service nodes, each service node comprises one item of service information, each user node is connected with the service node associated with the user node through a connecting line, and the connecting line stores the associated information between the corresponding user node and the service node;
the second determination module is used for determining a network structure characteristic vector and an abnormal information characteristic vector of the associated community network to which the target user belongs based on the associated community network;
the third determining module is used for determining the probability value of the associated community network as an abnormal network by using the network structure characteristic vector, the abnormal information characteristic vector and the trained abnormal network identification model;
the fourth determining module is used for determining the target user as an abnormal user when the probability value is greater than or equal to a preset probability value;
the first determining module includes:
a generating unit, configured to generate a user heterogeneous information network based on service information submitted by the target user and the multiple historical users and a relationship between each piece of service information and the target user and the multiple historical users, where the user heterogeneous information network includes multiple user nodes and multiple service nodes, each user node is connected to a service node associated with the user node through a connection line, and the user heterogeneous information network is divided into multiple sub-information networks according to the connection relationship between the user nodes;
the segmentation unit is used for segmenting the sub-information network with the user node number larger than a preset threshold into a plurality of sub-information networks with the user node number smaller than or equal to the preset threshold according to the distance between every two adjacent user nodes in the user heterogeneous information network and the user node number in each sub-information network;
and the determining unit is used for determining that the sub information network to which the user node corresponding to the target user belongs is the associated community network to which the target user belongs.
6. The apparatus of claim 5, further comprising a model training module to:
generating a plurality of associated community network samples based on the service information of a plurality of historical users, and determining a network structure characteristic vector sample and an abnormal information characteristic vector sample of each associated community network sample;
for each associated community network sample, determining a state label of the associated community network sample based on the service state of a user corresponding to each user node in the associated community network sample;
and training the abnormal network recognition model by using the state label, the network structure characteristic vector sample and the abnormal information characteristic vector sample of each associated community network.
7. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating via the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the abnormal user identification method according to any one of claims 1 to 4.
8. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the abnormal user identification method according to any one of claims 1 to 4.
CN201910943381.XA 2019-09-30 2019-09-30 Abnormal user identification method and device Active CN110689084B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910943381.XA CN110689084B (en) 2019-09-30 2019-09-30 Abnormal user identification method and device

Publications (2)

Publication Number Publication Date
CN110689084A CN110689084A (en) 2020-01-14
CN110689084B true CN110689084B (en) 2022-03-01

Family

ID=69111322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910943381.XA Active CN110689084B (en) 2019-09-30 2019-09-30 Abnormal user identification method and device

Country Status (1)

Country Link
CN (1) CN110689084B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111091393B (en) * 2019-11-26 2023-09-05 汉海信息技术(上海)有限公司 Abnormal account identification method and device and electronic equipment
CN111339436B (en) * 2020-02-11 2021-05-28 腾讯科技(深圳)有限公司 Data identification method, device, equipment and readable storage medium
CN111401478B (en) * 2020-04-17 2022-10-04 支付宝(杭州)信息技术有限公司 Data anomaly identification method and device
CN113946758B (en) * 2020-06-30 2023-09-19 腾讯科技(深圳)有限公司 Data identification method, device, equipment and readable storage medium
CN115102758B (en) * 2022-06-21 2023-04-07 新余学院 Method, device, equipment and storage medium for detecting abnormal network flow

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1859224A (en) * 2005-12-31 2006-11-08 华为技术有限公司 Method and system for processing service behaviour abnormal
CN104734868A (en) * 2013-12-19 2015-06-24 中兴通讯股份有限公司 Service processing method and device among service nodes
CN107093090A (en) * 2016-10-25 2017-08-25 北京小度信息科技有限公司 Abnormal user recognition methods and device
CN107590504A (en) * 2017-07-31 2018-01-16 阿里巴巴集团控股有限公司 Abnormal main body recognition methods and device, server
CN109344326A (en) * 2018-09-11 2019-02-15 阿里巴巴集团控股有限公司 A kind of method for digging and device of social circle
CN109410036A (en) * 2018-10-09 2019-03-01 北京芯盾时代科技有限公司 A kind of fraud detection model training method and device and fraud detection method and device
CN109450920A (en) * 2018-11-29 2019-03-08 北京奇艺世纪科技有限公司 A kind of exception account detection method and device
CN109905411A (en) * 2019-04-25 2019-06-18 北京腾云天下科技有限公司 A kind of abnormal user recognition methods, device and calculate equipment

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1859224A (en) * 2005-12-31 2006-11-08 华为技术有限公司 Method and system for processing service behaviour abnormal
CN104734868A (en) * 2013-12-19 2015-06-24 中兴通讯股份有限公司 Service processing method and device among service nodes
WO2015090027A1 (en) * 2013-12-19 2015-06-25 中兴通讯股份有限公司 Method and device for processing service between service nodes
CN107093090A (en) * 2016-10-25 2017-08-25 北京小度信息科技有限公司 Abnormal user recognition methods and device
CN107590504A (en) * 2017-07-31 2018-01-16 阿里巴巴集团控股有限公司 Abnormal main body recognition methods and device, server
CN109344326A (en) * 2018-09-11 2019-02-15 阿里巴巴集团控股有限公司 A kind of method for digging and device of social circle
CN109410036A (en) * 2018-10-09 2019-03-01 北京芯盾时代科技有限公司 A kind of fraud detection model training method and device and fraud detection method and device
CN109450920A (en) * 2018-11-29 2019-03-08 北京奇艺世纪科技有限公司 A kind of exception account detection method and device
CN109905411A (en) * 2019-04-25 2019-06-18 北京腾云天下科技有限公司 A kind of abnormal user recognition methods, device and calculate equipment

Also Published As

Publication number Publication date
CN110689084A (en) 2020-01-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant