CN110502697A - Target user identification method, device and electronic equipment - Google Patents

Target user identification method, device and electronic equipment

Info

Publication number
CN110502697A
Authority
CN
China
Prior art keywords
user
current user
feature information
users
characteristic information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910792207.XA
Other languages
Chinese (zh)
Other versions
CN110502697B (en)
Inventor
王璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Douyu Network Technology Co Ltd
Original Assignee
Wuhan Douyu Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Douyu Network Technology Co Ltd
Priority to CN201910792207.XA
Publication of CN110502697A
Application granted
Publication of CN110502697B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a target user identification method, a device and electronic equipment. The method comprises: determining feature information of a user based on the user's online behavior; determining the similarity between the current user and its associated users according to the feature information of the current user; updating the feature information of the current user based on a feature propagation rule according to the similarity, so that the feature information of the current user's associated users is aggregated into the feature information of the current user, thereby obtaining the updated feature information of the current user; and determining whether the current user is a target user according to the updated feature information of the current user and a set feature information template. With this technical solution, target users are identified accurately.

Description

Target user identification method and device and electronic equipment
Technical Field
The embodiment of the invention relates to the field of computers, in particular to a target user identification method and device and electronic equipment.
Background
To obtain benefits, cheating behaviors such as barrage brushing (flooding a live room with bullet-screen comments) and attention brushing (artificially inflating an anchor's follower count) are common on live broadcast websites. Such cheating typically causes network congestion and excessive load on the live platform's servers, and seriously harms the platform's live broadcast ecosystem. Therefore, to reduce the negative impact of cheating, it is important to identify users suspected of cheating by a reasonable method and to take appropriate countermeasures against them.
Disclosure of Invention
The invention provides a target user identification method, a target user identification device and electronic equipment, and accurate identification of a target user is realized.
In a first aspect, an embodiment of the present invention provides a target user identification method, where the method includes:
determining characteristic information of the user based on the online behavior of the user;
determining the similarity between the current user and the associated user according to the characteristic information of the current user;
updating the feature information of the current user based on a feature propagation rule according to the similarity so as to aggregate the feature information of the associated user of the current user into the feature information of the current user to obtain the updated feature information of the current user;
and determining whether the current user is a target user according to the updated feature information of the current user and the set feature information template.
In a second aspect, an embodiment of the present invention provides an apparatus for identifying a target user, where the apparatus includes:
the characteristic information determining module is used for determining the characteristic information of the user based on the online behavior of the user;
the similarity determining module is used for determining the similarity between the current user and the associated user according to the characteristic information of the current user;
the updating module is used for updating the feature information of the current user based on the feature propagation rule according to the similarity so as to aggregate the feature information of the associated user of the current user into the feature information of the current user and obtain the updated feature information of the current user;
and the identification module is used for determining whether the current user is the target user according to the updated feature information of the current user and the set feature information template.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a first memory, a first processor, and a computer program stored in the memory and executable on the first processor, where the first processor implements the target user identification method according to the first aspect when executing the computer program.
In a fourth aspect, embodiments of the present invention provide a storage medium containing computer-executable instructions which, when executed by a computer processor, implement the target user identification method according to the first aspect described above.
The target user identification method provided by the embodiment of the invention determines the feature information of a user based on the user's online behavior, and determines the similarity between the current user and its associated users according to the feature information of the current user. It then updates the feature information of the current user based on the feature propagation rule according to the similarity, so that the feature information of the current user's associated users is aggregated into the feature information of the current user. This yields the updated feature information of the current user and thereby mines the current user's feature information. Finally, whether the current user is a target user is determined according to the updated feature information of the current user and a set feature information template, so that the target user is accurately identified.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings used in the description of the embodiments of the present invention will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the contents of the embodiments of the present invention and the drawings without creative efforts.
Fig. 1 is a schematic flow chart of a target user identification method according to a first embodiment of the present invention;
Fig. 2 is a schematic flow chart of a target user identification method according to a second embodiment of the present invention;
Fig. 3 is a schematic diagram of a user relationship graph according to the second embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a target user identification apparatus according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention.
Detailed Description
In order to make the technical problems solved, technical solutions adopted and technical effects achieved by the present invention clearer, the technical solutions of the embodiments of the present invention will be described in further detail below with reference to the accompanying drawings, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Embodiment One
Fig. 1 is a schematic flow chart of a target user identification method according to an embodiment of the present invention. The target user identification method disclosed in this embodiment may be applicable to identifying malicious users who engage in online cheating behaviors, for example, identifying users who perform online cheating behaviors such as barrage brushing and attention brushing in a live broadcast room. The method may be performed by a target user identification device, wherein the device may be implemented by software and/or hardware and is typically integrated in a terminal, such as a server or the like. Referring specifically to fig. 1, the method comprises the steps of:
Step 110: determine the feature information of the user based on the user's online behavior.
The online behavior may be a positive behavior worth encouraging, such as online donation, or a negative behavior to be resisted, such as brushing bullet screens or brushing attention on a live broadcast platform. Negative behaviors often have harmful effects: for example, brushing bullet screens or brushing attention for the same anchor through the live platform often causes network congestion and excessive load on the live platform's servers. Therefore, either to reduce the negative impact of bullet screen brushing and attention brushing or to actively encourage beneficial behavior, this embodiment discloses a target user identification method. It can be used to identify malicious users who engage in bullet screen brushing or attention brushing, so that they can be warned or stopped by other measures, or to identify users who engage in public welfare behaviors such as donation, so that those users can be publicly recognized to foster a good social atmosphere.
This embodiment takes as its example the identification of malicious users who engage in online cheating, such as bullet screen brushing or attention brushing, on a live broadcast platform. The online behavior also includes logging in to the live platform and checking in on the live platform. The feature information includes at least one of: the registration time of the account the user uses to log in to the live platform, the registration source of the account, the user's check-in count and login count, the terminal device used to log in to the live platform, and the mobile phone number associated with the account. It is understood that the more dimensions the feature information has, the higher the recognition accuracy.
Step 120: determine the similarity between the current user and its associated users according to the feature information of the current user.
The associated users of the current user specifically include neighbor users of the current user and neighbor users of the neighbor users. The neighbor users of the current user may specifically include users with the same mobile phone number associated with the current user account, or users who log in the live broadcast platform through the same terminal device as the current user.
To determine the similarity between the current user and an associated user according to the feature information of the current user, the similarity between the feature information of the current user and that of the associated user is calculated. For example, suppose the feature information includes two dimensions, account registration time and login count. If both the current user and the associated user registered their accounts yesterday, and the current user has logged in 3 times while the associated user has logged in 6 times, the similarity between the two can be roughly taken as 0.5. The similarity between the current user and its associated users may also be calculated according to other rules.
Step 130: update the feature information of the current user based on the feature propagation rule according to the similarity, so that the feature information of the current user's associated users is aggregated into the feature information of the current user, and obtain the updated feature information of the current user.
Specifically, the essence of this update is to mine the feature information of the current user from the feature information of its associated users. The feature information of the associated users is superimposed on the feature information of the current user according to a certain rule, so that the result retains the current user's own features and also adds features that the current user has not yet exhibited but may exhibit in the future. This enriches the feature information of the current user.
Step 140: determine whether the current user is the target user according to the updated feature information of the current user and the set feature information template.
The set feature information template refers to the feature information of known target users. For example, if the target user is a malicious user who brushes bullet screens on a live broadcast platform, the template contains historical behavior information of such users on the platform. The feature information of the current user is compared with that of the known target users: the higher the similarity between them, the higher the probability that the current user is a target user.
In the method for identifying a target user provided by this embodiment, feature information of a user is determined based on an online behavior of the user, a similarity between a current user and an associated user is determined according to the feature information of the current user, then the feature information of the current user is updated based on a feature propagation rule according to the similarity between the users, so that the feature information of the associated user of the current user is aggregated into the feature information of the current user, mining of the feature information of the current user is realized, finally, the updated feature information of the current user is compared with a feature information template, whether the current user is the target user is determined according to a comparison result, and accurate identification of the target user is realized. The execution subject of the target user identification method provided by the embodiment is preferably the server, because the server can more conveniently acquire the characteristic information of the user.
Embodiment Two
Fig. 2 is a schematic flow chart of a target user identification method according to a second embodiment of the present invention. On the basis of the first embodiment, this embodiment provides a specific implementation of the above steps. Referring to Fig. 2, the method includes:
Step 210: construct a user relationship graph according to the online behavior of the current user, and determine the associated users of the current user based on the user relationship graph.
The user relationship graph reflects the association relationships between users. For example, each user is treated as an independent vertex, and if two users are friends with each other, their vertices are connected by an edge; the more edges there are between the current user and other users, the more users have a friend relationship with the current user. Association relationships between users may of course also be established along other dimensions.
Illustratively, the building of the user relationship graph according to the online behaviors of the current user includes:
determining all users performing online behaviors in a set time period;
taking each user in all the users as a vertex;
connecting, by an edge, the vertices corresponding to users who performed the online behavior based on the same terminal device and/or the same mobile phone number within the set time period, to generate an undirected user relationship graph;
where all the users include the current user;
The set time period may be a specific day, week or month. The online behavior may be, for example, logging in to or registering an account on the live platform, and "all users" refers to the accounts that appear on the live platform within the set time period, including accounts that log in to the platform and accounts registered on it. Specifically, an undirected user relationship graph is generated by connecting, by an edge, the vertices corresponding to users who, within the set time period, log in to or register on the same live platform (a) from the same terminal device; or (b) from the same IP address; or (c) from the same terminal device using the same IP address; or (d) with the same mobile phone number.
Take logging in to the live broadcast platform from the same terminal device within the set time period as an example, and refer to the user relationship graph shown in Fig. 3. Assume that user 1 and user 2 log in to the live broadcast platform from terminal device A within the set time period, and that user 1 and user 8 log in from terminal device B; vertex 1, corresponding to user 1, is therefore connected by edges to vertex 2, corresponding to user 2, and to vertex 8, corresponding to user 8. Assume further that user 2 and user 3 log in from terminal device C, so that vertex 2 and vertex 3 are connected by an edge, and that user 2 and user 5 log in from terminal device D, so that vertex 2 and vertex 5 are connected by an edge. Continuing in the same way gives the undirected graph shown in Fig. 3.
Correspondingly, the determining the associated user of the current user based on the user relationship graph includes:
in the user relationship graph, determining the users corresponding to the vertices that are connected by an edge to the vertex of the current user as the neighbor users of the current user;
and determining the neighbor users of the current user as the associated users of the current user.
In the undirected graph shown in Fig. 3, user 1 and user 2 log in to the live broadcast platform with the same terminal device A within the set time period; users related in this way are called neighbor users. Thus the neighbor users of user 1 include user 2 and user 8, the neighbor users of user 2 include user 1, user 3 and user 5, and the neighbor users of user 8 include user 1, user 6 and user 9.
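The graph construction and neighbor lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the record layout `(user_id, shared_key)`, the function name `build_user_graph`, and the device labels "E" and "F" for the edges around user 8 are assumptions made for the example. Only devices A to D and the neighbor sets of users 1, 2 and 8 come from the text.

```python
from collections import defaultdict
from itertools import combinations


def build_user_graph(login_records):
    """Build an undirected user relationship graph as an adjacency map.

    login_records: iterable of (user_id, shared_key) pairs seen in the set
    time period, where shared_key is a device, phone number or IP address
    (hypothetical layout). Users sharing a key are joined by an edge.
    """
    records = list(login_records)

    # Group users by the key they share.
    users_by_key = defaultdict(set)
    for user_id, shared_key in records:
        users_by_key[shared_key].add(user_id)

    # Every user becomes a vertex, even if it ends up with no neighbors.
    graph = {user_id: set() for user_id, _ in records}
    for users in users_by_key.values():
        for a, b in combinations(sorted(users), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


records = [
    (1, "A"), (2, "A"),
    (1, "B"), (8, "B"),
    (2, "C"), (3, "C"),
    (2, "D"), (5, "D"),
    (8, "E"), (6, "E"),  # device label assumed for illustration
    (8, "F"), (9, "F"),  # device label assumed for illustration
]
graph = build_user_graph(records)
neighbors_of_user_1 = graph[1]
```

An adjacency map of sets keeps neighbor lookup and neighbor counting cheap, which is what the later propagation step needs.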
Illustratively, the determining all users performing online activities within a set time period includes:
collecting user behavior logs based on behavior tracking (event instrumentation) to determine the users who performed the specific online behavior within the set time period.
Behavior tracking means inserting tracking code at the points in a project where user behavior needs to be recorded for statistics (such as click events and page jumps). The user's online behavior is then recorded in a user behavior log, and the users who performed the online behavior can be determined by collecting the log and querying it. The log also records the network environment information and terminal device information with which the user performed the online behavior. On a mobile terminal (such as a smartphone), the user behavior log can be obtained directly through a data acquisition interface.
By combining the devices that users use for their online behaviors with their mobile phone numbers or IP addresses, the association relationships between users are mined.
Step 220: determine the feature information of the current user based on the online behavior of the current user, and determine the feature information of each associated user based on that user's online behavior.
Step 230: determine the similarity between the current user and its associated users according to the feature information of the current user.
Illustratively, the similarity between the current user and its associated user is determined according to the following formula (1):
where sim(i, k) represents the similarity between user i and its associated user k, $x_{ij}$ represents the j-th dimension of the feature information of user i, $x_{kj}$ represents the j-th dimension of the feature information of user k, m represents the number of dimensions of the user's feature information, $\Gamma(\cdot)$ is the gamma function, and $\gamma(\cdot)$ is the incomplete gamma function.
Formula (1) is designed on the chi-square test principle from statistics. The chi-square test measures the degree of deviation between the actual observed values and the theoretical inferred values of a sample, and its formula is $\chi^2 = \sum \frac{(A - T)^2}{T}$, where A is the actual observed value, T is the theoretical inferred value, and $\chi^2$ measures the degree of difference between the two. This embodiment adopts the same idea to measure the similarity of user feature information. Specifically, the feature information of user i is regarded as the actual observed value and the feature information of user k as the theoretical inferred value. The test statistic $\chi^2 = \sum_{j=1}^{m} \frac{(x_{ij} - x_{kj})^2}{x_{kj}}$ is then calculated and substituted into the cumulative distribution function of the chi-square distribution to obtain the degree of difference between the actual observed value and the theoretical inferred value, which lies between 0 and 1. Subtracting this degree of difference from 1 gives the similarity between the two. The advantage of designing similarity formula (1) on the chi-square test principle is that similarity is measured more objectively across feature information of different dimensions.
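The chi-square-based similarity described above can be sketched as below. The exact form of formula (1) is not reproduced in this text, so the sketch makes two labeled assumptions: the chi-square distribution is taken to have m degrees of freedom (m being the number of feature dimensions), and every reference value $x_{kj}$ is assumed to be positive. The function names are hypothetical. The cumulative distribution function is written with the gamma and incomplete gamma functions, $\gamma(m/2, \chi^2/2) / \Gamma(m/2)$, which is the standard expression for the chi-square distribution.

```python
import math


def regularized_lower_gamma(a, x):
    """P(a, x) = gamma(a, x) / Gamma(a), summed from its power series."""
    if x <= 0:
        return 0.0
    log_x = math.log(x)
    total = 0.0
    n = 0
    while True:
        term = math.exp((a + n) * log_x - x - math.lgamma(a + n + 1))
        total += term
        # Terms rise until n is about x, then decay; stop once past the
        # peak and negligible relative to the running sum.
        if n > x and term < 1e-17 * max(total, 1e-300):
            break
        n += 1
    return min(total, 1.0)


def chi_square_similarity(x_i, x_k):
    """Similarity = 1 - chi-square CDF of the statistic.

    x_i plays the role of the observed values, x_k the expected values.
    Assumptions (not from the source): len(x_i) degrees of freedom and
    every x_k[j] > 0.
    """
    m = len(x_i)
    statistic = sum((a - t) ** 2 / t for a, t in zip(x_i, x_k))
    return 1.0 - regularized_lower_gamma(m / 2.0, statistic / 2.0)


# Two users with two feature dimensions each.
sim_12 = chi_square_similarity([1, 5], [2, 3])
sim_same = chi_square_similarity([2, 3], [2, 3])
```

Note that the statistic is not symmetric in its arguments, because the second user's features serve as the denominator. Under these assumptions sim(i, k) and sim(k, i) can differ.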
The essence of determining the similarity between the current user and the associated user according to the feature information of the current user is to determine the possibility that the current user has the feature information of the associated user, that is, the possibility that the current user has online behavior generated by the associated user, so as to predict and mine the online behavior of the current user.
Step 240: take the feature information of the vertex corresponding to the current user in the user relationship graph as the initialized feature information of the current user.
Specifically, the user feature information collected from the user behavior log is taken as the initialized feature information of the user.
Step 250: update the initialized feature information of the current user based on the initialized feature information of the current user, the similarity between the current user and each of its neighbor users, the number of neighbor users of the current user, and the number of neighbor users of each neighbor user, so that the feature information of the current user's neighbor users, and of their neighbor users, is aggregated into the feature information of the current user, yielding the updated feature information of the current user.
Illustratively, the feature information of the current user is updated according to the following formula (2):
where the formula gives the updated j-th dimension feature information of the current user i from its j-th dimension feature information before the update; t represents the number of update iterations; sim(i, k) represents the similarity between the current user i and its neighbor user k; N(i) represents the neighbor set of the current user i and |N(i)| the number of elements in that set; N(k) represents the neighbor set of user k and |N(k)| the number of elements in that set.
The design principle of formula (2) is derived from the Laplacian matrix, a kind of matrix that describes a graph. In an undirected graph, vertices are connected by edges, and the similarity of two vertices is measured by some index, such as Euclidean distance or Gaussian similarity. The matrix formed by the pairwise similarities is denoted W; the similarity between two vertices with no connecting edge is taken as 0, so W is a symmetric matrix. The matrix formed by the sum of the similarities between each vertex and all other vertices is denoted D, which is a diagonal matrix. The Laplacian matrix is then $L = D^{-0.5} W D^{-0.5}$. The Laplacian matrix itself represents a normalizing transform: multiplying it by the users' feature matrix H is, in form, a normalization of the neighbor users' features. Following the matrix product L·H, a normalization of this kind is designed for the neighbor features. Since neighboring nodes contribute differently, each normalized feature is further multiplied by a weight, and that weight is measured by the similarity.
In this embodiment, sim(i, k) represents the similarity between user i and user k, and the feature terms are the initialized j-th dimension feature information of the users. The advantage of the normalizing transform is that it prevents the feature information of neighbor users from having too large an influence, which would make the feature information of all nodes too similar after propagation and thus deprive it of application value. After summation, the result needs to be mapped to between 0 and 1 to control its value range; in this embodiment the mapping is a nonlinear transformation of the summation result by a function. Updating the feature information of the current user through formula (2) retains the feature information of the current user i and adds the feature information of its neighbor users. The feature information of the neighbor users of the current user i is thereby fused into the feature information of the current user i, enriching it with features that the current user has not yet produced but may produce in the future.
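One propagation round of the kind described above can be sketched as follows. Formula (2) itself is not reproduced in this text, so the update used here is an assumption that merely follows the description: the user's own feature is kept, each neighbor feature is weighted by sim(i, k) and divided by the square root of |N(i)|·|N(k)|, and the sum is mapped into (0, 1). The logistic sigmoid is an assumed choice of mapping function, and `propagate_once` and `constant_similarity` are hypothetical names.

```python
import math


def propagate_once(features, graph, similarity):
    """One update round over all users, using only pre-update features.

    features:   {user: [x_1, ..., x_m]}
    graph:      {user: set of neighbor users}
    similarity: function (x_i, x_k) -> weight in [0, 1]
    Assumed update (not confirmed by the source):
        x_ij <- sigmoid(x_ij + sum_k sim(i, k) * x_kj / sqrt(|N(i)| * |N(k)|))
    """
    updated = {}
    for i, x_i in features.items():
        neighbors = graph.get(i, set())
        new_x = []
        for j, own_value in enumerate(x_i):
            total = own_value  # the user's own feature is retained
            for k in neighbors:
                norm = math.sqrt(len(neighbors) * len(graph[k]))
                total += similarity(x_i, features[k]) * features[k][j] / norm
            new_x.append(1.0 / (1.0 + math.exp(-total)))  # assumed sigmoid
        updated[i] = new_x
    return updated


def constant_similarity(x_i, x_k):
    """Placeholder weight for the demo only; not a value from the text."""
    return 0.5


features = {1: [1.0, 5.0], 2: [2.0, 3.0]}
graph = {1: {2}, 2: {1}}
updated = propagate_once(features, graph, constant_similarity)
```

Building a fresh `updated` map, rather than modifying `features` in place, ensures every user in a round is computed from the same pre-update values.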
After iteration, the feature information of each user differs considerably from its initial feature information. The main difference is that it now contains not only the user's own features but also the aggregated features of neighbor users, which is the original purpose of feature aggregation. The number of update iterations t is usually 1 to 4 and is chosen according to the average number of neighbors per user node in the user relationship graph. The larger the average number of neighbors, the stronger the propagation effect of the features; if t is too large, the feature information of all user nodes tends to become identical after iteration and loses its application value. The criteria used in business practice are: if the average number of neighbors is greater than 10, t is 1; if it is greater than 5 and less than or equal to 10, t is 2; if it is greater than 2 and less than or equal to 5, t is 3; if it is less than or equal to 2, t is 4.
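The rule for choosing the iteration count can be written directly from the thresholds above. The function name `choose_iterations` and the adjacency-map representation of the graph are assumptions made for this sketch. The sample graph uses the neighbor sets given for Fig. 3 and fills in the reverse edges for users 3, 5, 6 and 9.

```python
def choose_iterations(graph):
    """Pick the update count t from the average neighbor count per vertex."""
    if not graph:
        raise ValueError("graph has no vertices")
    average = sum(len(neighbors) for neighbors in graph.values()) / len(graph)
    if average > 10:
        return 1
    if average > 5:
        return 2
    if average > 2:
        return 3
    return 4


fig3_graph = {
    1: {2, 8},
    2: {1, 3, 5},
    3: {2},
    5: {2},
    6: {8},
    8: {1, 6, 9},
    9: {8},
}
t = choose_iterations(fig3_graph)
```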
And step 260, calculating the distance between the updated feature information of the current user and the feature information template.
Illustratively, the distance between the updated feature information of the current user and the feature information template is calculated according to the following formula:
where T_i represents the distance between the updated feature information of the current user i and the feature information template, x'_ij represents the updated j-th dimension feature information of the current user i, m represents the dimension of the updated feature information of the current user i, x_sj represents the j-th dimension feature information of the feature information template corresponding to the known target user s, and n represents the number of known target users.
Formula (3) uses the idea of cosine distance to calculate the distance between the updated feature information x'_ij of the current user i and the feature information template x_sj, where the template enters the formula through its average features over the n known target users. The smaller the angle between the updated feature information x'_ij of the current user i and the feature information template x_sj, that is, the closer the cosine value is to 1, the closer the updated feature information of the current user i is to the feature information template, and the higher the possibility that the current user i is the target user. The feature information template is the feature information of known target users.
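A minimal Python sketch of this comparison follows, assuming that formula (3) is the ordinary cosine between the updated feature vector and the per-dimension average of the known target users' feature vectors. That reading follows from the variable definitions above, but the formula itself is not shown here, so treat it as an assumption. The function names and the default threshold are illustrative.

```python
import math


def template_mean(template_vectors):
    """Per-dimension average of the known target users' feature vectors."""
    n = len(template_vectors)
    m = len(template_vectors[0])
    return [sum(v[j] for v in template_vectors) / n for j in range(m)]


def cosine_to_template(updated, template_vectors):
    """Cosine between a user's updated features and the average template."""
    mean = template_mean(template_vectors)
    dot = sum(a * b for a, b in zip(updated, mean))
    norm = math.sqrt(sum(a * a for a in updated)) * math.sqrt(sum(b * b for b in mean))
    if norm == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / norm


def is_target(updated, template_vectors, threshold=0.9):
    """Flag the user when the cosine exceeds the (tunable) threshold."""
    return cosine_to_template(updated, template_vectors) > threshold
```

Because cosine ignores vector length, a user whose features are proportional to the template scores 1 regardless of magnitude.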
Step 270, determining whether the current user is the target user according to the distance.
The above target user identification method is illustrated by the following example:
assuming that a current user is marked as a user 1, a neighbor user of the current user is marked as a user 2, and the characteristic information comprises two dimensions;
the value of the feature information of the first dimension of the user 1 is 1, and the value of the feature information of the second dimension is 5;
the value of the feature information of the first dimension of the user 2 is 2, and the value of the feature information of the second dimension is 3;
based on the above known information, the similarity between the user 1 and the user 2 is calculated by the above formula (1):
the feature information of the current user, i.e. user 1, is updated by the above formula (2); if only one update iteration is performed, the required quantities follow from the known information above, and substituting them into formula (2) yields:
that is, the updated characteristic information of the current user is
Assuming that the average feature information of the known malicious users is (0.8, 0.8), that is, the average feature information of the feature information template is (0.8, 0.8), the distance between the updated feature information of the current user and the feature information template can be obtained by the above formula (3):
If the threshold is set to 0.9, then since 0.99 is greater than 0.9, the current user is a suspected malicious user, that is, a user suspected of bad online behavior.
According to the target user identification method provided by the embodiment, the calculation formula of the similarity between the current user and the associated user is determined by referring to the chi-square test principle, so that the measurement of the similarity between the users becomes more objective; determining an algorithm for updating the feature information of the current user by referring to the Laplace matrix, and mining the feature information of the current user; the comparison result between the updated feature information of the current user and the feature information template is determined by referring to the calculation formula of the cosine distance, so that the identification accuracy of the target user is improved.
EXAMPLE III
Fig. 4 is a schematic structural diagram of a target user identification device according to a third embodiment of the present invention. Referring to fig. 4, the apparatus comprises: a characteristic information determination module 410, a similarity determination module 420, an update module 430 and an identification module 440;
the characteristic information determining module 410 is configured to determine characteristic information of a user based on online behavior of the user; a similarity determining module 420, configured to determine, according to the feature information of the current user, a similarity between the current user and a user associated with the current user; the updating module 430 is configured to update the feature information of the current user based on a feature propagation rule according to the similarity, so as to aggregate the feature information of the associated user of the current user into the feature information of the current user, and obtain updated feature information of the current user; the identifying module 440 is configured to determine whether the current user is a target user according to the updated feature information of the current user and the set feature information template.
Further, the apparatus further comprises:
the building module is used for building a user relation graph according to the online behavior of the current user;
and the associated user determining module is used for determining the associated user of the current user based on the user relationship graph.
Further, the building module comprises:
the determining unit is used for determining all users who performed online behaviors within a set time period, and for taking each of those users as a vertex;
the generating unit is used for connecting, with an edge, the vertices corresponding to users who performed online behaviors based on the same terminal device and/or the same mobile phone number within the set time period, so as to generate an undirected user relationship graph;
wherein all the users include the current user.
Further, the associated user determining module is specifically configured to:
in the user relationship graph, determine the users whose vertices are connected by an edge to the vertex corresponding to the current user as the neighbor users of the current user; and determine the neighbor users of the current user as the associated users of the current user.
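The graph construction and neighbor lookup described above can be sketched in Python as follows. The record layout `(user_id, device_id, phone_number)` and the function name `build_user_graph` are assumptions made for illustration; the records are taken to be already restricted to the set time period.

```python
from collections import defaultdict
from itertools import combinations


def build_user_graph(records):
    """Build an undirected user relationship graph as adjacency sets.

    records: iterable of (user_id, device_id, phone_number) tuples observed
             in the set time period; device_id or phone_number may be None.
    Returns a dict mapping each user to the set of its neighbor users.
    """
    neighbors = defaultdict(set)
    by_device = defaultdict(set)
    by_phone = defaultdict(set)
    for user, device, phone in records:
        neighbors[user]  # every user becomes a vertex, even without edges
        if device is not None:
            by_device[device].add(user)
        if phone is not None:
            by_phone[phone].add(user)
    # Connect every pair of users sharing a device or a phone number.
    for group in list(by_device.values()) + list(by_phone.values()):
        for a, b in combinations(group, 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)
```

With this representation, the associated users of a current user `u` are simply `graph[u]`.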
Further, the update module 430 includes:
the initialization unit is used for taking the characteristic information of the vertex corresponding to the current user in the user relation graph as the initialization characteristic information of the current user;
and the updating unit is used for updating the initialized feature information of the current user based on the initialized feature information of the current user, the similarity between the current user and its neighbor users, the number of neighbor users of the current user, and the number of neighbor users of each of those neighbor users, so as to aggregate the feature information of the neighbor users of the current user, and of their own neighbor users, into the feature information of the current user and obtain the updated feature information of the current user.
Further, the update unit is specifically configured to: updating the characteristic information of the current user according to the following formula:
where the terms of the formula denote the following: the updated j-th dimension feature information of the current user i; the j-th dimension feature information of the current user i before the update; t, the number of update iterations; sim(i, k), the similarity between the current user i and its neighbor user k; N(i), the neighbor set of the current user i; |N(i)|, the number of elements in that neighbor set; N(k), the neighbor set of user k; and |N(k)|, the number of elements in that neighbor set.
Further, the similarity determining module 420 is specifically configured to:
determining the similarity between the current user and the associated user according to the following formula:
where sim(i, k) represents the similarity between user i and its associated user k, x_ij represents the j-th dimension feature information of user i, x_kj represents the j-th dimension feature information of user k, m represents the dimension of the feature information of the user, Γ() is the gamma function, and γ() is the incomplete gamma function.
Further, the identification module 440 includes:
the calculating unit is used for calculating the distance between the updated feature information of the current user and the feature information template;
and the determining unit is used for determining whether the current user is the target user according to the distance.
Further, the computing unit is specifically configured to:
calculating the distance between the updated feature information of the current user and the feature information template according to the following formula:
where T_i represents the distance between the updated feature information of the current user i and the feature information template, x'_ij represents the updated j-th dimension feature information of the current user i, m represents the dimension of the updated feature information of the current user i, x_sj represents the j-th dimension feature information of the feature information template corresponding to the known target user s, and n represents the number of known target users.
The target user identification device provided in this embodiment determines the feature information of the user based on the online behavior of the user, and determines the similarity between the current user and the associated user according to the feature information of the current user; and further updating the feature information of the current user based on the feature propagation rule according to the similarity so as to aggregate the feature information of the associated user of the current user into the feature information of the current user to obtain the updated feature information of the current user, so that the feature information of the current user is mined, and finally, whether the current user is a target user is determined according to the updated feature information of the current user and a set feature information template, so that the target user is accurately identified.
EXAMPLE IV
Fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. Fig. 5 illustrates a block diagram of an exemplary device 12 suitable for use in implementing embodiments of the present invention. The device 12 shown in fig. 5 is only an example and should not impose any limitation on the functionality or scope of use of the embodiments of the present invention.
As shown in FIG. 5, device 12 is in the form of a general purpose computing device. The components of device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32. Device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, and commonly referred to as a "hard drive"). Although not shown in FIG. 5, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. Memory 28 may include at least one program product having a set of program modules (e.g., feature information determination module 410, similarity determination module 420, update module 430, and identification module 440 in a target user identification device) configured to perform the functions of embodiments of the present invention.
A program/utility 40 having a set of program modules 42 (e.g., feature information determination module 410, similarity determination module 420, update module 430, and identification module 440 in a target user identification device) may be stored, for example, in memory 28. Such program modules 42 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a networking environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with device 12, and/or with any devices (e.g., network card, modem, etc.) that enable device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown, the network adapter 20 communicates with the other modules of the device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes various functional applications and data processing, such as implementing a target user identification method provided by an embodiment of the present invention, by executing programs stored in the system memory 28.
EXAMPLE V
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a target user identification method, including:
determining characteristic information of the user based on the online behavior of the user;
determining the similarity between the current user and the associated user according to the characteristic information of the current user;
updating the feature information of the current user based on a feature propagation rule according to the similarity so as to aggregate the feature information of the associated user of the current user into the feature information of the current user to obtain the updated feature information of the current user;
and determining whether the current user is a target user according to the updated feature information of the current user and the set feature information template.
Of course, the storage medium containing the computer-executable instructions provided by the embodiments of the present invention is not limited to the method operations described above, and may also perform the target user identification related operations provided by any embodiments of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a storage medium, or a network device) to execute the embodiments of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A target user identification method is characterized by comprising the following steps:
determining characteristic information of the user based on the online behavior of the user;
determining the similarity between the current user and the associated user according to the characteristic information of the current user;
updating the feature information of the current user based on a feature propagation rule according to the similarity so as to aggregate the feature information of the associated user of the current user into the feature information of the current user to obtain the updated feature information of the current user;
and determining whether the current user is a target user according to the updated feature information of the current user and the set feature information template.
2. The method of claim 1, wherein before determining the similarity between the current user and the associated user according to the feature information of the current user, further comprising:
constructing a user relation graph according to the online behavior of the current user;
and determining the associated user of the current user based on the user relation graph.
3. The method of claim 2, wherein constructing the user relationship graph according to the online behavior of the current user comprises:
determining all users performing online behaviors in a set time period;
taking each user in all the users as a vertex;
connecting, with an edge, the vertices corresponding to users who performed online behaviors based on the same terminal device and/or the same mobile phone number within the set time period, so as to generate an undirected user relationship graph;
wherein all the users include the current user;
correspondingly, the determining the associated user of the current user based on the user relationship graph includes:
in the user relationship graph, determining the users whose vertices are connected by an edge to the vertex corresponding to the current user as the neighbor users of the current user;
and determining the neighbor users of the current user as the associated users of the current user.
4. The method according to claim 3, wherein the updating the feature information of the current user based on the feature propagation rule according to the similarity to aggregate the feature information of the associated user of the current user into the feature information of the current user, and obtaining the updated feature information of the current user comprises:
taking the characteristic information of the vertex corresponding to the current user in the user relationship graph as the initialization characteristic information of the current user;
updating the initialization feature information of the current user based on the initialization feature information of the current user, the similarity between the current user and its neighbor users, the number of neighbor users of the current user, and the number of neighbor users of each of those neighbor users, so as to aggregate the feature information of the neighbor users of the current user, and of their own neighbor users, into the feature information of the current user and obtain the updated feature information of the current user.
5. The method of claim 4, wherein updating the initialization feature information of the current user based on the initialization feature information of the current user, the similarity between the current user and the neighbor users thereof, the number of neighbor users of the current user, and the number of neighbor users of the neighbor users comprises:
updating the characteristic information of the current user according to the following formula:
where the terms of the formula denote the following: the updated j-th dimension feature information of the current user i; the j-th dimension feature information of the current user i before the update; t, the number of update iterations; sim(i, k), the similarity between the current user i and its neighbor user k; N(i), the neighbor set of the current user i; |N(i)|, the number of elements in that neighbor set; N(k), the neighbor set of user k; and |N(k)|, the number of elements in that neighbor set.
6. The method according to any one of claims 1 to 5, wherein determining the similarity between the current user and its associated user according to the feature information of the current user comprises:
determining the similarity between the current user and the associated user according to the following formula:
where sim(i, k) represents the similarity between user i and its associated user k, x_ij represents the j-th dimension feature information of user i, x_kj represents the j-th dimension feature information of user k, m represents the dimension of the feature information of the user, Γ() is the gamma function, and γ() is the incomplete gamma function.
7. The method according to any one of claims 1 to 5, wherein determining whether the current user is the target user according to the updated feature information of the current user and the set feature information template comprises:
calculating the distance between the updated feature information of the current user and the feature information template;
and determining whether the current user is the target user or not according to the distance.
8. The method of claim 7, wherein calculating the distance between the updated feature information of the current user and the feature information template comprises:
calculating the distance between the updated feature information of the current user and the feature information template according to the following formula:
where T_i represents the distance between the updated feature information of the current user i and the feature information template, x'_ij represents the updated j-th dimension feature information of the current user i, m represents the dimension of the updated feature information of the current user i, x_sj represents the j-th dimension feature information of the feature information template corresponding to the known target user s, and n represents the number of known target users.
9. An apparatus for identifying a target user, the apparatus comprising:
the characteristic information determining module is used for determining the characteristic information of the user based on the online behavior of the user;
the similarity determining module is used for determining the similarity between the current user and the associated user according to the characteristic information of the current user;
the updating module is used for updating the feature information of the current user based on the feature propagation rule according to the similarity so as to aggregate the feature information of the associated user of the current user into the feature information of the current user and obtain the updated feature information of the current user;
and the identification module is used for determining whether the current user is the target user according to the updated feature information of the current user and the set feature information template.
10. An electronic device comprising a first memory, a first processor and a computer program stored on the memory and executable on the first processor, wherein the first processor implements the target user identification method as claimed in any one of claims 1 to 8 when executing the computer program.
CN201910792207.XA 2019-08-26 2019-08-26 Target user identification method and device and electronic equipment Active CN110502697B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910792207.XA CN110502697B (en) 2019-08-26 2019-08-26 Target user identification method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910792207.XA CN110502697B (en) 2019-08-26 2019-08-26 Target user identification method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN110502697A true CN110502697A (en) 2019-11-26
CN110502697B CN110502697B (en) 2022-06-21

Family

ID=68589726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910792207.XA Active CN110502697B (en) 2019-08-26 2019-08-26 Target user identification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN110502697B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401959A (en) * 2020-03-18 2020-07-10 多点(深圳)数字科技有限公司 Risk group prediction method and device, computer equipment and storage medium
CN111506802A (en) * 2020-03-16 2020-08-07 中国平安人寿保险股份有限公司 User information correction method and device, computer equipment and storage medium
CN111800647A (en) * 2020-06-29 2020-10-20 广州市百果园信息技术有限公司 Live broadcast and live broadcast matching method, device, equipment and storage medium
CN113468453A (en) * 2020-03-30 2021-10-01 武汉斗鱼网络科技有限公司 Target user identification method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101789887A (en) * 2009-12-25 2010-07-28 成都市华为赛门铁克科技有限公司 Method and device for classifying network users and system for monitoring network services
CN104834967A (en) * 2015-04-24 2015-08-12 南京邮电大学 User similarity-based business behavior prediction method under ubiquitous network
CN104850645A (en) * 2015-05-28 2015-08-19 苏州大学张家港工业技术研究院 Active learning grading guiding method and active learning grading guiding system based on matrix decomposition
CN106407455A (en) * 2016-09-30 2017-02-15 深圳市华傲数据技术有限公司 Data processing method and device based on graph data mining
CN106998262A (en) * 2016-10-10 2017-08-01 深圳汇网天下科技有限公司 A kind of System and method for for recognizing Internet user
CN108763359A (en) * 2018-05-16 2018-11-06 武汉斗鱼网络科技有限公司 A kind of usage mining method, apparatus and electronic equipment with incidence relation
CN109255371A (en) * 2018-08-23 2019-01-22 武汉斗鱼网络科技有限公司 A kind of method and relevant device of determining live streaming platform falseness concern user

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111506802A (en) * 2020-03-16 2020-08-07 中国平安人寿保险股份有限公司 User information correction method and device, computer equipment and storage medium
CN111401959A (en) * 2020-03-18 2020-07-10 多点(深圳)数字科技有限公司 Risk group prediction method and device, computer equipment and storage medium
CN111401959B (en) * 2020-03-18 2023-09-29 多点(深圳)数字科技有限公司 Risk group prediction method, apparatus, computer device and storage medium
CN113468453A (en) * 2020-03-30 2021-10-01 武汉斗鱼网络科技有限公司 Target user identification method and device, electronic equipment and storage medium
CN113468453B (en) * 2020-03-30 2022-09-09 武汉斗鱼网络科技有限公司 Target user identification method and device, electronic equipment and storage medium
CN111800647A (en) * 2020-06-29 2020-10-20 广州市百果园信息技术有限公司 Live broadcast and live broadcast matching method, device, equipment and storage medium
CN111800647B (en) * 2020-06-29 2022-08-09 广州市百果园信息技术有限公司 Live broadcast and live broadcast matching method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN110502697B (en) 2022-06-21

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant