Detailed Description
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Moreover, depending on the context, the word "if" as used herein may be interpreted as "at the time of …", "when …", or "in response to a determination".
An embodiment of the present application provides an account status identification method that may be applied to a server. The server may be a personal computer (PC), a notebook computer, a data platform, an e-commerce platform, or the like; the type of the server is not limited, and any device having an account status identification function falls within the protection scope of the embodiments of the present application. Referring to fig. 1, which is a flowchart of the account status identification method in an embodiment of the present application, the method may include the following steps:
Step 101, obtaining at least one scene identifier and at least one feature identifier corresponding to a first account.
Step 102, obtaining a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier, and a feature value corresponding to the feature identifier.
Step 103, obtaining the link strength between the first account and a second account by using the scene weight, the feature weight, and the feature value. The link strength refers to the degree of association between the first account and the second account.
Step 104, identifying the state of the second account from the state of the first account and the link strength.
The above execution sequence is only an example given for convenience of description; in practical applications, the execution sequence of the steps may be changed and is not limited.
In one example, the first account may be any account with a known status, and the status of the first account may be normal or abnormal. A normal status indicates that the first account has not performed operations such as false transactions; an abnormal status indicates that the first account has performed operations such as false transactions.
In practical applications, there may be one or more first accounts; for convenience of description, one first account is taken as an example below. Furthermore, each first account has a unique account identifier, also known as a user name, such as "123123" or "abcabc".
For step 101, in an example, the process of "obtaining at least one scene identifier and at least one feature identifier corresponding to a first account" may be as follows. An account identifier is obtained from a database (which stores all user data, including account identifiers, statuses, scene identifiers, feature identifiers, and the like). If a status corresponding to the account identifier exists in the database, the account identifier is determined as a first account, and the at least one scene identifier and the at least one feature identifier corresponding to the account identifier are obtained from the database. If no status corresponding to the account identifier exists in the database, the account identifier is not determined as a first account. Based on the above processing, the account identifier, the status, the at least one scene identifier, the at least one feature identifier, and the like of the first account can be obtained from the database.
The scene identifier may be an identifier of an application scene; for example, the scene identifier of the registration scene is scene A, and the scene identifier of the payment scene is scene B. The feature identifier may be a mobile phone number, an email address (mailbox), or another identifier. Of course, the scene identifier is not limited to the registration scene and the payment scene, and the feature identifier is not limited to the mobile phone number and the email address; other application scenes and feature identifiers are not limited.
For convenience of description, the following is taken as an example: the first account is "123123", its status is abnormal, its scene identifiers are scene A and scene B, the feature identifiers in scene A are "13810000000" and "123@126.com", and the feature identifiers in scene B are "13810000001" and "123@qq.com".
In an example, the database for storing the user data may be a distributed file system, for example, a Hadoop Distributed File System (HDFS). As shown in fig. 2A, the server may acquire data such as the account identifier, status, scene identifier, and feature identifier from the distributed file system. The acquisition process is not limited, and the data may also be acquired from other types of databases.
In an example, the second account may be an account matched with the first account, an account randomly selected by the server, or an arbitrary account indicated by the user; the manner of selecting the second account is not limited. The following description takes a second account that matches the first account as an example.
On this basis, the server may further determine, by using the scene identifier of the first account and the feature identifier of the first account, a second account that matches the first account. The scene identification of the second account is the same as the scene identification of the first account, and the feature identification of the second account is the same as the feature identification of the first account.
Because the second account may be an account with an unknown status, after data such as account identifiers, statuses, scene identifiers, and feature identifiers are acquired from the database, if an account (for example, with the account identifier 123456) has an unknown status, its scene identifiers are the same as those of the first account (123123), and its feature identifiers are the same as those of the first account (123123), then the account (123456) is determined to be a second account matching the first account (123123).
In summary, the second account is 123456; its status is unknown and is to be determined by the subsequent process; its scene identifiers are scene A and scene B; the feature identifiers in scene A are "13810000000" and "123@126.com"; and the feature identifiers in scene B are "13810000001" and "123@qq.com".
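The matching rule described above can be sketched as follows. This is a minimal illustration that assumes the user data has already been loaded into in-memory records; the `Account` class, its field names, and the `find_matching_second_accounts` helper are hypothetical and are not part of the application.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class Account:
    """Hypothetical in-memory record for one row of user data."""
    account_id: str
    status: Optional[str]                # "normal", "abnormal", or None if unknown
    features: Dict[str, FrozenSet[str]]  # scene identifier -> feature identifiers


def find_matching_second_accounts(first: Account, accounts: List[Account]) -> List[Account]:
    """Return unknown-status accounts whose scenes and features equal those of `first`."""
    return [
        a for a in accounts
        if a.status is None
        and a.account_id != first.account_id
        and a.features == first.features
    ]


# Example data taken from the text; account "999999" is made up for contrast.
shared = {
    "A": frozenset({"13810000000", "123@126.com"}),
    "B": frozenset({"13810000001", "123@qq.com"}),
}
first = Account("123123", "abnormal", shared)
candidates = [
    Account("123456", None, shared),
    Account("999999", None, {"A": frozenset({"13900000000"})}),
]
matches = find_matching_second_accounts(first, candidates)
print([a.account_id for a in matches])  # ['123456']
```

If the returned list is empty, the first account has no matching second account and a new first account would be selected.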
If the first account has a matching second account, step 102 and the subsequent steps are executed; if the first account has no matching second account, a new first account is selected and the above process is repeated. For convenience of description, the first account 123123 and the matching second account 123456 are taken as an example.
In practical applications, there may be multiple second accounts matching the first account, and each second account is processed in the same way; for convenience of description, one second account is taken as an example below.
For step 102, in an example, the process of "obtaining the scene weight corresponding to the scene identifier and the feature weight corresponding to the feature identifier" may include, but is not limited to, the following ways: the preset mapping relationship can be queried through the scene identifier and the feature identifier to obtain the scene weight corresponding to the scene identifier and obtain the feature weight corresponding to the feature identifier. The mapping relation is used for recording the corresponding relation between the scene identification and the scene weight and recording the corresponding relation between the feature identification and the feature weight.
Table 1 shows an example of the mapping relationship; in practical applications, the mapping relationship is not limited to this. When the mapping relationship is queried with scene A, 13810000000, and 123@126.com, the scene weight is 0.8, the feature weight corresponding to 13810000000 is 0.7, and the feature weight corresponding to 123@126.com is 0.8. When the mapping relationship is queried with scene B, 13810000001, and 123@qq.com, the scene weight is 0.9, the feature weight corresponding to 13810000001 is 0.8, and the feature weight corresponding to 123@qq.com is 0.9.
TABLE 1
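The lookup against the preset mapping relationship can be illustrated with two dictionaries. This is only a sketch: the names `SCENE_WEIGHTS`, `FEATURE_WEIGHTS`, and `lookup_weights` are hypothetical, and the entries are the example weights quoted in the preceding paragraph rather than a reproduction of Table 1.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical mapping tables filled with the example weights from the text.
SCENE_WEIGHTS: Dict[str, float] = {"A": 0.8, "B": 0.9}
FEATURE_WEIGHTS: Dict[str, float] = {
    "13810000000": 0.7,
    "123@126.com": 0.8,
    "13810000001": 0.8,
    "123@qq.com": 0.9,
}


def lookup_weights(scene_id: str, feature_ids: Iterable[str]) -> Tuple[float, Dict[str, float]]:
    """Return the scene weight and a {feature identifier: feature weight} dict."""
    return SCENE_WEIGHTS[scene_id], {f: FEATURE_WEIGHTS[f] for f in feature_ids}


scene_weight, feature_weights = lookup_weights("A", ["13810000000", "123@126.com"])
print(scene_weight, feature_weights)
```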
For step 102, in an example, the process of "obtaining a feature value corresponding to the feature identifier" may include, but is not limited to, the following: if the feature identifier is normal, the feature value corresponding to the feature identifier is set to a first numerical value (such as 1 or 0.95); if the feature identifier is abnormal, the feature value is set to a second numerical value (such as 0 or 0.05). For a feature identifier that is a mobile phone number, if the number does not have 11 digits (such as 138100), the feature identifier is abnormal; if it has 11 digits, whether it is abnormal may be further analyzed (for example, 12345612345 is obviously abnormal), and the analysis process is not described again. For a feature identifier that is an email address (mailbox), whether it is abnormal can likewise be analyzed (for example, 123@qq.com is obviously normal), and the analysis process is not described again.
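A feature-value check along these lines might look like the sketch below. The text leaves the detailed analysis unspecified, so the two regular expressions are assumptions chosen only so that the examples in the text come out as described (an 11-digit number must start with 1 followed by a digit from 3 to 9; an email address must have the shape `local@domain.tld`). The function name `feature_value` is hypothetical.

```python
import re

NORMAL_VALUE = 1.0    # the "first numerical value" in the text
ABNORMAL_VALUE = 0.0  # the "second numerical value" in the text

# Assumed validation rules, for illustration only.
_PHONE = re.compile(r"1[3-9]\d{9}")
_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def feature_value(feature_id: str) -> float:
    """Map a feature identifier to a feature value based on a plausibility check."""
    if "@" in feature_id:
        ok = _EMAIL.fullmatch(feature_id) is not None
    else:
        ok = _PHONE.fullmatch(feature_id) is not None
    return NORMAL_VALUE if ok else ABNORMAL_VALUE


for fid in ["13810000000", "138100", "12345612345", "123@qq.com"]:
    print(fid, feature_value(fid))
```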
With respect to step 103, in an example, the process of "obtaining the link strength of the first account and the second account by using the scene weight, the feature weight, and the feature value" includes, but is not limited to: processing the scene weight, the feature weight and the feature value according to a hyperbolic tangent function (tanh function) to obtain link strengths of the first account and the second account; or processing the scene weight, the feature weight and the feature value according to a sigmoid function (sigmoid function) to obtain the link strength of the first account and the second account; or processing the scene weight, the feature weight and the feature value according to a log function (log function) to obtain the link strengths of the first account and the second account.
In one example, the process of "processing the scene weight, the feature weight, and the feature value according to a hyperbolic tangent function (tanh function) to obtain the link strength of the first account and the second account" includes, but is not limited to: acquiring the link strength of the first account and the second account by using the following formula:
the process of processing the scene weight, the feature weight, and the feature value according to a sigmoid function (sigmoid function) to obtain the link strength of the first account and the second account may include, but is not limited to: acquiring the link strength of the first account and the second account by using the following formula:
for the process of processing the scene weight, the feature weight and the feature value according to a log function (log function) to obtain the link strength of the first account and the second account, the process may include, but is not limited to: acquiring the link strength of the first account and the second account by using the following formula:
In each of the above formulas, $\omega_{AB}$ represents the link strength between the first account and the second account, tanh represents the hyperbolic tangent function, sigmoid represents the sigmoid (S-shaped growth curve) function, log represents the logarithmic function, $\eta$ represents a smoothing parameter, $m$ represents the number of scenes, $s_i$ represents the scene weight corresponding to the $i$-th scene identifier, $n_i$ represents the number of features corresponding to the $i$-th scene identifier, $\theta_{ij}$ represents the feature weight corresponding to the $j$-th feature identifier of the $i$-th scene identifier, and $x_{ij}$ represents the feature value corresponding to the $j$-th feature identifier of the $i$-th scene identifier.
The parameters are described below with reference to a specific application scenario. $\eta$ is a smoothing parameter used to control the steepness of the function: the smaller its value, the more easily the first account and the second account are associated; the larger its value, the less easily they are associated. The smoothing parameter can be configured according to experience (for example, set to 1) and only needs to be greater than 0; the configuration process is not described again.
In this application scenario, since the scene identifiers are scene A and scene B, the number of scenes $m$ is 2. When $i = 1$, $s_i$ is the scene weight 0.8 of scene A, and $n_i$ is the number of features, 2, corresponding to scene A (13810000000 and 123@126.com). If $j = 1$, $\theta_{ij}$ is the feature weight 0.7 corresponding to 13810000000, and $x_{ij}$ is the feature value corresponding to 13810000000 (assumed to be 1); if $j = 2$, $\theta_{ij}$ is the feature weight 0.8 corresponding to 123@126.com, and $x_{ij}$ is the feature value corresponding to 123@126.com (for example, 1). When $i = 2$, $s_i$ is the scene weight 0.9 of scene B, and $n_i$ is the number of features, 2, corresponding to scene B (i.e., 13810000001 and 123@qq.com). If $j = 1$, $\theta_{ij}$ is the feature weight 0.8 corresponding to 13810000001, and $x_{ij}$ is the feature value corresponding to 13810000001 (for example, 1); if $j = 2$, $\theta_{ij}$ is the feature weight 0.9 corresponding to 123@qq.com, and $x_{ij}$ is the feature value corresponding to 123@qq.com (for example, 1).
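To make the worked example concrete, the sketch below computes a link strength from the quantities just listed. The exact formulas are not reproduced in this text, so the functional form is an assumption: the scene-weighted sum of feature weights times feature values is divided by the smoothing parameter before being passed through tanh or sigmoid. Division is chosen only because it matches the described behaviour that a smaller smoothing parameter makes the accounts easier to associate. The function names are hypothetical.

```python
import math
from typing import Dict, Tuple

# scene identifier -> (scene weight s_i, {feature identifier: (theta_ij, x_ij)})
Scenes = Dict[str, Tuple[float, Dict[str, Tuple[float, float]]]]


def weighted_sum(scenes: Scenes) -> float:
    """Sum over scenes of s_i times the sum over features of theta_ij * x_ij."""
    return sum(
        s_i * sum(theta * x for theta, x in feats.values())
        for s_i, feats in scenes.values()
    )


def link_strength_tanh(scenes: Scenes, eta: float = 1.0) -> float:
    """Assumed form: tanh(weighted_sum / eta), with eta > 0."""
    if eta <= 0:
        raise ValueError("eta must be greater than 0")
    return math.tanh(weighted_sum(scenes) / eta)


def link_strength_sigmoid(scenes: Scenes, eta: float = 1.0) -> float:
    """Assumed form: sigmoid(weighted_sum / eta), with eta > 0."""
    if eta <= 0:
        raise ValueError("eta must be greater than 0")
    return 1.0 / (1.0 + math.exp(-weighted_sum(scenes) / eta))


# Values from the application scenario above, with all feature values equal to 1.
example: Scenes = {
    "A": (0.8, {"13810000000": (0.7, 1.0), "123@126.com": (0.8, 1.0)}),
    "B": (0.9, {"13810000001": (0.8, 1.0), "123@qq.com": (0.9, 1.0)}),
}
print(weighted_sum(example))        # 0.8 * 1.5 + 0.9 * 1.7, about 2.73
print(link_strength_tanh(example))  # a value in (0, 1), close to 1
```

Both functions map the non-negative weighted sum into a bounded interval, which is the property the text requires of any substitute function.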
In one example, the hyperbolic tangent, sigmoid, and logarithmic functions above convert the link strength into a value within a preset interval. These functions are only examples of the present application; in practical applications, other functions may be used as long as they can convert the link strength into a value within a preset interval, which is not limited here.
With respect to step 104, in one example, the process for "identifying the status of the second account by the status of the first account and the link strength" may include, but is not limited to, the following: if the state of the first account is abnormal and the link strength is greater than a preset first threshold, the state of the second account can be identified as abnormal; or, if the state of the first account is normal and the link strength is greater than a preset second threshold, it may be identified that the state of the second account is normal. The preset first threshold and the preset second threshold may be configured according to experience, and may be the same as or different from each other.
In addition, if the state of the first account is abnormal and the link strength is not greater than the preset first threshold, it may be recognized that the state of the second account is normal, or whether the state of the second account is abnormal is further analyzed by other methods, which is not described in detail again. If the state of the first account is normal and the link strength is not greater than the preset second threshold, it may be identified that the state of the second account is abnormal, or whether the state of the second account is abnormal is further analyzed in other manners, which is not described in detail herein.
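The threshold rule for step 104 can be sketched as below. The default threshold of 0.8 and the function name `identify_by_threshold` are hypothetical; the function returns `None` when the link strength does not exceed the relevant threshold, leaving the caller free to apply either fallback described in the text (assigning the opposite status or analyzing further).

```python
from typing import Optional


def identify_by_threshold(
    first_status: str,
    link_strength: float,
    first_threshold: float = 0.8,   # assumed value, used when the first account is abnormal
    second_threshold: float = 0.8,  # assumed value, used when the first account is normal
) -> Optional[str]:
    """Propagate the first account's status if the link is strong enough."""
    if first_status == "abnormal" and link_strength > first_threshold:
        return "abnormal"
    if first_status == "normal" and link_strength > second_threshold:
        return "normal"
    return None  # undecided: handled by a fallback outside this sketch


print(identify_by_threshold("abnormal", 0.99))  # abnormal
print(identify_by_threshold("abnormal", 0.30))  # None
```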
For step 104, in another example, the process for "identifying the status of the second account by the status of the first account and the link strength" may include, but is not limited to, the following: and obtaining the confidence index of the second account by using the state of the first account and the link strength, and identifying the state of the second account by using the confidence index of the second account. Further, the confidence indicator may include, but is not limited to, a probability value that the second account is in a specified state, and the specified state may be abnormal or normal. Based on this, the process for "identifying the status of the second account with the confidence indicator of the second account" may include, but is not limited to: if the probability value is greater than a preset probability threshold, the state of the second account can be identified as a designated state. The preset probability threshold may be configured empirically. In addition, if the probability value is not greater than the preset probability threshold, it may be identified that the state of the second account is not the designated state, or whether the state of the second account is the designated state is further analyzed in other manners, which is not described in detail herein.
In one example, referring to fig. 2B, a MapReduce computing framework may be configured in the server, and the MapReduce computing framework performs the steps 101 to 103. Of course, other types of distributed batch processing frameworks, such as MPI computing framework, Spark computing framework, etc., may be configured in the server, and the distributed batch processing framework is not limited. After obtaining the link strength, the MapReduce calculation framework can also output the link strength to the distributed file system, and the distributed file system stores the link strength.
In one example, if the status of the second account is directly identified based on the status of the first account and the link strength, the above step 104 may be performed by the MapReduce computing framework. If the confidence index is determined based on the state and the link strength of the first account and the state of the second account is identified based on the confidence index, a Graph computing framework may be configured in the server, and the Graph computing framework performs the step 104.
Referring to fig. 2C, the Graph computing framework may acquire the link strength and the state of the first account from the distributed file system, obtain a confidence index of the second account based on the link strength and the state of the first account, and identify the state of the second account by using the confidence index. Referring to fig. 2D, the Graph computing framework may obtain the link strength and the state of the first account from the MapReduce computing framework, obtain a confidence index of the second account based on the link strength and the state of the first account, and identify the state of the second account using the confidence index.
Of course, in practical applications, the method is not limited to configuring the Graph computing framework in the server, and may also configure other types of distributed Graph computing frameworks, such as a Pregel computing framework, a Hama computing framework, and the like, in the server, and the distributed Graph computing framework is not limited to this. After the Graph computing framework obtains the confidence indexes, the confidence indexes can be output to the distributed file system, and the distributed file system stores the confidence indexes.
In one example, the process of "obtaining a confidence indicator for the second account using the status of the first account and the link strength" may include, but is not limited to, the following modes. In mode one, a first probability value is obtained according to a parameter average and a parameter variance corresponding to the second account; a second probability value is then obtained by using the status of the first account and the link strength; and the confidence index of the second account is obtained by using the first probability value and the second probability value. In mode two, a third probability value is obtained by using the status of the first account and the link strength, and the confidence index of the second account is then obtained by using the third probability value.
In one example, the process of obtaining the first probability value according to the parameter average and the parameter variance corresponding to the second account may include: obtaining the state of the second account and the service parameters of each account having that state; obtaining a parameter average and a parameter variance (namely, the parameter average and the parameter variance corresponding to the second account) from the service parameters; and obtaining the first probability value from the parameter average and the parameter variance.
In the process of "acquiring the state of the second account and the service parameters of each account having the state", since the second account has no corresponding state, a state (such as normal or abnormal) may be randomly generated for the second account. Then, the service parameters of a plurality of accounts corresponding to that state (for example, all accounts corresponding to the state, or a specified number of accounts corresponding to the state, where the accounts include the second account) are obtained from a database (used for storing the service parameters of all accounts). The service parameters may be registration time, transaction amount, transaction time, number of transactions, and the like, and the type of the service parameters is not limited. For convenience of description, taking the transaction amount and the number of transactions as an example, assume that the accounts in this state yield 5 transaction amounts (e.g., 1000, 2000, 500, 900, 1100) and 5 numbers of transactions (e.g., 3, 5, 2, 6, 4).
In one example, for the process of "obtaining the average value of the parameter according to the service parameter", the average value of the parameter of the service parameter can be obtained by using the following formula:
where $\mu$ is the parameter average, $m$ is the number of service parameters, $x_i$ is the value of the $i$-th service parameter, and $l$ is the total number of service parameters. For example, the parameter average of the service parameter (transaction amount) may be (1000+2000+500+900+1100)/5 = 1100, and the parameter average of the service parameter (number of transactions) may be (3+5+2+6+4)/5 = 4.
In one example, for the process of "obtaining parameter variance according to the service parameter", the parameter variance of the service parameter can be obtained by using the following formula:
where $\sigma^2$ is the parameter variance, $\mu$ is the parameter average, $m$ is the number of service parameters, $x_i$ is the value of the $i$-th service parameter, and $l$ is the total number of service parameters. For example, the parameter variance of the service parameter (transaction amount) may be (10000+810000+360000+40000+0)/5 = 244000, and the parameter variance of the service parameter (number of transactions) may be (1+1+4+4+0)/5 = 2.
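The two worked examples above are consistent with the arithmetic mean and the population variance (dividing by the number of values rather than by one less). The sketch below reproduces them under that reading; the function names are hypothetical.

```python
from typing import Sequence


def parameter_mean(values: Sequence[float]) -> float:
    """Arithmetic mean of the service parameter values."""
    return sum(values) / len(values)


def parameter_variance(values: Sequence[float]) -> float:
    """Population variance: mean squared deviation from the parameter mean."""
    mu = parameter_mean(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


amounts = [1000, 2000, 500, 900, 1100]  # transaction amounts from the text
counts = [3, 5, 2, 6, 4]                # numbers of transactions from the text

print(parameter_mean(amounts), parameter_variance(amounts))  # 1100.0 244000.0
print(parameter_mean(counts), parameter_variance(counts))    # 4.0 2.0
```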
Through the above processing, the following global variables can be defined: {$\alpha$, $\beta$, $\gamma$, $\mu$, $\sigma^2$}, where $\alpha$ is …, $\beta$ is …, $\gamma$ is the total number $l$ of service parameters, $\mu$ is the parameter average, and $\sigma^2$ is the parameter variance. Moreover, in practical applications, a separate set of global variables {$\alpha$, $\beta$, $\gamma$, $\mu$, $\sigma^2$} may be maintained for the service parameter (transaction amount) and for the service parameter (number of transactions); the maintenance process of the global variables is not described again.
In one example, for the process of "obtaining the first probability value using the parameter mean and the parameter variance", the following may be included, but not limited to: the first probability value p (X | y) is obtained using the following formula.
In the above formula, $K$ is the number of service parameters; for example, for the service parameter (transaction amount) and the service parameter (number of transactions), the value of $K$ is 2. When $i = 1$, the data for the service parameter (transaction amount) is used, and when $i = 2$, the data for the service parameter (number of transactions) is used.
$\sigma_{i,y}^2$ is the parameter variance when the state is $y$ (i.e., the state of the second account, e.g., 1), $\mu_{i,y}$ is the parameter average when the state is $y$, and $x_i$ is the value of the $i$-th service parameter of the second account. For example, when $i = 1$, $\sigma_{i,y}^2$ is the parameter variance 244000 of the service parameter (transaction amount), $\mu_{i,y}$ is the parameter average 1100 of the service parameter (transaction amount), and $x_i$ is the transaction amount of the second account. When $i = 2$, $\sigma_{i,y}^2$ is the parameter variance 2 of the service parameter (number of transactions), $\mu_{i,y}$ is the parameter average 4 of the service parameter (number of transactions), and $x_i$ is the number of transactions of the second account.
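One plausible reading of the first probability value, given the symbols defined above and the later statement that it follows a normal distribution, is a product of $K$ independent normal densities, one per service parameter. The sketch below implements that reading; the independence assumption, the function name `first_probability`, and the second account's own values (1200 and 5) are all illustrative assumptions, not taken from the application.

```python
import math
from typing import Sequence


def first_probability(
    x: Sequence[float],          # x_i: the second account's service parameter values
    means: Sequence[float],      # mu_{i,y}: parameter averages for state y
    variances: Sequence[float],  # sigma^2_{i,y}: parameter variances for state y
) -> float:
    """Product over the K parameters of the normal density evaluated at x_i."""
    p = 1.0
    for x_i, mu, var in zip(x, means, variances):
        p *= math.exp(-((x_i - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
    return p


# K = 2: transaction amount and number of transactions, statistics from the text.
likelihood = first_probability([1200, 5], [1100, 4], [244000, 2])
print(likelihood)
```

Note that this is a density value rather than a probability bounded by 1; only its relative size across candidate states matters when the states are compared.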
In one example, the process for "obtaining the second probability value by using the status and the link strength of the first account (i.e. the link strength between the first account and the second account)" may include, but is not limited to, the following: the second probability value is obtained using the following formula:
or,
In the above formulas, Z may be a normalization function, T is a temperature constant, and t may be an annealing parameter whose value is between 0 and 1; each of them can be configured according to experience.
$r$ ranges from 1 to $M$, where $M$ represents the number of first accounts in the same state. For example, for a second account, if 6 first accounts match the second account, of which 4 have a normal status and the other 2 have an abnormal status, the link strengths between the 4 normal first accounts and the second account may participate in the calculation: when $r = 1$, $\omega_r$ represents the link strength between the 1st first account and the second account; when $r = 2$, $\omega_r$ represents the link strength between the 2nd first account and the second account; when $r = 3$, $\omega_r$ represents the link strength between the 3rd first account and the second account; and when $r = 4$, $\omega_r$ represents the link strength between the 4th first account and the second account. Since the status of these 4 first accounts is normal, the specified state included in the confidence indicator is normal. In addition, the link strengths between the 2 abnormal first accounts and the second account may participate in the calculation: when $r = 1$, $\omega_r$ represents the link strength between the 1st first account and the second account, and when $r = 2$, $\omega_r$ represents the link strength between the 2nd first account and the second account. Since the status of these 2 first accounts is abnormal, the specified state included in the confidence indicator is abnormal.
In one example, for the process of "obtaining the confidence indicator of the second account by using the first probability value and the second probability value", the confidence indicator of the second account may include a probability value that the second account is in a specified state, so both the specified state and the probability value need to be determined. The specified state is determined according to the second probability value: when the second probability value is obtained by using the link strengths between first accounts in the normal state and the second account, the specified state included in the confidence indicator is the normal state; when the second probability value is obtained by using the link strengths between first accounts in the abnormal state and the second account, the specified state included in the confidence indicator is the abnormal state. The probability value included in the confidence indicator may be the product of the first probability value and the second probability value.
The above process is a process of obtaining the confidence index of the second account in the first mode. In addition, in the process of obtaining the confidence index of the second account in the second mode, for the process of "obtaining the third probability value by using the state and the link strength of the first account", the process is similar to the process of "obtaining the second probability value by using the state and the link strength of the first account", except that the second probability value is changed into the third probability value, and details are not repeated here.
For the process of "obtaining the confidence index of the second account by using the third probability value", the confidence index of the second account may include a probability value that the second account is in a specified state. The specified state is determined according to the third probability value: when the third probability value is obtained by using the link strengths between first accounts in the normal state and the second account, the specified state included in the confidence index is the normal state; when the third probability value is obtained by using the link strengths between first accounts in the abnormal state and the second account, the specified state included in the confidence index is the abnormal state. The probability value included in the confidence index is the third probability value.
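The two modes can be summarized in a few lines. In this sketch the second or third probability value is taken as an input, since its formula is not reproduced in this text; the function names, the tuple layout, and the threshold values used in the example are hypothetical.

```python
from typing import Optional, Tuple


def confidence_index(
    specified_state: str,
    link_probability: float,
    first_probability: Optional[float] = None,
) -> Tuple[str, float]:
    """Return (specified state, probability value).

    Mode one: pass `first_probability`; the probability value is the product
    of the first probability value and the second probability value.
    Mode two: omit it; the third probability value is used directly.
    """
    if first_probability is None:
        return specified_state, link_probability
    return specified_state, first_probability * link_probability


def identify_by_confidence(index: Tuple[str, float], probability_threshold: float) -> Optional[str]:
    """Return the specified state if its probability exceeds the threshold, else None."""
    state, probability = index
    return state if probability > probability_threshold else None


mode_one = confidence_index("abnormal", 0.9, first_probability=0.8)
mode_two = confidence_index("abnormal", 0.9)
print(mode_one, mode_two)
print(identify_by_confidence(mode_two, probability_threshold=0.5))  # abnormal
```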
In the above process, the account status being normal or abnormal is taken as an example. In practical applications, normal may be represented by 0 and abnormal by 1, that is, the account status is 0 or 1. The account status may also take other values between 0 and 1, such as 0.1, 0.2, or 0.9, where 0.1 and 0.2 indicate a status closer to normal and 0.9 indicates a status closer to abnormal.
The following describes, in combination with a specific application scenario, the process of "obtaining a first probability value according to the parameter average and the parameter variance corresponding to the second account; obtaining a second probability value by using the state of the first account and the link strength; and obtaining the confidence index of the second account by using the first probability value and the second probability value".
According to bayesian theorem (describing the probability of occurrence of event Y in the case of occurrence of event X), the probability value p (Y | X) included in the confidence index can be obtained by using the following formula:
Based on this, the y that maximizes p(y | X) is taken as the state of the second account. Because p(X) does not depend on y, this is equivalent to finding the y that maximizes p(X | y) p(y). Therefore, for the purpose of finding this maximum, p(X) may be set to 1, that is, p(y | X) is treated as p(X | y) p(y).
Here, the first probability value p(X | y) is the likelihood function of the feature X. Assuming a normal distribution, the first probability value p(X | y) can be obtained based on the following formula:
Of course, in practical applications, likelihood functions of other distribution types may also be used, which is not limited here.
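As a hedged illustration only, the standard normal density for a single scalar feature value is written below. The symbols x (one observed feature value), \mu_y (the parameter average value for state y) and \sigma_y^{2} (the parameter variance for state y) are assumptions introduced here; the exact form used in the embodiment, including how several features are combined, may differ.

```latex
p(x \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^{2}}}
              \exp\!\left(-\frac{(x-\mu_y)^{2}}{2\sigma_y^{2}}\right)
```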
Here, the second probability value p(y) is the prior probability of the random field y. According to the Gibbs distribution, the second probability value p(y) can be obtained based on the following formula:
Further, based on the annealing algorithm, the second probability value p(y) may also be obtained based on the following formula:
Of course, in practical applications, other formulas can also be used, which is not limited here. The above formulas involve a normalization function, an energy function u(x), and a temperature constant T.
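A minimal sketch of a Gibbs-style prior with a temperature constant follows. It assumes the common form in which p(y) is proportional to exp(-U(y) / T); setting T to 1 gives the plain Gibbs form. The energy function used here, the link strength multiplied by the distance between the candidate state and the first account's state, is a hypothetical choice made only for illustration, as are the function and parameter names.

```python
import math


def gibbs_prior(candidate_states, neighbor_state, link_strength, temperature=1.0):
    """Return {state: p(state)} with p(y) proportional to exp(-U(y) / T).

    ASSUMPTION: U(y) = link_strength * |y - neighbor_state| is a hypothetical
    energy function chosen for illustration; the source does not define it.
    """
    energies = {y: link_strength * abs(y - neighbor_state) for y in candidate_states}
    weights = {y: math.exp(-u / temperature) for y, u in energies.items()}
    z = sum(weights.values())  # normalization so the probabilities sum to 1
    return {y: w / z for y, w in weights.items()}
```

Under this assumed energy, a strong link to an abnormal first account makes the abnormal state more probable for the second account, and a higher temperature flattens the distribution.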
In one example, the confidence index of the second account may be obtained in an iterative manner. For example, in the process of "acquiring the state of the second account", a state 1 (e.g., 0) may be generated for the second account, and the above processing is performed for state 1 to obtain a confidence index 1. It is then determined whether an iteration ending strategy is satisfied (e.g., whether the number of iterations has reached the maximum number of iterations, or whether the probability value included in the confidence index is greater than a preset threshold). If not, the flow returns to the process of "acquiring the state of the second account", a new state 2 (e.g., 0.1) is generated for the second account, and the above processing is performed for state 2 to obtain a confidence index 2. The iteration ending strategy is then checked again; if it is still not satisfied, the flow returns once more, a new state 3 (e.g., 0.2) is generated for the second account, and the above processing is performed for state 3 to obtain a confidence index 3, and so on until the iteration ending strategy is satisfied.
Further, after the iteration ending strategy is satisfied, the confidence index with the highest probability value may be selected from all the confidence indexes, and the confidence index with the highest probability value may be determined as the confidence index of the second account.
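The iteration described above can be sketched as follows. This is an illustrative sketch, not the embodiment's implementation: `score` is a hypothetical callable that returns the probability value for a candidate state (for example p(X | y) p(y)), and the default iteration limit and threshold are placeholders.

```python
def search_confidence(candidate_states, score, max_iterations=100, threshold=0.95):
    """Try candidate states in order and return (best_state, best_probability).

    `score(state)` is a hypothetical callable giving the probability value
    of the confidence index for that candidate state.
    """
    tried = []
    for iteration, state in enumerate(candidate_states, start=1):
        probability = score(state)
        tried.append((state, probability))
        # Iteration ending strategy: iteration cap reached, or the
        # probability value exceeds the preset threshold.
        if iteration >= max_iterations or probability > threshold:
            break
    if not tried:
        raise ValueError("at least one candidate state is required")
    # Select the confidence index with the highest probability value.
    return max(tried, key=lambda pair: pair[1])
```

For instance, with candidate states 0, 0.1, 0.2 and 0.9 and a score that peaks at 0.2, the loop stops at 0.2 once its probability value exceeds the threshold, and that state is returned.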
Based on the above technical scheme, in the embodiment of the application, the link strength of the first account and the second account can be obtained by using the scene weight corresponding to the scene identifier, the feature weight corresponding to the feature identifier, and the feature value corresponding to the feature identifier, and the state of the second account is identified by using the state of the first account and the link strength. Therefore, the account state can be effectively identified based on the link strength, namely, the account with the abnormal condition can be effectively identified, and the identification success rate is high. Moreover, when one account is abnormal, other accounts related to the account can be accurately found, and the business behaviors of other accounts are comprehensively and effectively monitored, so that the coverage of identifying the abnormal account is expanded, and the success rate of identifying the abnormal account is improved.
Moreover, the confidence index of the second account can be further analyzed by using the state and the link strength of the first account, and the state of the second account is identified based on the confidence index of the second account, so that the identification accuracy of the abnormal account is further improved, the abnormal account can be accurately identified, and the identification success rate is high.
The technical scheme of the embodiment of the application can be applied to transaction service scenes of electronic commerce websites and other scenes related to link strength, such as junk mails, false comments, fund networks, social networks and the like, as long as data of the service scenes can be acquired, and the application scene is not limited.
Based on the same application concept as the method, the embodiment of the application also provides an account state identification method, which may include the following steps: acquiring at least one scene identifier and at least one feature identifier corresponding to a first account in an abnormal state; acquiring a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier and a feature value corresponding to the feature identifier; and determining the abnormity of the second account by using the scene weight, the characteristic weight and the characteristic value. Further, the process of determining an anomaly of the second account using the scene weight, the feature weight, and the feature value may include: acquiring the link strength of the first account and the second account by using the scene weight, the feature weight and the feature value; and if the link strength is greater than a preset threshold value, identifying that the state of the second account is abnormal.
Compared with the flow shown in fig. 1, this account status identification method differs in that the first account is an account whose state is abnormal; therefore, when the link strength is greater than a preset threshold value, the state of the second account can be directly identified as abnormal. Other features are similar to the flow of fig. 1 and are not described again.
Based on the same application concept as the method described above, referring to fig. 3, which is another flowchart of an account status identification method in the embodiment of the present application, the account status identification method may include the following steps:
Step 301, acquiring the state of the first account and the link strength of the first account and the second account.
Step 302, a confidence index of the second account is obtained by using the status of the first account and the link strength.
Step 303, identifying the status of the second account by using the confidence index of the second account.
The above execution sequence is only an example given for convenience of description. In practical applications, the execution sequence of the steps may also be changed, and the execution sequence is not limited.
In an example, for the process of "acquiring the state of the first account and the link strength of the first account and the second account", the processing procedures in steps 101 to 103 may be referred to, and are not described herein again. Of course, the link strength of the first account and the second account may also be obtained in other manners, which is not limited to this.
In one example, the process of "obtaining a confidence index of the second account by using the state of the first account and the link strength" may include, but is not limited to, the following modes. In a first mode, a first probability value is obtained according to a parameter average value and a parameter variance corresponding to the second account; a second probability value is then obtained by using the state of the first account and the link strength; and the confidence index of the second account is obtained by using the first probability value and the second probability value. Alternatively, in a second mode, a third probability value is obtained by using the state of the first account and the link strength, and the confidence index of the second account is then obtained by using the third probability value.
In one example, the process of obtaining the first probability value according to the parameter average value and the parameter variance corresponding to the second account may include: acquiring the state of the second account and the service parameters of each account having that state; obtaining a parameter average value and a parameter variance (namely, the parameter average value and the parameter variance corresponding to the second account) according to the service parameters; and obtaining the first probability value according to the parameter average value and the parameter variance.
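A minimal sketch of this computation follows, assuming a single scalar service parameter and a normal distribution. The function name and inputs are hypothetical, and the returned value is a probability density used as the first probability value, which is an interpretation rather than something the text states.

```python
import math
import statistics


def first_probability(service_params, observed_value):
    """Normal density of observed_value under the peers' mean and variance.

    service_params: service parameter values of accounts that have the
    candidate state (hypothetical input; values must not all be equal).
    """
    mean = statistics.fmean(service_params)                    # parameter average value
    variance = statistics.pvariance(service_params, mu=mean)   # parameter variance
    if variance == 0:
        raise ValueError("variance is zero; the normal density is undefined")
    exponent = -((observed_value - mean) ** 2) / (2 * variance)
    return math.exp(exponent) / math.sqrt(2 * math.pi * variance)
```

An observed value close to the average of accounts in a given state yields a larger first probability value for that state than an observed value far from it.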
In one example, the confidence index may include, but is not limited to, a probability value that the second account is in a designated state, and the designated state may be abnormal or normal. Based on this, the process of "identifying the state of the second account by using the confidence index of the second account" may include, but is not limited to, the following: if the probability value is greater than a preset probability threshold, the state of the second account can be identified as the designated state. The preset probability threshold may be configured empirically. In addition, if the probability value is not greater than the preset probability threshold, it may be identified that the state of the second account is not the designated state, or whether the state of the second account is the designated state may be further analyzed in other manners, which is not described in detail herein.
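The threshold comparison can be sketched as below. The function name and the default threshold of 0.8 are placeholders introduced for illustration; the text only states that the threshold is configured empirically.

```python
def identify_state(probability, designated_state, threshold=0.8):
    """Return designated_state if probability exceeds threshold, else None.

    None means the account is not identified as the designated state; the
    caller may then analyze the account in other ways. The default
    threshold is a made-up placeholder.
    """
    if probability > threshold:
        return designated_state
    return None
```

The comparison is strict, matching the "greater than" wording, so a probability value equal to the threshold does not identify the designated state.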
For the detailed flows of step 302 and step 303, reference may be made to the flow of fig. 1, which is not described again.
Based on the same application concept as the method described above, an embodiment of the present application further provides an account status recognition apparatus, as shown in fig. 4, which is a structural diagram of the account status recognition apparatus, and the apparatus includes: a first obtaining module 401, configured to obtain at least one scene identifier and at least one feature identifier corresponding to a first account; a second obtaining module 402, configured to obtain a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier, and a feature value corresponding to the feature identifier; a third obtaining module 403, configured to obtain link strengths of the first account and the second account by using the scene weight, the feature weight, and the feature value; the identifying module 404 is configured to identify the status of the second account according to the status and the link strength of the first account.
In an example, the second obtaining module 402 is specifically configured to, in a process of obtaining a scene weight corresponding to the scene identifier and a feature weight corresponding to the feature identifier, obtain the scene weight corresponding to the scene identifier and the feature weight corresponding to the feature identifier by querying a mapping relationship through the scene identifier and the feature identifier, where the mapping relationship is used for recording the correspondence between the scene identifier and the scene weight and the correspondence between the feature identifier and the feature weight; and, in the process of obtaining the feature value corresponding to the feature identifier, set the feature value corresponding to the feature identifier to a first numerical value if the feature identifier is normal, and set the feature value corresponding to the feature identifier to a second numerical value if the feature identifier is abnormal.
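The mapping lookup and the feature-value assignment can be sketched as follows. All identifiers, weights, and the two numerical values are invented for illustration and do not come from the embodiment.

```python
# Hypothetical mapping relationship; identifiers and weights are invented.
SCENE_WEIGHTS = {"login": 0.6, "payment": 0.9}
FEATURE_WEIGHTS = {"same_device": 0.7, "same_ip": 0.4}

NORMAL_VALUE = 0.0    # assumed "first numerical value"
ABNORMAL_VALUE = 1.0  # assumed "second numerical value"


def lookup_weights(scene_id, feature_id):
    """Query the mapping relationship for the scene and feature weights."""
    return SCENE_WEIGHTS[scene_id], FEATURE_WEIGHTS[feature_id]


def feature_value(is_abnormal):
    """Map the normal/abnormal condition of a feature to its value."""
    return ABNORMAL_VALUE if is_abnormal else NORMAL_VALUE
```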
The third obtaining module 403 is specifically configured to, in a process of obtaining the link strengths of the first account and the second account by using the scene weight, the feature weight, and the feature value, process the scene weight, the feature weight, and the feature value according to a hyperbolic tangent function to obtain the link strengths of the first account and the second account; or process the scene weight, the feature weight and the feature value according to an S-shaped growth curve function to obtain the link strength of the first account and the second account; or process the scene weight, the feature weight and the feature value according to a logarithmic function to obtain the link strength of the first account and the second account.
The third obtaining module 403 is specifically configured to, in the process of processing the scene weight, the feature weight, and the feature value according to a hyperbolic tangent function and obtaining the link strength of the first account and the second account, obtain the link strength of the first account and the second account by using the following formula:
where $\omega_{AB}$ represents the link strength of the first account and the second account, tanh represents the hyperbolic tangent function, $\eta$ represents the smoothing parameter of the hyperbolic tangent function, $m$ represents the number of scenes, $s_i$ represents the scene weight corresponding to the i-th scene identifier, $n_i$ represents the number of features corresponding to the i-th scene identifier, $\theta_{ij}$ represents the feature weight corresponding to the j-th feature identifier corresponding to the i-th scene identifier, and $x_{ij}$ represents the feature value corresponding to the j-th feature identifier corresponding to the i-th scene identifier.
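The following sketch combines the symbols defined above under an assumed form, tanh(η · Σ_i s_i · Σ_j θ_ij · x_ij). This form is inferred from the symbol definitions and is an assumption rather than a confirmed reproduction of the embodiment's formula; the function name and data layout are hypothetical.

```python
import math


def link_strength(scenes, eta=1.0):
    """Compute a link strength from per-scene weights and features.

    scenes: list of (scene_weight, [(feature_weight, feature_value), ...]).
    ASSUMED form: tanh(eta * sum_i s_i * sum_j theta_ij * x_ij).
    """
    total = sum(
        s_i * sum(theta * x for theta, x in features)
        for s_i, features in scenes
    )
    return math.tanh(eta * total)
```

With non-negative weights and feature values, the hyperbolic tangent keeps the result in the range from 0 (inclusive) to 1 (exclusive), and a smaller η makes the strength grow more slowly as shared features accumulate.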
The identifying module 404 is specifically configured to, in the process of identifying the state of the second account through the state of the first account and the link strength, identify that the state of the second account is abnormal if the state of the first account is abnormal and the link strength is greater than a preset first threshold; if the state of the first account is normal and the link strength is greater than a preset second threshold, identifying that the state of the second account is normal; or obtaining a confidence index of the second account by using the state of the first account and the link strength, and identifying the state of the second account by using the confidence index of the second account.
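The two threshold rules can be sketched as below. The state labels, function name, and default thresholds are placeholders chosen for illustration; returning None stands for the case where neither rule applies and the confidence index approach may be used instead.

```python
def identify_second_account(first_state, strength,
                            first_threshold=0.8, second_threshold=0.8):
    """Return "abnormal", "normal", or None when neither rule fires.

    Threshold defaults are placeholders, not values from the source.
    """
    if first_state == "abnormal" and strength > first_threshold:
        return "abnormal"
    if first_state == "normal" and strength > second_threshold:
        return "normal"
    return None  # caller may fall back to the confidence index approach
```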
The identifying module 404 is specifically configured to, in a process of obtaining a confidence index of the second account by using the state of the first account and the link strength, obtain a first probability value according to a parameter average value and a parameter variance corresponding to the second account, obtain a second probability value by using the state of the first account and the link strength, and obtain the confidence index of the second account by using the first probability value and the second probability value; or obtain a third probability value by using the state of the first account and the link strength, and obtain the confidence index of the second account by using the third probability value. The confidence index includes a probability value that the second account is in a designated state, where the designated state is abnormal or normal.
The identifying module 404 is specifically configured to, in the process of identifying the state of the second account by using the confidence index, identify that the state of the second account is the designated state if the probability value is greater than a preset probability threshold.
Similar to the above account status recognition apparatus, an embodiment of the present application further provides another account status recognition apparatus, where the apparatus includes: a first obtaining module, configured to obtain at least one scene identifier and at least one feature identifier corresponding to a first account whose state is abnormal; a second obtaining module, configured to obtain a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier, and a feature value corresponding to the feature identifier; and a determining module, configured to determine the abnormity of the second account by using the scene weight, the feature weight and the feature value. Further, the determining module is specifically configured to obtain the link strength of the first account and the second account by using the scene weight, the feature weight, and the feature value, and, when the link strength is greater than a preset threshold value, identify that the state of the second account is abnormal.
An embodiment of the present application further provides another account status identification device, as shown in fig. 5, which is a structure diagram of the account status identification device, where the device includes: a first obtaining module 501, configured to obtain a state of a first account and link strength of the first account and a second account; a second obtaining module 502, configured to obtain a confidence indicator of the second account by using the state of the first account and the link strength; the identifying module 503 is configured to identify the status of the second account by using the confidence index of the second account.
The second obtaining module 502 is specifically configured to, in a process of obtaining a confidence index of the second account by using the state of the first account and the link strength, obtain a first probability value according to a parameter average value and a parameter variance corresponding to the second account, obtain a second probability value by using the state of the first account and the link strength, and obtain the confidence index of the second account by using the first probability value and the second probability value; or obtain a third probability value by using the state of the first account and the link strength, and obtain the confidence index of the second account by using the third probability value. The confidence index includes a probability value that the second account is in a designated state, where the designated state is abnormal or normal.
The identifying module 503 is specifically configured to, in the process of identifying the state of the second account by using the confidence index, identify that the state of the second account is the designated state if the probability value is greater than a preset probability threshold.
Based on the same application concept as the method, the embodiment of the present application further provides a server, where the server includes a processor and a memory: the processor is used for acquiring at least one scene identifier and at least one feature identifier corresponding to the first account; acquiring a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier and a feature value corresponding to the feature identifier; acquiring the link strength of the first account and the second account by using the scene weight, the feature weight and the feature value; and identifying the state of the second account through the state of the first account and the link strength.
The embodiment of the application also provides another server, which comprises a processor and a memory, wherein the processor is used for: acquiring at least one scene identifier and at least one feature identifier corresponding to the first account with the abnormal state; acquiring a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier and a feature value corresponding to the feature identifier; and determining the abnormity of the second account by using the scene weight, the feature weight and the feature value.
The embodiment of the application also provides another server, which comprises a processor and a memory, wherein the processor is used for: acquiring the state of a first account and the link strength of the first account and a second account; obtaining a confidence index of the second account by using the state of the first account and the link strength; and identifying the state of the second account by using the confidence index of the second account.
Based on the same application concept as the method, the embodiment of the present application further provides a machine-readable storage medium, which can be applied to a server, where the machine-readable storage medium stores thereon several computer instructions, and the computer instructions, when executed, perform the following processes: acquiring at least one scene identifier and at least one characteristic identifier corresponding to a first account; acquiring a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier and a feature value corresponding to the feature identifier; acquiring the link strength of the first account and the second account by using the scene weight, the feature weight and the feature value; and identifying the state of the second account through the state of the first account and the link strength.
Another machine-readable storage medium, which may be applied to a server, is provided in an embodiment of the present application, where the machine-readable storage medium stores thereon several computer instructions, and when executed, the computer instructions perform the following processes: acquiring at least one scene identifier and at least one feature identifier corresponding to a first account in an abnormal state; acquiring a scene weight corresponding to the scene identifier, a feature weight corresponding to the feature identifier and a feature value corresponding to the feature identifier; and determining the abnormity of the second account by using the scene weight, the characteristic weight and the characteristic value.
Another machine-readable storage medium, which may be applied to a server, is provided in an embodiment of the present application, where the machine-readable storage medium stores thereon several computer instructions, and when executed, the computer instructions perform the following processes: acquiring the state of a first account and the link strength of the first account and a second account; obtaining a confidence index of the second account by using the state of the first account and the link strength; and identifying the state of the second account by using the confidence index of the second account.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and the units are described separately. Of course, when implementing the present application, the functionality of the units may be implemented in one or more pieces of software and/or hardware.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.