CN109993538A - Identity theft detection method based on probability graph model - Google Patents
Identity theft detection method based on probability graph model
- Publication number: CN109993538A
- Application number: CN201910148549.8A
- Authority: CN (China)
- Prior art keywords: probability, graph model, formula, feature, network payment
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/382—Payment protocols; Details thereof insuring higher security of transaction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
- G06Q20/4014—Identity check for transactions
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Computer Security & Cryptography (AREA)
- Finance (AREA)
- Strategic Management (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention provides an identity theft detection method based on a probability graph model (probabilistic graphical model), comprising the steps of: S1: collecting and preprocessing network payment transaction data to obtain a network payment transaction feature set; S2: building a probability graph model from the network payment transaction feature set; S3: inputting a training set and training the parameters of the probability graph model, while obtaining the conditional probability parameters of the probability graph model using Bayes' theorem; S4: predicting an input prediction set using the conditional probability parameters and Bayes' theorem to obtain a prediction result. Based on a probability graph model, the method detects network payment fraud by modeling users' overall behavior and attributes, allows the detection model to be tuned dynamically online, and improves both the accuracy of intercepting fraudulent transactions and the robustness of the model.
Description
Technical field
The present invention relates to the field of anti-fraud detection for internet-finance network payments, and in particular to an identity theft detection method based on a probability graph model.
Background art
The mobile internet is a double-edged sword: while it makes people's lives more convenient, it also brings various hidden dangers. For example, online transaction payment platforms let people shop and pay anytime and anywhere without leaving home, but this convenience and speed also gives illegal attackers an opening. By stealing a user's account information, an attacker can obtain the user's private personal information, or even impersonate the user to make transactions or transfers and complete the fraud. Therefore, to effectively safeguard the interests of users and companies, an effective network payment fraud detection system needs to be established.
Some network payment fraud detection models based on machine learning, and even deep learning, already exist. The vast majority of these learning models are discriminative models based on expectation maximization. For online anti-fraud models, although models such as deep learning may outperform other methods as a network payment anti-fraud approach, a deep learning model is a typical black-box model: its results are not interpretable and are not sufficiently convincing.
Summary of the invention
In view of the above deficiencies of the prior art, the present invention provides an identity theft detection method based on a probability graph model. Based on a probability graph model, the method detects network payment fraud by modeling users' overall behavior, allows the detection model to be tuned dynamically online, and improves both the accuracy of intercepting fraudulent transactions and the robustness of the model.
To achieve the above object, the present invention provides an identity theft detection method based on a probability graph model, comprising the steps of:
S1: collecting and preprocessing network payment transaction data to obtain a network payment transaction feature set;
S2: building a probability graph model from the network payment transaction feature set;
S3: inputting a training set and training the parameters of the probability graph model, while obtaining the conditional probability parameters of the probability graph model using Bayes' theorem;
S4: predicting an input prediction set using the conditional probability parameters and Bayes' theorem to obtain a prediction result.
Preferably, step S1 further comprises the steps of:
S11: a data cleaning step: filling in missing values, smoothing noise, and identifying and resolving inconsistencies in the network payment transaction data, so as to format the data, remove abnormal data, correct errors, and remove duplicate data;
S12: a data integration step: storing the network payment transaction data from multiple data sources together to form a database;
S13: standardizing the network payment transaction data in the database to form the network payment transaction feature set.
Preferably, step S2 further comprises the steps of:
S21: obtaining the network payment transaction feature set $\theta$, and inputting a candidate feature set $\theta'$, a relation set $R$, a label attribute $Y$ and a threshold $\lambda$, where $\theta' = \Phi$, $R = \Phi$, and $\Phi$ denotes the empty set;
S22: calculating the mutual information $I$ between a feature $X_i$ of the network payment transaction feature set $\theta$ and the label attribute $Y$ according to formula (1):
$$I(X_i;Y)=\sum_{x\in X_i}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}\tag{1}$$
where $X_i$ denotes the $i$-th feature; $i$ is a natural number greater than or equal to 1; $Y$ denotes the label attribute; $x$ denotes a value of $X_i$; $y$ denotes a value of $Y$; $p(x,y)$ is the joint probability of $x$ and $y$; $p(x)$ is the marginal probability of $x$; $p(y)$ is the marginal probability of $y$; and $I(X_i;Y)$ denotes the mutual information between $X_i$ and $Y$;
S22: judging whether $I(X_i;Y)$ is greater than or equal to the preset threshold $\lambda$; if so, continuing with the subsequent steps;
S23: updating the candidate feature set $\theta'$ according to formula (2):
$$\theta' := \theta' + X_i\tag{2}$$
and updating the network payment transaction feature set $\theta$ according to formula (3):
$$\theta := \theta - X_i\tag{3}$$
S24: obtaining the dependency $r$, $r: X_i \to Y$;
S25: updating the relation set $R$ according to formula (4):
$$R := R + r\tag{4}$$
S26: judging whether the number of features in the current candidate feature set $\theta'$ is greater than or equal to 2; if so, continuing with the subsequent steps, otherwise returning to step S23;
S27: calculating the mutual information between each pair of features in the current candidate feature set $\theta'$ according to formula (5):
$$I(X_i;X_j)=\sum_{x\in X_i}\sum_{x'\in X_j}p(x,x')\log\frac{p(x,x')}{p(x)\,p(x')}\tag{5}$$
where $X_i$ denotes the $i$-th feature in $\theta'$ and $X_j$ denotes the $j$-th feature in $\theta'$; $i$ and $j$ are natural numbers greater than or equal to 1; $x$ denotes a value of $X_i$; $x'$ denotes a value of $X_j$; $p(x,x')$ is the joint probability of $x$ and $x'$; $p(x)$ is the marginal probability of $x$; $p(x')$ is the marginal probability of $x'$; and $I(X_i;X_j)$ denotes the mutual information between $X_i$ and $X_j$; and updating the relation set $R$ by formula (4);
S28: assigning the current $\theta'$ to $\theta$ and emptying the set $\theta'$; calculating the mutual information between each pair of features in the set $\theta$ by formula (5); if $I(X_i;X_j) \ge \lambda$, determining the dependency $r$ between the two features according to prior knowledge, and then updating the relation set $R$ by formula (4);
S29: repeating step S28 until $\theta$ is empty or $I(X_i;X_j) \le \lambda$ for all features, and then obtaining the probability graph model from the current relation set $R$.
Preferably, step S3 further comprises the steps of:
S31: inputting a training set, the training set including feature attributes and label attributes;
S32: calculating the conditional probability parameters according to formula (6):
$$p_{train}(A_i\mid B)=\frac{p(A_i)\,p(B\mid A_i)}{\sum_{j}p(A_j)\,p(B\mid A_j)}\tag{6}$$
where $A_i$ denotes the $i$-th parent node of the probability graph model; $B$ denotes a child node of $A_i$; $p_{train}(A_i\mid B)$ denotes the conditional probability parameter between $A_i$ and $B$; $p(A_i)$ denotes the marginal probability of $A_i$; $p(B\mid A_i)$ denotes the probability that $B$ occurs given $A_i$; $A_j$ denotes the $j$-th parent node; $p(A_j)$ denotes the marginal probability of $A_j$; and $p(B\mid A_j)$ denotes the probability that $B$ occurs given $A_j$;
S33: judging whether formula (6) has converged; if so, continuing with the subsequent steps, otherwise returning to step S31.
Preferably, step S4 further comprises the steps of:
S41: inputting a test set, the test set including the feature attribute $Y'$;
S42: calculating a posterior probability according to formula (7), and outputting the prediction result according to the posterior probability:
$$p(Y'\mid X_1,\dots,X_n)=\frac{P(X_1,\dots,X_n\mid Y')\,P(Y')}{P(X_1,\dots,X_n)}\tag{7}$$
where $p(Y'\mid X_1,\dots,X_n)$ denotes the probability that $Y'$ occurs given $X_1,\dots,X_n$; $P(X_1,\dots,X_n\mid Y')$ denotes the joint probability of $X_1,\dots,X_n$ given $Y'$; $P(Y')$ denotes the marginal probability of $Y'$; and $P(X_1,\dots,X_n)$ denotes the joint probability of $X_1,\dots,X_n$.
Preferably, the method further comprises, after step S4, the step of:
S5: verifying the prediction result.
Preferably, step S5 further comprises the steps of:
S51: counting, from the prediction result of the model of formula (7), the number TP of positive samples judged positive, the number FP of negative samples judged positive, the number FN of positive samples judged negative, and the number TN of negative samples judged negative;
S52: calculating the precision according to formula (8):
$$\mathrm{precision}=\frac{TP}{TP+FP}\tag{8}$$
calculating the recall according to formula (9):
$$\mathrm{recall}=\frac{TP}{TP+FN}\tag{9}$$
and calculating the disturbance rate disturb according to formula (10):
$$\mathrm{disturb}=\frac{FP}{FP+TN}\tag{10}$$
S53: evaluating the prediction result according to the precision, the recall and the disturbance rate.
By adopting the above technical solution, the present invention has the following beneficial effects:
A Bayesian probability graph model tends to be highly interpretable and convincing when making predictions on data. The probability graph model is trained on the training set to obtain the conditional probability parameters; when predicting on the test set, the prior knowledge and the conditions in the test set are used to obtain the conditional probabilities and finally derive the posterior probability, so the result is highly convincing. The probability graph model can also handle cases in which hidden variables exist. Building on a probability graph model improves the interpretability of the model and provides a better guarantee for detecting and intercepting fraudulent transactions and for protecting the funds of users and enterprises.
Brief description of the drawings
Fig. 1 is the overall flowchart of the identity theft detection method based on a probability graph model according to an embodiment of the present invention;
Fig. 2 is the probability graph model obtained by modeling bank data according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of part of the detailed flow of the identity theft detection method based on a probability graph model according to an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are given below with reference to Figs. 1 to 3 and described in detail, so that the functions and features of the invention can be better understood.
Referring to Figs. 1 to 3, an identity theft detection method based on a probability graph model according to an embodiment of the present invention comprises the steps of:
S1: collecting and preprocessing network payment transaction data to obtain a network payment transaction feature set.
Step S1 further comprises the steps of:
S11: a data cleaning step: filling in missing values, smoothing noise, and identifying and resolving inconsistencies in the network payment transaction data, so as to format the data, remove abnormal data, correct errors, and remove duplicate data;
S12: a data integration step: storing the network payment transaction data from multiple data sources together to form a database;
S13: standardizing the network payment transaction data in the database to form the network payment transaction feature set.
Although internet finance has by now produced abundant transaction data, real-world data are generally incomplete, inconsistent dirty data that cannot take part directly in model computation, so the raw data must be preprocessed.
(1) Data cleaning: clean the data by filling in missing values, smoothing noisy data, and identifying or resolving inconsistencies. The main goals are standardizing data formats (such as time), removing abnormal data, correcting errors, and removing duplicate data.
(2) Data integration: combine the data from multiple data sources, store them together, and build a data warehouse.
(3) Data transformation: convert the data into the form required by the learning model through smoothing, aggregation, data generalization, standardization and similar means.
For example, the original fields of the data and their types after preprocessing are shown in Table 1.
Table 1: Original fields and types after preprocessing
Field name | Data type | Field description | Type after preprocessing |
Transaction_Time | String | Time at which the transaction occurred, accurate to the second | Integer |
Check | String | Signature verification method of the transaction | Integer |
Transaction_Type | String | Type of the transaction | Integer |
Transaction_Amount | Float | Transaction amount, in RMB | Integer |
Merchant_Code | String | Merchant number of the transaction | Integer |
IP | String | Whether the transaction uses a commonly used IP | Integer |
Sign | String | Label of the transaction | Integer |
As Table 1 shows, most of the available original fields are of string type, whereas a probability graph model can only process discrete variables. Preprocessing therefore includes not only data cleaning and data integration; during data transformation, continuous floating-point values are also converted into discrete variables that the probability graph model can compute with.
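The cleaning and discretization described above can be sketched in plain Python. This is only an illustration under stated assumptions: the field names follow Table 1, but the timestamp format, the amount bin edges, the choice of hour-of-day as the integer time feature, and the decision to drop (rather than fill in) rows with missing values are all hypothetical simplifications, not details given in the source.

```python
from datetime import datetime

def encode_categories(values):
    """Map each distinct string to an integer code, in first-seen order."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values], codes

def discretize_amount(amount, edges=(100.0, 1000.0, 10000.0)):
    """Return the index of the bin `amount` falls into (bin edges are assumed)."""
    for i, edge in enumerate(edges):
        if amount < edge:
            return i
    return len(edges)

def hour_of_day(timestamp, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a second-resolution timestamp string to an integer hour (0-23)."""
    return datetime.strptime(timestamp, fmt).hour

def preprocess(records):
    """Drop exact duplicates and rows with missing values, then discretize."""
    seen, clean = set(), []
    for r in records:
        key = tuple(sorted(r.items()))  # field names are unique, so only names are compared
        if key in seen or any(v is None for v in r.values()):
            continue  # simplification: rows with missing values are dropped, not filled
        seen.add(key)
        clean.append(r)
    type_codes, _ = encode_categories([r["Transaction_Type"] for r in clean])
    return [
        {
            "Transaction_Time": hour_of_day(r["Transaction_Time"]),
            "Transaction_Type": code,
            "Transaction_Amount": discretize_amount(r["Transaction_Amount"]),
        }
        for r, code in zip(clean, type_codes)
    ]
```

Every output field is a small integer, which matches the "Type after preprocessing" column of Table 1; in practice the bin edges would be chosen from the data rather than fixed in advance.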
S2: building a probability graph model from the network payment transaction feature set.
A complete probability graph is constructed by analyzing the dependence and independence between features. Constructing a probability graph amounts to constructing the joint distribution over the data features, and dependence and independence are the two main properties of that distribution. Independence properties are extremely important when answering queries, because they can be used to fundamentally reduce the computational cost of inference.
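To make the cost argument concrete, the following sketch compares the number of entries in an unfactored joint table with the number of entries needed once the distribution is factored into per-node conditional tables. The variable cardinalities and the parent structure are hypothetical values chosen for illustration; they are not taken from Fig. 2.

```python
from math import prod

def full_joint_size(cardinalities):
    """Number of entries in an unfactored joint table over all variables."""
    return prod(cardinalities.values())

def factored_size(cardinalities, parents):
    """Total entries across the conditional tables p(node | parents(node))."""
    return sum(
        cardinalities[node] * prod(cardinalities[p] for p in parents.get(node, ()))
        for node in cardinalities
    )

# Hypothetical numbers of discrete values per field.
cards = {"Sign": 2, "IP": 2, "Check": 4, "Transaction_Type": 5, "Transaction_Time": 24}
# Hypothetical structure: every feature depends only on the label Sign.
parents = {name: ("Sign",) for name in cards if name != "Sign"}

print(full_joint_size(cards), factored_size(cards, parents))
```

The more independence the graph encodes, the fewer parents each node has and the smaller the tables that training and inference must touch.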
In this embodiment, the algorithm environment of this step is based on Python and NumPy.
Step S2 further comprises the steps of:
S21: obtaining the network payment transaction feature set $\theta$, and inputting a candidate feature set $\theta'$, a relation set $R$, a label attribute $Y$ and a threshold $\lambda$, where $\theta' = \Phi$, $R = \Phi$, and $\Phi$ denotes the empty set;
S22: calculating the mutual information $I$ between a feature $X_i$ of the network payment transaction feature set $\theta$ and the label attribute $Y$ according to formula (1):
$$I(X_i;Y)=\sum_{x\in X_i}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}\tag{1}$$
where $X_i$ denotes the $i$-th feature; $i$ is a natural number greater than or equal to 1; $Y$ denotes the label attribute; $x$ denotes a value of $X_i$; $y$ denotes a value of $Y$; $p(x,y)$ is the joint probability of $x$ and $y$; $p(x)$ is the marginal probability of $x$; $p(y)$ is the marginal probability of $y$; and $I(X_i;Y)$ denotes the mutual information between $X_i$ and $Y$;
S22: judging whether $I(X_i;Y)$ is greater than or equal to the preset threshold $\lambda$; if so, continuing with the subsequent steps;
S23: updating the candidate feature set $\theta'$ according to formula (2):
$$\theta' := \theta' + X_i\tag{2}$$
and updating the network payment transaction feature set $\theta$ according to formula (3):
$$\theta := \theta - X_i\tag{3}$$
S24: obtaining the dependency $r$, $r: X_i \to Y$;
S25: updating the relation set $R$ according to formula (4):
$$R := R + r\tag{4}$$
S26: judging whether the number of features in the current candidate feature set $\theta'$ is greater than or equal to 2; if so, continuing with the subsequent steps, otherwise returning to step S23;
S27: calculating the mutual information between each pair of features in the current candidate feature set $\theta'$ according to formula (5):
$$I(X_i;X_j)=\sum_{x\in X_i}\sum_{x'\in X_j}p(x,x')\log\frac{p(x,x')}{p(x)\,p(x')}\tag{5}$$
where $X_i$ denotes the $i$-th feature in $\theta'$ and $X_j$ denotes the $j$-th feature in $\theta'$; $i$ and $j$ are natural numbers greater than or equal to 1; $x$ denotes a value of $X_i$; $x'$ denotes a value of $X_j$; $p(x,x')$ is the joint probability of $x$ and $x'$; $p(x)$ is the marginal probability of $x$; $p(x')$ is the marginal probability of $x'$; and $I(X_i;X_j)$ denotes the mutual information between $X_i$ and $X_j$; and updating the relation set $R$ by formula (4);
S28: assigning the current $\theta'$ to $\theta$ and emptying the set $\theta'$; calculating the mutual information between each pair of features in the set $\theta$ by formula (5); if $I(X_i;X_j) \ge \lambda$, determining the dependency $r$ between the two features according to prior knowledge, and then updating the relation set $R$ by formula (4);
S29: repeating step S28 until $\theta$ is empty or $I(X_i;X_j) \le \lambda$ for all features, and then obtaining the probability graph model from the current relation set $R$.
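The structure-building steps above can be condensed into a short sketch. It estimates mutual information from empirical frequencies, keeps the features whose mutual information with the label reaches the threshold, and then links pairs of kept features that are themselves strongly dependent. The function names, the single-pass simplification of the S26 to S29 loop, and the placeholder edge direction between features are assumptions made for illustration; the source orients feature-to-feature edges using prior knowledge, which code cannot supply.

```python
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )

def build_edges(features, label, lam):
    """Return (selected feature names, edges) for threshold `lam`.

    `features` maps a feature name to its column of discrete values;
    `label` is the column of the label attribute Y.
    """
    # S22-S23: keep features whose mutual information with Y reaches lam.
    selected = [name for name, col in features.items()
                if mutual_information(col, label) >= lam]
    # S24-S25: each kept feature contributes the dependency X_i -> Y.
    edges = [(name, "Y") for name in selected]
    # S27: pairwise mutual information among the kept features.
    for a, b in combinations(selected, 2):
        if mutual_information(features[a], features[b]) >= lam:
            # Placeholder orientation a -> b; the source decides the
            # direction from prior knowledge.
            edges.append((a, b))
    return selected, edges
```

The returned edge list plays the role of the relation set R: it is enough to define the graph whose conditional tables are then trained in step S3.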
S3: inputting a training set and training the parameters of the probability graph model, while obtaining the conditional probability parameters of the probability graph model using Bayes' theorem.
The main role of this step is to train the parameters of the model. In essence, training a probability graph model means counting the marginal probability of each feature in the training set, computing the joint probability of the features (referred to here as the posterior probability), and then using both as conditions to infer, by Bayes' theorem, the conditional probabilities in the probability graph, which are the parameters of the model.
In this embodiment, the algorithm environment of this step is based on Python, the pgmpy probability graph model library, and the pandas data analysis tool.
Step S3 further comprises the steps of:
S31: inputting a training set, the training set including feature attributes and label attributes;
S32: calculating the conditional probability parameters according to formula (6):
$$p_{train}(A_i\mid B)=\frac{p(A_i)\,p(B\mid A_i)}{\sum_{j}p(A_j)\,p(B\mid A_j)}\tag{6}$$
where $A_i$ denotes the $i$-th parent node of the probability graph model; $B$ denotes a child node of $A_i$; $p_{train}(A_i\mid B)$ denotes the conditional probability parameter between $A_i$ and $B$; $p(A_i)$ denotes the marginal probability of $A_i$; $p(B\mid A_i)$ denotes the probability that $B$ occurs given $A_i$; $A_j$ denotes the $j$-th parent node; $p(A_j)$ denotes the marginal probability of $A_j$; and $p(B\mid A_j)$ denotes the probability that $B$ occurs given $A_j$;
S33: judging whether formula (6) has converged; if so, continuing with the subsequent steps, otherwise returning to step S31.
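As a rough illustration of this training step, the sketch below estimates a conditional probability table by relative frequency and applies Bayes' theorem in the form of formula (6). It is written in plain Python rather than with the pgmpy library named in the embodiment, and the function names and the dictionary-based table layout are assumptions made for the example.

```python
from collections import Counter, defaultdict

def estimate_cpt(child, parents, rows):
    """Estimate p(child | parents) by relative frequency over `rows`.

    `rows` is a list of dicts and `parents` a tuple of column names.
    Returns {parent_values: {child_value: probability}}.
    """
    counts = defaultdict(Counter)
    for r in rows:
        counts[tuple(r[p] for p in parents)][r[child]] += 1
    return {
        pv: {cv: c / sum(cnt.values()) for cv, c in cnt.items()}
        for pv, cnt in counts.items()
    }

def bayes_posterior(prior, likelihood):
    """Formula (6): p(A_i | B) = p(A_i) p(B | A_i) / sum_j p(A_j) p(B | A_j).

    `prior` maps each A_i to p(A_i); `likelihood` maps each A_i to p(B | A_i).
    """
    evidence = sum(prior[a] * likelihood[a] for a in prior)
    return {a: prior[a] * likelihood[a] / evidence for a in prior}
```

For instance, `bayes_posterior({"fraud": 0.01, "normal": 0.99}, {"fraud": 0.9, "normal": 0.1})` combines a hypothetical 1% prior with hypothetical likelihoods of an observed feature value; the numbers are illustrative only.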
S4: predicting an input prediction set using the conditional probability parameters and Bayes' theorem to obtain a prediction result.
The main role of this step is to make a judgment on unknown records: for a real-time transaction record, the model gives a prediction result, that is, it judges whether the transaction is a normal transaction or a fraudulent one. The prediction process also relies mainly on Bayes' theorem: taking the features in the transaction record as the condition, the posterior probability of the record is inferred by Bayes' theorem from the conditional probabilities in the model.
In this embodiment, the algorithm environment of this step is based on Python, the pgmpy probability graph model library, the pandas data analysis tool, and NumPy.
Step S4 further comprises the steps of:
S41: inputting a test set, the test set including the feature attribute $Y'$;
S42: inference with a Bayesian network means deriving the posterior probability from the conditional probabilities obtained in training and the conditions in the test set; calculating the posterior probability according to formula (7), and outputting the prediction result according to the posterior probability:
$$p(Y'\mid X_1,\dots,X_n)=\frac{P(X_1,\dots,X_n\mid Y')\,P(Y')}{P(X_1,\dots,X_n)}\tag{7}$$
where $p(Y'\mid X_1,\dots,X_n)$ denotes the probability that $Y'$ occurs given $X_1,\dots,X_n$; $P(X_1,\dots,X_n\mid Y')$ denotes the joint probability of $X_1,\dots,X_n$ given $Y'$; $P(Y')$ denotes the marginal probability of $Y'$; and $P(X_1,\dots,X_n)$ denotes the joint probability of $X_1,\dots,X_n$.
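A minimal sketch of the posterior computation in formula (7) follows. To keep it short, it assumes the features are conditionally independent given the label, so that the likelihood factors into one term per feature; this is a simplification, since the graph built in step S2 may also contain feature-to-feature dependencies. The add-alpha smoothing and the `n_values` parameter are likewise illustrative choices, not part of the source.

```python
from collections import Counter, defaultdict

def fit(rows, label_key):
    """Count label frequencies and per-feature value frequencies given the label."""
    label_counts = Counter(r[label_key] for r in rows)
    value_counts = defaultdict(Counter)  # (feature, label) -> Counter of values
    for r in rows:
        for k, v in r.items():
            if k != label_key:
                value_counts[(k, r[label_key])][v] += 1
    return label_counts, value_counts

def posterior(record, label_counts, value_counts, alpha=1.0, n_values=2):
    """Formula (7), with P(X_1..X_n | Y') factored as prod_i P(X_i | Y').

    `alpha` is add-alpha smoothing and `n_values` the assumed number of
    distinct values per feature; both are illustrative assumptions.
    """
    total = sum(label_counts.values())
    scores = {}
    for y, cy in label_counts.items():
        s = cy / total  # P(Y')
        for k, v in record.items():
            s *= (value_counts[(k, y)][v] + alpha) / (cy + alpha * n_values)
        scores[y] = s  # P(X_1..X_n | Y') * P(Y')
    evidence = sum(scores.values())  # P(X_1..X_n)
    return {y: s / evidence for y, s in scores.items()}
```

A transaction would then be flagged when the posterior of the fraud label exceeds a chosen decision threshold; how that threshold is set is left open here.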
S5: verifying the prediction result.
Step S5 further comprises the steps of:
S51: counting, from the prediction result of the model of formula (7), the number TP of positive samples judged positive, the number FP of negative samples judged positive, the number FN of positive samples judged negative, and the number TN of negative samples judged negative;
S52: calculating the precision according to formula (8):
$$\mathrm{precision}=\frac{TP}{TP+FP}\tag{8}$$
calculating the recall according to formula (9):
$$\mathrm{recall}=\frac{TP}{TP+FN}\tag{9}$$
and calculating the disturbance rate disturb according to formula (10):
$$\mathrm{disturb}=\frac{FP}{FP+TN}\tag{10}$$
S53: evaluating the prediction result according to the precision, the recall and the disturbance rate.
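The evaluation step can be sketched as follows. Precision and recall use their standard definitions; the disturbance rate is taken here to be the false-positive rate FP / (FP + TN), which is an assumption based on how the metric is paired with the true positive rate in this document. The function names and the convention that label 1 marks fraud are also assumptions.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for binary labels, with `positive` as the fraud class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision(tp, fp):
    """Share of flagged transactions that are truly fraudulent."""
    return tp / (tp + fp) if tp + fp else 0.0

def recall(tp, fn):
    """Share of fraudulent transactions that were flagged (interception rate)."""
    return tp / (tp + fn) if tp + fn else 0.0

def disturb(fp, tn):
    """Share of normal transactions wrongly flagged (assumed to be FP / (FP + TN))."""
    return fp / (fp + tn) if fp + tn else 0.0
```

The zero-denominator guards return 0.0 so that a batch with no flagged or no positive samples does not raise an error; a caller may prefer to treat those cases as undefined instead.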
For example, the method was verified on a real online banking transaction data set to obtain the recall (interception rate, True Positive Rate) when the disturbance rate (disturb) is below 1%, 0.5%, 0.1% and 0.05%, and the performance of the method was evaluated on that basis. On these metrics and in computation time, the method of this embodiment is better than previous research, and it has good robustness.
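Reporting recall at a fixed disturbance rate can be done by sweeping the decision threshold over the model's fraud scores. The sketch below is one way to do this, assuming that a higher score means a more suspicious transaction, that label 1 marks fraud, and that the disturbance rate is the fraction of normal transactions flagged; none of these conventions is spelled out in the source.

```python
def recall_at_disturb(scores, labels, max_disturb):
    """Highest recall reachable while keeping FP / negatives <= max_disturb.

    Transactions with score >= threshold are flagged as fraud (label 1).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best = 0.0
    # Lowering the threshold can only add flagged transactions, so the
    # sweep stops at the first threshold that exceeds the budget.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / negatives > max_disturb:
            break
        best = max(best, tp / positives)
    return best
```

Calling the function with `max_disturb` set to 0.01, 0.005, 0.001 and 0.0005 gives the four operating points mentioned above.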
Referring to the probability graph model in Fig. 2, in actual use the method of this embodiment characterizes the joint probability model between the consumption patterns of different users and the different features. First, when different users open a bank card, the card may serve a fixed purpose (for example, being used specifically for stock trading or as a salary card), so bank cards with different purposes may show different signature verification methods. If a user's bank card is used for a particular kind of transaction (such as stock trading), the transaction time of the card shows a relatively fixed pattern (for example, related to the opening and closing times of the stock market); the transaction amount of the card shows relatively high correlation (related to stock prices); the merchants trading with the card also show relatively high correlation (such as certain specific companies); and whether a commonly used IP appears during the transaction reflects the fixed distribution formed by the places where the user trades. The differing consumption habits of users make up the behavior distributions of different users, and once a behavior pattern that does not match previous transactions appears, the transaction is very likely to be judged fraudulent.
This is the interpretable logic. Compared with the black box of a traditional deep learning model, the method of this embodiment combines banking domain knowledge with assumptions about similar users to build a probability graph model that describes the distribution of user behavior, and the model has very good interpretable logic.
In addition, using a probability graph model as the prediction model makes it possible to handle hidden variables better. This is because a probability graph model can take a conventional prior assumption from domain knowledge: when the model itself has unobserved variables, Bayesian estimation can give a reasonable estimate through a state-space model, which makes the method more robust.
In the identity theft detection method based on a probability graph model according to the embodiment of the present invention, the Bayesian probability graph model tends to be highly interpretable and convincing when making predictions on data. The probability graph model is trained on the training set to obtain the conditional probability parameters; when predicting on the test set, the prior probabilities and the conditions in the test set are used to obtain the conditional probabilities and finally derive the posterior probability, so the result is highly convincing. The probability graph model can also handle cases in which hidden variables exist, which existing methods based on discriminative models cannot do. The method of this embodiment therefore has advantages that existing discriminative models lack. It overcomes the shortcomings of traditional fraud detection methods based on deep learning, improves the interpretability of the model, and provides a better guarantee for detecting and intercepting fraudulent transactions and for protecting the funds of users and enterprises.
The present invention has been described in detail above with reference to the accompanying drawings and embodiments, and those skilled in the art can make many variations to the invention on the basis of the above description. Accordingly, certain details in the embodiments should not be construed as limiting the invention, and the scope of protection of the invention is the scope defined by the appended claims.
Claims (7)
1. An identity theft detection method based on a probability graph model, comprising the steps of:
S1: collecting and preprocessing network payment transaction data to obtain a network payment transaction feature set;
S2: building a probability graph model from the network payment transaction feature set;
S3: inputting a training set and training the parameters of the probability graph model, while obtaining the conditional probability parameters of the probability graph model using Bayes' theorem;
S4: predicting an input prediction set using the conditional probability parameters and Bayes' theorem to obtain a prediction result.
2. The identity theft detection method based on a probability graph model according to claim 1, characterized in that step S1 further comprises the steps of:
S11: a data cleaning step: filling in missing values, smoothing noise, and identifying and resolving inconsistencies in the network payment transaction data, so as to format the data, remove abnormal data, correct errors, and remove duplicate data;
S12: a data integration step: storing the network payment transaction data from multiple data sources together to form a database;
S13: standardizing the network payment transaction data in the database to form the network payment transaction feature set.
3. The identity theft detection method based on a probability graph model according to claim 2, characterized in that step S2 further comprises the steps of:
S21: obtaining the network payment transaction feature set $\theta$, and inputting a candidate feature set $\theta'$, a relation set $R$, a label attribute $Y$ and a threshold $\lambda$, where $\theta' = \Phi$, $R = \Phi$, and $\Phi$ denotes the empty set;
S22: calculating the mutual information $I$ between a feature $X_i$ of the network payment transaction feature set $\theta$ and the label attribute $Y$ according to formula (1):
$$I(X_i;Y)=\sum_{x\in X_i}\sum_{y\in Y}p(x,y)\log\frac{p(x,y)}{p(x)\,p(y)}\tag{1}$$
where $X_i$ denotes the $i$-th feature; $i$ is a natural number greater than or equal to 1; $Y$ denotes the label attribute; $x$ denotes a value of $X_i$; $y$ denotes a value of $Y$; $p(x,y)$ is the joint probability of $x$ and $y$; $p(x)$ is the marginal probability of $x$; $p(y)$ is the marginal probability of $y$; and $I(X_i;Y)$ denotes the mutual information between $X_i$ and $Y$;
S22: judging whether $I(X_i;Y)$ is greater than or equal to the preset threshold $\lambda$; if so, continuing with the subsequent steps;
S23: updating the candidate feature set $\theta'$ according to formula (2):
$$\theta' := \theta' + X_i\tag{2}$$
and updating the network payment transaction feature set $\theta$ according to formula (3):
$$\theta := \theta - X_i\tag{3}$$
S24: obtaining the dependency $r$, $r: X_i \to Y$;
S25: updating the relation set $R$ according to formula (4):
$$R := R + r\tag{4}$$
S26: judging whether the number of features in the current candidate feature set $\theta'$ is greater than or equal to 2; if so, continuing with the subsequent steps, otherwise returning to step S23;
S27: calculating the mutual information between each pair of features in the current candidate feature set $\theta'$ according to formula (5):
$$I(X_i;X_j)=\sum_{x\in X_i}\sum_{x'\in X_j}p(x,x')\log\frac{p(x,x')}{p(x)\,p(x')}\tag{5}$$
where $X_i$ denotes the $i$-th feature in $\theta'$ and $X_j$ denotes the $j$-th feature in $\theta'$; $i$ and $j$ are natural numbers greater than or equal to 1; $x$ denotes a value of $X_i$; $x'$ denotes a value of $X_j$; $p(x,x')$ is the joint probability of $x$ and $x'$; $p(x)$ is the marginal probability of $x$; $p(x')$ is the marginal probability of $x'$; and $I(X_i;X_j)$ denotes the mutual information between $X_i$ and $X_j$; and updating the relation set $R$ by formula (4);
S28: assigning the current $\theta'$ to $\theta$ and emptying the set $\theta'$; calculating the mutual information between each pair of features in the set $\theta$ by formula (5); if $I(X_i;X_j) \ge \lambda$, determining the dependency $r$ between the two features according to prior knowledge, and then updating the relation set $R$ by formula (4);
S29: repeating step S28 until $\theta$ is empty or $I(X_i;X_j) \le \lambda$ for all features, and then obtaining the probability graph model from the current relation set $R$.
4. the identity theft detection method according to claim 3 based on probability graph model, which is characterized in that the S3 step
Suddenly further comprise step:
S31: one training set of input, the training set includes characteristic attribute and tag attributes;
S32: compute the conditional probability parameter according to formula (6):
$$p_{train}(A_i \mid B) = \frac{p(A_i)\,p(B \mid A_i)}{\sum_{j} p(A_j)\,p(B \mid A_j)} \tag{6}$$
where A_i denotes the i-th parent node of the probability graph model; B denotes a child node of A_i; p_train(A_i | B) denotes the conditional probability parameter between A_i and B; p(A_i) is the marginal probability of A_i; p(B | A_i) is the probability that B occurs given A_i; A_j denotes the j-th parent node; p(A_j) is the marginal probability of A_j; p(B | A_j) is the probability that B occurs given A_j;
S33: determine whether formula (6) has converged; if so, continue with the subsequent steps; otherwise, return to step S31.
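Read as Bayes' rule over the parent nodes, the parameter computed in step S32 can be sketched as below. This assumes formula (6) has the standard form in which each product p(A_i) p(B | A_i) is normalised by the sum of those products over all parents; the function name and the numbers are hypothetical.

```python
def parent_posteriors(priors, likelihoods):
    """Return p(A_i | B) for each parent, given p(A_i) and p(B | A_i)."""
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    # Numerators p(A_i) * p(B | A_i), one per parent node.
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)  # denominator: sum over j of p(A_j) * p(B | A_j)
    if total == 0:
        raise ValueError("B has zero probability under every parent")
    return [w / total for w in weighted]


# Hypothetical marginals p(A_i) and conditionals p(B | A_i) for three parents.
posteriors = parent_posteriors([0.5, 0.3, 0.2], [0.1, 0.4, 0.5])
print(posteriors)
```

The returned values sum to 1 by construction, which is a quick sanity check when estimating the inputs from a training set.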
5. The identity theft detection method based on a probability graph model according to claim 4, characterized in that step S4 further comprises the steps of:
S41: input a test set, the test set including a feature attribute Y';
S42: compute a posterior probability according to formula (7), and output the prediction result according to the posterior probability:
$$p(Y' \mid X_1, \ldots, X_n) = \frac{p(X_1, \ldots, X_n \mid Y')\,p(Y')}{p(X_1, \ldots, X_n)} \tag{7}$$
where p(Y' | X_1, …, X_n) is the probability that Y' occurs given X_1, …, X_n; p(X_1, …, X_n | Y') is the joint probability of X_1, …, X_n given Y'; p(Y') is the marginal probability of Y'; p(X_1, …, X_n) is the joint probability of X_1, …, X_n.
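A posterior of the kind described in step S42 can be illustrated with plain frequency counts. This is only a sketch under simplifying assumptions: it estimates each term of Bayes' rule directly from a small labelled sample instead of factorising the joint probability through the learned graph, and the function name, records and labels are hypothetical.

```python
def posterior(records, labels, query, label):
    """Estimate p(label | query) as p(query | label) * p(label) / p(query)."""
    n = len(records)
    n_label = sum(1 for y in labels if y == label)
    n_query = sum(1 for x in records if x == query)
    n_both = sum(1 for x, y in zip(records, labels) if x == query and y == label)
    if n_label == 0 or n_query == 0:
        raise ValueError("label or query never observed in the sample")
    likelihood = n_both / n_label  # p(X_1..X_n | Y')
    prior = n_label / n            # p(Y')
    evidence = n_query / n         # p(X_1..X_n)
    return likelihood * prior / evidence


# Hypothetical sample: each record is a tuple of feature values, label 1 = theft.
records = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1), (1, 0)]
labels = [1, 1, 0, 0, 0, 0]
p_theft = posterior(records, labels, query=(1, 0), label=1)
print(p_theft)
```

In this sample the feature combination (1, 0) appears four times, twice with label 1, so the estimated posterior is one half.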
6. The identity theft detection method based on a probability graph model according to claim 5, characterized in that the following step is further included after step S4:
S5: verify the prediction result.
7. The identity theft detection method based on a probability graph model according to claim 6, characterized in that step S5 further comprises the steps of:
S51: from the prediction result, count for the model of formula (7) the number TP of positive-class samples determined to be positive, the number FP of negative-class samples determined to be positive, the number FN of positive-class samples determined to be negative, and the number TN of negative-class samples determined to be negative;
S52: compute a precision according to formula (8):
$$\mathrm{precision} = \frac{TP}{TP + FP} \tag{8}$$
compute a recall according to formula (9):
$$\mathrm{recall} = \frac{TP}{TP + FN} \tag{9}$$
compute a disturbance rate disturb according to formula (10):
S53: evaluate the prediction result according to the precision, the recall and the disturbance rate.
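The three evaluation figures of step S52 can be sketched from the four counts of step S51. Precision and recall follow their standard definitions. The disturbance rate is an assumption here: formula (10) is not reproduced in the text, so the sketch takes it to be the share of negative-class (legitimate) samples flagged as positive, FP / (FP + TN). The function name and counts are hypothetical.

```python
def evaluate(tp, fp, fn, tn):
    """Return precision, recall and an assumed disturbance rate from the four counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Assumption: disturbance rate = FP / (FP + TN); the source does not show formula (10).
    disturb = fp / (fp + tn) if (fp + tn) else 0.0
    return {"precision": precision, "recall": recall, "disturb": disturb}


# Hypothetical counts from a verification run.
metrics = evaluate(tp=80, fp=20, fn=10, tn=890)
print(metrics)
```

A low disturbance rate alongside high recall would indicate that few legitimate users are interrupted while most thefts are still caught.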
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910148549.8A CN109993538A (en) | 2019-02-28 | 2019-02-28 | Identity theft detection method based on probability graph model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109993538A (en) | 2019-07-09 |
Family
ID=67130436
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910148549.8A Pending CN109993538A (en) | 2019-02-28 | 2019-02-28 | Identity theft detection method based on probability graph model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109993538A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150134512A1 (en) * | 2013-11-13 | 2015-05-14 | Mastercard International Incorporated | System and method for detecting fraudulent network events |
CN106910071A (en) * | 2017-01-11 | 2017-06-30 | 中国建设银行股份有限公司 | The verification method and device of user identity |
CN107615326A (en) * | 2015-01-20 | 2018-01-19 | 口袋医生公司 | Use the healthy balance system and method for probability graph model |
CN108376300A (en) * | 2018-03-02 | 2018-08-07 | 江苏电力信息技术有限公司 | A kind of user power utilization behavior prediction method based on probability graph model |
CN108492173A (en) * | 2018-03-23 | 2018-09-04 | 上海氪信信息技术有限公司 | A kind of anti-Fraud Prediction method of credit card based on dual-mode network figure mining algorithm |
CN109360099A (en) * | 2018-10-22 | 2019-02-19 | 广东工业大学 | A kind of anti-fraud method of finance based on k- nearest neighbor algorithm |
Non-Patent Citations (1)
Title |
---|
柴洪峰等: ""基于数据挖掘的异常交易检测方法"", 《计算机应用与软件》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111046957A (en) * | 2019-12-13 | 2020-04-21 | 支付宝(杭州)信息技术有限公司 | Model embezzlement detection method, model training method and device |
CN111046957B (en) * | 2019-12-13 | 2021-03-16 | 支付宝(杭州)信息技术有限公司 | Model embezzlement detection method, model training method and device |
CN111800389A (en) * | 2020-06-09 | 2020-10-20 | 同济大学 | Port network intrusion detection method based on Bayesian network |
CN111860647A (en) * | 2020-07-21 | 2020-10-30 | 金陵科技学院 | Abnormal consumption mode judgment method |
CN111860647B (en) * | 2020-07-21 | 2023-11-10 | 金陵科技学院 | Abnormal consumption mode judging method |
CN112153221A (en) * | 2020-09-16 | 2020-12-29 | 北京邮电大学 | Communication behavior identification method based on social network diagram calculation |
CN112153221B (en) * | 2020-09-16 | 2021-06-29 | 北京邮电大学 | Communication behavior identification method based on social network diagram calculation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190709 |