A risk prevention and control processing method, apparatus, and device
Technical field
This specification relates to the field of computer technology, and in particular to a risk prevention and control processing method, apparatus, and device.
Background
As network technology and terminal technology become more widespread, the risks present in network transactions are also increasing. Although business systems such as network transaction systems have risk control rules, network transaction risk has not decreased as a result, and risk control in business systems still faces a huge challenge.
A business system usually has a risk control system based on risk data, implemented by continually adjusting business rules manually. Strategy operation is an indispensable link in the risk control system and is also the main way of countering external risk. At present, strategy operation is implemented through manual strategies: after a risk transaction is found, the transaction is reported or complained about to the risk control system; the features of the risk transaction data are extracted through manual analysis; an offline data assessment of the risk transaction data is carried out manually; and finally the corresponding risk prevention and control rule is configured manually.
However, when risk prevention and control rules are obtained manually, people take on too much repetitive, mechanical work, which wastes a large amount of human and material resources and consumes a great deal of time. As the pace of risk attack and defense accelerates, risk prevention and control carried out in this way cannot meet user demand. Therefore, in the field of prevention and control of fraud and other risks, a more timely and more responsive solution is needed.
Summary of the invention
The purpose of the embodiments of this specification is to provide a risk prevention and control processing method, apparatus, and device, so as to provide a more timely and more responsive risk prevention and control solution.
To achieve the above technical solution, the embodiments of this specification are implemented as follows:
A risk prevention and control processing method provided by an embodiment of this specification includes:
Obtaining recent streaming risk transaction data of a target service as black sample stream data, and obtaining streaming transaction data of the target service without transaction risk as white sample stream data;
Determining the number of times the black sample stream data and the white sample stream data are repeatedly sampled;
Establishing a first decision tree model according to the black sample stream data, the white sample stream data, and the number of times the black sample stream data and the white sample stream data are repeatedly sampled;
Based on the black sample stream data and/or the white sample stream data, using the first decision tree model to update the decision tree model corresponding to the current risk prevention and control rule of the target service.
Optionally, after the decision tree model corresponding to the current risk prevention and control rule of the target service is updated using the first decision tree model, the method further includes:
Updating the current risk prevention and control rule of the target service based on the updated decision tree model.
Optionally, the black sample stream data includes a label indicating that the streaming transaction data is streaming risk transaction data,
The obtaining of recent streaming risk transaction data of the target service as black sample stream data, and the obtaining of streaming transaction data of the target service without transaction risk as white sample stream data, includes:
Obtaining the feature variables corresponding to the streaming transaction data of the target service;
Obtaining, from the feature variables, the feature variables corresponding to streaming transaction data without transaction risk as the white sample stream data;
Matching the information of the label against the feature variables, and taking the matched feature variables as the black sample stream data.
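As a non-limiting illustration, the label matching described above might be sketched as follows in Python. The transaction identifier used as the matching key, the dictionary layout, and the function name `split_black_white` are assumptions made for this sketch and are not taken from the specification.

```python
from typing import Dict, List, Set, Tuple

Features = Dict[str, float]


def split_black_white(
    features_by_tx: Dict[str, Features],
    risk_labels: Set[str],
) -> Tuple[List[Features], List[Features]]:
    """Split feature variables into black and white samples.

    features_by_tx maps a (hypothetical) transaction id to its feature variables;
    risk_labels holds the ids that carry a risk-transaction label.
    """
    black: List[Features] = []
    white: List[Features] = []
    for tx_id, feats in features_by_tx.items():
        if tx_id in risk_labels:
            black.append(feats)  # label matched: treat as black sample
        else:
            white.append(feats)  # no risk label: treat as white sample
    return black, white
```

In a real stream the labels typically arrive later than the feature variables, so the matching would be run over a buffered window rather than a complete dictionary.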
Optionally, the method further includes:
Performing delay processing on the black sample stream data and the white sample stream data, to determine the quantity ratio of the black sample stream data to the white sample stream data.
Optionally, the method further includes:
Performing undersampling on the white sample stream data, to determine the quantity proportion of the white sample stream data.
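One possible way to undersample a white sample stream is to keep each item with a fixed probability. The following Python sketch assumes that approach; the target of a fixed number of white samples per black sample and the helper names are illustrative assumptions, not details given in the specification.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def keep_probability(n_black: int, n_white: int, whites_per_black: float) -> float:
    """Probability of keeping one white sample so that roughly
    whites_per_black white samples remain for every black sample."""
    if n_white <= 0:
        return 1.0
    return min(1.0, n_black * whites_per_black / n_white)


def undersample_stream(
    white_stream: Iterable[T], keep_prob: float, rng: random.Random
) -> Iterator[T]:
    """Yield each white sample independently with probability keep_prob."""
    if not 0.0 <= keep_prob <= 1.0:
        raise ValueError("keep_prob must lie in [0, 1]")
    for sample in white_stream:
        if rng.random() < keep_prob:
            yield sample
```

Because each item is decided independently as it arrives, the stream never needs to be held in memory.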
Optionally, determining the number of times the black sample stream data and the white sample stream data are repeatedly sampled includes:
Assigning a Poisson-distributed random number to each item of black sample stream data and each item of white sample stream data;
Determining, according to the value of the Poisson-distributed random number, the number of times the black sample stream data and the white sample stream data are repeatedly sampled.
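A minimal Python sketch of assigning Poisson-distributed repeat counts is given below. The rate parameter of 1.0 is an assumption borrowed from common online bagging practice; the specification does not state a rate. The sampler uses Knuth's multiplication method so that only the standard library is needed.

```python
import math
import random
from typing import List


def poisson_draw(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def assign_resample_counts(
    n_samples: int, rng: random.Random, lam: float = 1.0
) -> List[int]:
    """Assign each arriving sample the number of times it is repeatedly sampled."""
    return [poisson_draw(lam, rng) for _ in range(n_samples)]
```

A count of zero means the sample is skipped for that tree, which mimics the items left out of an ordinary bootstrap sample.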
Optionally, establishing the first decision tree model according to the black sample stream data, the white sample stream data, and the number of times the black sample stream data and the white sample stream data are repeatedly sampled includes:
Selecting first black sample stream data and first white sample stream data from the black sample stream data and the white sample stream data respectively;
Establishing the first decision tree model according to the first black sample stream data, the first white sample stream data, and the number of times the first black sample stream data and the first white sample stream data are repeatedly sampled.
Optionally, establishing the first decision tree model according to the first black sample stream data, the first white sample stream data, and the number of times the first black sample stream data and the first white sample stream data are repeatedly sampled includes:
Creating a decision tree model, and creating inequality conditions at a node to be split in the decision tree model;
Pre-splitting the first black sample stream data and the first white sample stream data into child nodes according to the inequality conditions;
Calculating the gain of the inequality condition corresponding to each pre-split into child nodes;
Splitting the node to be split into child nodes according to the gain and the corresponding inequality condition, to obtain the child nodes after splitting;
Establishing the first decision tree model based on the child nodes after splitting and the number of times the first black sample stream data and the first white sample stream data are repeatedly sampled.
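The pre-split and gain calculation at a single node might look like the following Python sketch. The specification does not name the gain measure, so the use of Gini impurity here is an assumption, as are the tuple layout and function names. Each sample is weighted by the number of times it is repeatedly sampled.

```python
from typing import Iterable, List, Optional, Tuple

# (feature value, label: 1 = black / 0 = white, repeat-sampling count)
Sample = Tuple[float, int, int]


def gini(black_weight: float, white_weight: float) -> float:
    """Gini impurity of a two-class node given class weights."""
    total = black_weight + white_weight
    if total == 0:
        return 0.0
    p = black_weight / total
    return 2.0 * p * (1.0 - p)


def split_gain(samples: List[Sample], threshold: float) -> float:
    """Gain of pre-splitting on the inequality `value <= threshold`."""
    left = [0.0, 0.0]   # [white weight, black weight]
    right = [0.0, 0.0]
    for value, label, count in samples:
        side = left if value <= threshold else right
        side[label] += count
    n_left, n_right = sum(left), sum(right)
    total = n_left + n_right
    if total == 0:
        return 0.0
    parent = gini(left[1] + right[1], left[0] + right[0])
    children = (n_left / total) * gini(left[1], left[0]) \
        + (n_right / total) * gini(right[1], right[0])
    return parent - children


def best_threshold(
    samples: List[Sample], candidates: Iterable[float]
) -> Tuple[Optional[float], float]:
    """Pick the candidate inequality with the largest positive gain."""
    best, best_gain = None, 0.0
    for t in candidates:
        g = split_gain(samples, t)
        if g > best_gain:
            best, best_gain = t, g
    return best, best_gain
```

If no candidate yields a positive gain, `best_threshold` returns `None` and the node would be left unsplit.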
Optionally, using the first decision tree model, based on the black sample stream data and/or the white sample stream data, to update the decision tree model corresponding to the current risk prevention and control rule of the target service includes:
Selecting part of the data from the black sample stream data and/or the white sample stream data as evaluation sample data;
Evaluating, according to the evaluation sample data, the first decision tree model and the decision tree model corresponding to the current risk prevention and control rule of the target service;
If the first decision tree model meets a predetermined evaluation condition, and the decision tree models corresponding to the current risk prevention and control rule of the target service include a second decision tree model that does not meet the predetermined evaluation condition, replacing the second decision tree model with the first decision tree model.
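The evaluation and replacement step can be sketched in Python as follows. Treating the predetermined evaluation condition as a minimum accuracy on the evaluation samples, and replacing the weakest failing tree when several fail, are assumptions of this sketch; the models are represented as plain callables for brevity.

```python
from typing import Callable, List, Sequence, Tuple

Model = Callable[[Sequence[float]], int]       # returns 1 (risk) or 0 (no risk)
EvalSet = List[Tuple[Sequence[float], int]]    # (feature variables, label)


def accuracy(model: Model, eval_set: EvalSet) -> float:
    """Share of evaluation samples the model classifies correctly."""
    if not eval_set:
        return 0.0
    hits = sum(1 for features, label in eval_set if model(features) == label)
    return hits / len(eval_set)


def update_forest(
    current: List[Model], candidate: Model, eval_set: EvalSet, min_accuracy: float
) -> List[Model]:
    """Replace a failing current tree with the candidate if the candidate passes."""
    if accuracy(candidate, eval_set) < min_accuracy:
        return list(current)                   # candidate fails: nothing changes
    scores = [(accuracy(m, eval_set), i) for i, m in enumerate(current)]
    failing = [(score, i) for score, i in scores if score < min_accuracy]
    if not failing:
        return list(current)                   # no current tree fails
    _, worst = min(failing)                    # weakest failing tree
    updated = list(current)
    updated[worst] = candidate
    return updated
```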
A risk prevention and control processing apparatus provided by an embodiment of this specification includes:
A sample acquisition module, configured to obtain recent streaming risk transaction data of a target service as black sample stream data, and to obtain streaming transaction data of the target service without transaction risk as white sample stream data;
A sampling count determination module, configured to determine the number of times the black sample stream data and the white sample stream data are repeatedly sampled;
A model establishment module, configured to establish a first decision tree model according to the black sample stream data, the white sample stream data, and the number of times the black sample stream data and the white sample stream data are repeatedly sampled;
A model update module, configured to use the first decision tree model, based on the black sample stream data and/or the white sample stream data, to update the decision tree model corresponding to the current risk prevention and control rule of the target service.
Optionally, the apparatus further includes:
A rule update module, configured to update the current risk prevention and control rule of the target service based on the updated decision tree model.
Optionally, the black sample stream data includes a label indicating that the streaming transaction data is streaming risk transaction data,
The sample acquisition module includes:
A feature acquisition unit, configured to obtain the feature variables corresponding to the streaming transaction data of the target service;
A first selection unit, configured to obtain, from the feature variables, the feature variables corresponding to streaming transaction data without transaction risk as the white sample stream data;
A matching unit, configured to match the information of the label against the feature variables, and to take the matched feature variables as the black sample stream data.
Optionally, the apparatus further includes:
A delay module, configured to perform delay processing on the black sample stream data and the white sample stream data, to determine the quantity ratio of the black sample stream data to the white sample stream data.
Optionally, the apparatus further includes:
An undersampling module, configured to perform undersampling on the white sample stream data, to determine the quantity proportion of the white sample stream data.
Optionally, the sampling count determination module includes:
An allocation unit, configured to assign a Poisson-distributed random number to each item of black sample stream data and each item of white sample stream data;
A sampling count determination unit, configured to determine, according to the value of the Poisson-distributed random number, the number of times the black sample stream data and the white sample stream data are repeatedly sampled.
Optionally, the model establishment module includes:
A second selection unit, configured to select first black sample stream data and first white sample stream data from the black sample stream data and the white sample stream data respectively;
A model establishment unit, configured to establish the first decision tree model according to the first black sample stream data, the first white sample stream data, and the number of times the first black sample stream data and the first white sample stream data are repeatedly sampled.
Optionally, the model establishment unit is configured to:
Create a decision tree model, and create inequality conditions at a node to be split in the decision tree model;
Pre-split the first black sample stream data and the first white sample stream data into child nodes according to the inequality conditions;
Calculate the gain of the inequality condition corresponding to each pre-split into child nodes;
Split the node to be split into child nodes according to the gain and the corresponding inequality condition, to obtain the child nodes after splitting;
Establish the first decision tree model based on the child nodes after splitting and the number of times the first black sample stream data and the first white sample stream data are repeatedly sampled.
Optionally, the model update module includes:
A third selection unit, configured to select part of the data from the black sample stream data and/or the white sample stream data as evaluation sample data;
An evaluation unit, configured to evaluate, according to the evaluation sample data, the first decision tree model and the decision tree model corresponding to the current risk prevention and control rule of the target service;
A model update unit, configured to replace a second decision tree model with the first decision tree model if the first decision tree model meets a predetermined evaluation condition and the decision tree models corresponding to the current risk prevention and control rule of the target service include the second decision tree model, which does not meet the predetermined evaluation condition.
A risk prevention and control processing device provided by an embodiment of this specification includes:
A processor; and
A memory arranged to store computer-executable instructions which, when executed, cause the processor to:
Obtain recent streaming risk transaction data of a target service as black sample stream data, and obtain streaming transaction data of the target service without transaction risk as white sample stream data;
Determine the number of times the black sample stream data and the white sample stream data are repeatedly sampled;
Establish a first decision tree model according to the black sample stream data, the white sample stream data, and the number of times the black sample stream data and the white sample stream data are repeatedly sampled;
Based on the black sample stream data and/or the white sample stream data, use the first decision tree model to update the decision tree model corresponding to the current risk prevention and control rule of the target service.
As can be seen from the technical solutions provided by the above embodiments of this specification, recent streaming risk transaction data of the target service is obtained as black sample stream data, and streaming transaction data of the target service without transaction risk is obtained as white sample stream data. The number of times the black sample stream data and the white sample stream data are repeatedly sampled is then determined, and a first decision tree model can be established according to the black sample stream data, the white sample stream data, and those sampling counts. Finally, based on the black sample stream data and/or the white sample stream data, the first decision tree model can be used to update the decision tree model corresponding to the current risk prevention and control rule of the target service. In this way, the first decision tree model is generated from the recently obtained black and white sample stream data and is used to update, in real time, the decision tree model corresponding to the current rule. Rules for preventing and controlling fraud and other risks can thus be updated without manual participation, and because the update is driven by recent data, subsequent fraud and other risks can be responded to quickly. This can greatly reduce the financial loss caused by new risks, improve the efficiency with which such rules are generated, and in turn improve the security of the target service.
Brief description of the drawings
To explain the embodiments of this specification or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this specification, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is an embodiment of a risk prevention and control processing method of this specification;
Fig. 2 is a schematic diagram of a decision tree model of this specification;
Fig. 3 is another embodiment of a risk prevention and control processing method of this specification;
Fig. 4 is an embodiment of a risk prevention and control processing apparatus of this specification;
Fig. 5 is an embodiment of a risk prevention and control processing device of this specification.
Detailed description
The embodiments of this specification provide a risk prevention and control processing method, apparatus, and device.
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.
Embodiment one
As shown in Fig. 1, an embodiment of this specification provides a risk prevention and control processing method. The method may be executed by a terminal device or a server. The terminal device may be a device such as a personal computer, or a mobile terminal device such as a mobile phone or tablet computer, and may be a terminal device used by a user. The server may be a standalone server or a server cluster composed of multiple servers, and may be the background server of a particular service or the background server of a website or application (such as a payment application). The method can be used in processing such as updating the business rules in a business system. To improve efficiency, this embodiment is described taking a server as the executing entity; the case of a terminal device can be handled according to the related content below and is not repeated here. The method may specifically include the following steps:
In step S102, recent streaming risk transaction data of the target service is obtained as black sample stream data, and streaming transaction data of the target service without transaction risk is obtained as white sample stream data.
The target service may be any service, such as an online payment service or an online shopping service. Streaming risk transaction data may be streaming transaction data with transaction risk that is obtained at some moment during a network transaction, where transaction risk may include causing loss to a user's resources (such as the user's funds) or causing loss to the target service. Recent streaming risk transaction data may be the streaming risk transaction data within a predetermined duration before the current time, for example within the 24 hours or the 5 days before the current time.
In implementation, to ensure the security of data in a business system, possible risk transactions in the business system can be intercepted by means of a blacklist or whitelist. As network technology keeps advancing, network viruses and network Trojan programs are increasingly numerous, and a blacklist or whitelist can hardly cover in time all possible network viruses, network Trojan programs, or other risky transactions. Thus, in practice, a user may encounter transaction risks that the blacklist or whitelist does not cover. In that case the user needs to decide whether to proceed with the transaction of the target service. If the user determines that the transaction is risky, the user can terminate it, and can store the relevant information of the transaction or send it to the business system of the target service. If the user has carried out the transaction and determines only after completing it that it was a risk transaction, the user can likewise store the relevant information of the transaction or send it to the business system of the target service.
A business system generally has a risk control system based on risk data, implemented by continually adjusting business rules manually. Strategy operation is an indispensable link in the risk control system and is also the main way of countering external risk. At present, strategy operation is based on manual strategies, and the processing flow may be as follows: after a transaction risk or the risk transaction data of a risk transaction is found, the transaction is reported or complained about to the risk control system; the reported or complained transaction is analyzed manually to characterize it; the features of the risk transaction data can then be extracted manually; and after an offline data assessment of the risk transaction data is carried out manually, a corresponding risk control strategy can be configured to supplement or replace the risk prevention and control rules in the risk control system. The risk prevention and control rule can then be run on a trial basis, and if the result meets expectations, the rule can be added to the business system of the target service to prevent and control risk transactions.
However, when risk prevention and control rules are obtained manually, people take on too much repetitive, mechanical work, which not only costs time but also wastes a large amount of human and material resources. Moreover, even counting only from the start of manual analysis of the transaction data, the whole flow takes a long time, generally at least a week. Taking a new fraud technique or risk transaction as an example, a week is enough to cause substantial financial loss and may harm thousands of users. At present the risk prevention and control situation is severe, especially in fraud scenarios: because the amounts of money involved are very high and the profit margin of the fraud "black industry" is very large, new fraud methods and tools emerge endlessly, and the pace of attack and defense in fraud security changes quickly. Under the risk control system architecture described above, manual analysis of fraudulent transaction data takes time, the response of the risk prevention and control model takes time, and deployment of the risk control strategy also takes time, so the response to fraud and other risks is not timely enough and a certain amount of financial loss results.
Therefore, in the field of prevention and control of fraud and other risks, a more timely and more responsive solution is needed. To this end, the embodiments of this specification provide a risk prevention and control processing approach based on the latest streaming data, which may specifically include the following:
For a given service (the target service), there may in practice be risk transactions that were missed or have newly arisen, and the related data of these risk transactions is not intercepted by the business rules currently used by the target service, so that the user's resources may suffer loss. To improve the security of the target service, its risk prevention and control rules can be supplemented or updated in time. To this end, streaming risk transaction data that is related to the target service and carries transaction risk can be obtained in several ways. For example, if a user finds a risky transaction while using the target service, the user can report it or complain about it to the corresponding business system in real time, and the business system can receive the streaming data of the risk transaction at the time of the report or complaint. In practice, risk transaction data can also be obtained in other ways; for example, the operator of the target service can obtain the streaming data of risky transactions that users have stored or collected by purchasing it from users or through exchange. Because streaming risk transaction data requires active feedback from users, a certain amount of time must pass before it is available. In practice, the streaming risk transaction data within a predetermined duration before the current time can be obtained. For example, if the current time is 10 o'clock, the streaming risk transaction data for the 24 hours from 10 o'clock yesterday to the current time, or for a longer period, can be obtained.
The risk transactions reported or complained about by users can be reviewed to determine which streaming data actually carries transaction risk. Once risk transactions have been characterized in this way, the streaming risk transaction data of the target service is obtained and can be used as the black sample stream data in the streaming sample data. In addition, to make the subsequently obtained risk prevention and control rules more accurate, the streaming sample data can also include a certain amount of white sample stream data. The selection window for white sample stream data may be the same as that for black sample stream data or different from it. Moreover, it may be longer than that for black sample stream data, for example by selecting the streaming transaction data without transaction risk (that is, transactions that were not complained about or reported) from before a certain number of days. This helps ensure that as little black sample stream data as possible is mixed into the white sample stream data.
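The selection windows described above can be illustrated with a short Python sketch. The 24-hour window for black samples follows the example in the text; the 7-day hold-off for white samples, the record fields `time` and `reported`, and the function name are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Tx = Dict[str, object]


def select_samples(
    transactions: List[Tx],
    now: datetime,
    black_window: timedelta = timedelta(hours=24),
    white_holdoff: timedelta = timedelta(days=7),
) -> Tuple[List[Tx], List[Tx]]:
    """Pick recent reported transactions as black samples and older,
    never-reported transactions as white samples."""
    black: List[Tx] = []
    white: List[Tx] = []
    for tx in transactions:
        age = now - tx["time"]
        if tx["reported"]:
            if timedelta(0) <= age <= black_window:
                black.append(tx)      # reported within the recent window
        elif age >= white_holdoff:
            white.append(tx)          # old enough that a report is unlikely
    return black, white
```

Unreported transactions younger than the hold-off are left out of both sets, since a report or complaint may still arrive for them.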
In step S104, the number of times the black sample stream data and the white sample stream data are repeatedly sampled is determined.
In implementation, for the obtained streaming risk transaction data of the target service (the black sample stream data) and the streaming transaction data without transaction risk (the white sample stream data), a new risk prevention and control rule can be generated by one algorithm or a combination of algorithms; the embodiments of this specification may use an online random forest algorithm. An online random forest algorithm requires sampling with replacement, or repeatable sampling, of the sample data (that is, a given item of sample data can be sampled and used multiple times). Because streaming transaction data can only be obtained at particular points in time, it is not possible to traverse all the sample data at an arbitrary moment. Therefore, to achieve repeatable sampling of streaming transaction data, a mechanism can be set that assigns to each item of black sample stream data and white sample stream data the number of times it is repeatedly sampled. The mechanism can be set according to the actual situation, for example random assignment, and the embodiments of this specification do not limit it.
In step S106, a first decision tree model is established according to the black sample stream data, the white sample stream data, and the number of times the black sample stream data and the white sample stream data are repeatedly sampled.
The first decision tree model may be a model that, given the known probabilities of various outcomes, builds a decision tree to find the probability that the expected net present value is greater than or equal to zero. It can be used to evaluate project risk or business risk and is a decision analysis method for judging feasibility. As shown in Fig. 2, the first decision tree model may be a mapping model between service attributes and service attribute values: each node in the decision tree represents a service attribute, each branch path represents a possible value of that attribute, and each leaf node corresponds to the service attribute values represented by the path from the root node to that leaf node. The purpose of the first decision tree model may be to find an optimal feature in a data set, then find an optimal candidate value among the candidate values of that feature, split the data set into two sub data sets according to the optimal candidate value, and then repeat this process recursively until a specified condition is met.
In implementation, a decision tree model (the first decision tree model) can be established from the obtained black sample stream data and white sample stream data. Then, according to the number of times the black sample stream data and the white sample stream data are repeatedly sampled, multiple training sample streams can be formed and a decision tree model established for each, yielding multiple first decision tree models.
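Forming several training sample streams from one set of samples might be sketched as follows in Python, assuming each tree has its own list of repeat counts (for example, Poisson-distributed counts). The function names and the list-based layout are illustrative assumptions.

```python
from typing import List, Sequence, TypeVar

S = TypeVar("S")


def expand_by_counts(samples: Sequence[S], counts: Sequence[int]) -> List[S]:
    """Repeat each sample according to its repeat-sampling count."""
    if len(samples) != len(counts):
        raise ValueError("one count is required per sample")
    expanded: List[S] = []
    for sample, count in zip(samples, counts):
        expanded.extend([sample] * count)   # a count of 0 drops the sample
    return expanded


def training_sets(
    samples: Sequence[S], counts_per_tree: Sequence[Sequence[int]]
) -> List[List[S]]:
    """Build one training set per tree from that tree's repeat counts."""
    return [expand_by_counts(samples, counts) for counts in counts_per_tree]
```

Each resulting training set would then be used to establish one first decision tree model.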
It should be noted that, as shown in Fig. 2, risk prevention and control rules based on the above risk transaction data can be output from the decision tree model, and the output risk prevention and control rules may include one or more specific business rules.
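As a non-limiting sketch, rules could be read off a decision tree by walking each path from the root to a leaf that indicates risk. The nested-dictionary tree layout, the example feature names, and the rule string format below are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Node = Dict[str, object]


def extract_rules(node: Node, path: Tuple[str, ...] = ()) -> List[str]:
    """Return one rule per root-to-leaf path that ends in a risk leaf."""
    if "risk" in node:                       # leaf node
        if node["risk"]:
            return [" AND ".join(path) if path else "TRUE"]
        return []
    feature, threshold = node["feature"], node["threshold"]
    rules = extract_rules(node["left"], path + (f"{feature} <= {threshold}",))
    rules += extract_rules(node["right"], path + (f"{feature} > {threshold}",))
    return rules


# Hypothetical tree: large amounts from very new accounts are flagged.
example_tree: Node = {
    "feature": "amount", "threshold": 1000,
    "left": {"risk": False},
    "right": {
        "feature": "account_age_days", "threshold": 7,
        "left": {"risk": True},
        "right": {"risk": False},
    },
}
```

Each returned string is a conjunction of inequality conditions and corresponds to one specific business rule.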
In step S108, based on the black sample stream data and/or the white sample stream data, the first decision tree model is used to update the decision tree model corresponding to the current risk prevention and control rule of the target service.
In implementation, after the first decision tree model is obtained through step S106, the black sample stream data and/or the white sample stream data can be used to evaluate each first decision tree model and the decision tree model corresponding to the current risk prevention and control rule of the target service. The accuracy of the decision tree model corresponding to the current rule is determined; if it meets a predetermined accuracy condition, that decision tree model can be retained. Likewise, the accuracy of the first decision tree model can be determined; if it meets the predetermined accuracy condition, the first decision tree model can be combined with the decision tree model corresponding to the current rule to obtain a new decision tree model for the target service, and the risk prevention and control rule for the target service can be determined based on the new decision tree model. If the accuracy of the decision tree model corresponding to the current rule does not meet the predetermined accuracy condition, the first decision tree model can be used to replace it, and the risk prevention and control rule for the target service can be determined based on the decision tree model obtained after replacement.
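The keep, combine, or replace decision described above can be summarized in a small Python sketch. Expressing the accuracy condition as a single minimum threshold is an assumption, and so is the choice to keep the current model when neither model meets the condition, a case the text does not address.

```python
def plan_update(
    current_accuracy: float, candidate_accuracy: float, min_accuracy: float
) -> str:
    """Decide how the current model and the first decision tree model are used."""
    current_ok = current_accuracy >= min_accuracy
    candidate_ok = candidate_accuracy >= min_accuracy
    if current_ok and candidate_ok:
        return "combine"   # both pass: merge into a new model
    if current_ok:
        return "keep"      # only the current model passes
    if candidate_ok:
        return "replace"   # current model fails, candidate passes
    return "keep"          # assumption: change nothing when both fail
```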
The embodiments of this specification provide a risk prevention and control processing method in which recent streaming risk transaction data of the target service is obtained as black sample stream data, and streaming transaction data of the target service without transaction risk is obtained as white sample stream data. The number of times the black sample stream data and the white sample stream data are repeatedly sampled is then determined, and a first decision tree model can be established according to the black sample stream data, the white sample stream data, and those sampling counts. Finally, based on the black sample stream data and/or the white sample stream data, the first decision tree model can be used to update the decision tree model corresponding to the current risk prevention and control rule of the target service. In this way, the first decision tree model is generated from the recently obtained black and white sample stream data and is used to update, in real time, the decision tree model corresponding to the current rule. Rules for preventing and controlling fraud and other risks can thus be updated without manual participation, and because the update is driven by recent data, subsequent fraud and other risks can be responded to quickly. This can greatly reduce the financial loss caused by new risks, improve the efficiency with which such rules are generated, and in turn improve the security of the target service.
Embodiment two
As shown in Fig. 3, this embodiment of the specification provides a risk prevention and control processing method. The method may be executed by a terminal device or a server. The terminal device may be a device such as a personal computer, or a mobile terminal device such as a mobile phone or a tablet computer, and may be the terminal device used by a user. The server may be an independent server or a server cluster composed of multiple servers; it may be the background server of a certain service, or the background server of a certain website (such as a website or a payment application). The method can be used in processing such as updating business rules in a business system in order to improve efficiency. This embodiment is described with a server as the executing entity; the case of a terminal device can be handled according to the related content below and is not described again here.
This embodiment of the specification mainly optimizes several parts of the related art described above, namely manual data analysis, manual iteration of the risk prevention and control model, and manual deployment of risk prevention and control rules, each of which is described in detail below. The method may specifically include the following steps:
In step S302, the characteristic variables corresponding to the streaming transaction data of the target service are obtained.
The streaming transaction data may include streaming risk transaction data in which transaction risk exists, streaming transaction data without transaction risk, and so on; the streaming risk transaction data may include fraud-based streaming transaction data and the like. A characteristic variable can be a real-valued, single-valued function of the streaming transaction data. Characteristic variables can be realized in many forms, for example as a vector or as a mathematical expression; this embodiment of the specification does not limit this.
In implementation, the streaming risk transaction data of the target service relies on active reporting by users, and reporting takes a certain amount of time. Moreover, streaming transaction data often contains a large amount of content, much of which is unrelated to determining whether the data is streaming risk transaction data. To reduce the processing load on the system, feature extraction can be performed on the streaming transaction data so that it is characterized in the form of characteristic variables. Therefore, in this embodiment of the specification, each time a transaction event occurs, the characteristic variables of the streaming transaction data corresponding to that event can be calculated. To obtain enough characteristic variables, the transaction events that occurred within a certain period can be obtained, for example within 3 months or half a year, together with the streaming transaction data corresponding to each transaction event. In this way, the characteristic variables corresponding to the streaming transaction data within the predetermined period can be obtained.
In step S304, the characteristic variables corresponding to streaming transaction data without transaction risk are obtained from the above characteristic variables as white sample flow data.
In implementation, characteristic variables are obtained through the above processing, and a sample variable pool can be set up based on them. Since the characteristic variables are obtained from individual transaction events, and transaction events include both events with transaction risk and events without transaction risk, the characteristic variables likewise include those corresponding to streaming risk transaction data and those corresponding to streaming transaction data without transaction risk. The characteristic variables corresponding to streaming transaction data without transaction risk can therefore be extracted from the sample variable pool and used as white sample flow data.
A user who reports streaming risk transaction data is likely to know best what kind of transaction risk the data involves and what consequences it may produce. Therefore, each time a user reports streaming risk transaction data, the user can be asked to create a label for the uploaded data. The label can record the risk attributes of the streaming risk transaction data (such as payment fraud or telecom fraud). That is, the streaming risk transaction data (the black sample flow data described below) includes a label indicating that the streaming transaction data is streaming risk transaction data. The streaming risk transaction data can be processed in the manner of step S306 below.
In step S306, the recent streaming risk transaction data of the target service is obtained, the information of the label in the streaming risk transaction data is matched against the above characteristic variables, and the matched characteristic variables are used as black sample flow data.
In implementation, after a user uploads streaming risk transaction data, the data can be reviewed in order to characterize it. If the review determines that the data does, in practice, belong to streaming transaction data with transaction risk, the label in the streaming risk transaction data can be extracted and its information analyzed to obtain an analysis result. The information of the label is then matched in real time against the sample variable pool formed from the above characteristic variables, and the matched characteristic variables in the pool are used as black sample flow data.
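The matching of reviewed reports against the sample variable pool in steps S304 and S306 can be sketched as follows. The sketch assumes that the pool is keyed by a transaction identifier and that each report carries that identifier, a risk type, and the review outcome; `RiskReport`, `split_samples` and these fields are hypothetical names. Treating every unmatched pool entry as a white sample is also an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class RiskReport:
    transaction_id: str  # hypothetical key shared with the sample variable pool
    risk_type: str       # risk attribute recorded in the label, e.g. "payment_fraud"
    confirmed: bool      # True if the review confirmed a transaction risk


def split_samples(pool: Dict[str, List[float]], reports: List[RiskReport]
                  ) -> Tuple[Dict[str, List[float]], Dict[str, List[float]]]:
    """Return (black, white) characteristic variables taken from the pool."""
    black: Dict[str, List[float]] = {}
    for report in reports:
        # Only confirmed reports whose transaction is present in the pool match.
        if report.confirmed and report.transaction_id in pool:
            black[report.transaction_id] = pool[report.transaction_id]
    # Assumption: pool entries without a confirmed report count as white samples.
    white = {tid: feats for tid, feats in pool.items() if tid not in black}
    return black, white
```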
It should be noted that steps S304 and S306 above are described in a particular order. In practice, step S304 can be performed first and step S306 afterwards; in another embodiment of this specification, step S306 can be performed first and step S304 afterwards, or steps S304 and S306 can be performed at the same time. This embodiment of the specification does not limit this.
In this embodiment of the specification, in order to obtain risk prevention and control rules with higher timeliness and more timely response, an online learning algorithm can be used to obtain the rules. To ensure the performance of the risk prevention and control model obtained by the online learning algorithm, the above black sample flow data and white sample flow data can be processed so that the streaming sample data has a certain concentration; see the processing of step S308 below.
In step S308, delay processing is performed on the above black sample flow data and white sample flow data to determine the quantity ratio of black sample flow data to white sample flow data.
The delay strategy can be a data control mechanism for determining whether the quantity ratio of black sample flow data to white sample flow data meets a predetermined ratio threshold. The predetermined ratio threshold can be set according to the actual situation, for example 1:1 or 1:2.
In implementation, to ensure that black sample flow data and white sample flow data are input to the above risk prevention and control model in an appropriate ratio, a delay strategy can be set in advance. The delay strategy effectively controls the input ratio between black sample flow data and white sample flow data, so that their quantity ratio meets the predetermined ratio threshold.
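One possible reading of the delay strategy is a buffer that holds white samples back and releases them only alongside an arriving black sample, so that the stream fed to the learner stays close to the chosen ratio. The specification does not describe the mechanism at this level of detail, so the `DelayBuffer` class, its parameters, and the release policy below are assumptions for illustration.

```python
from collections import deque
from typing import Any, Deque, List, Tuple


class DelayBuffer:
    """Hold white samples back; release them together with black samples."""

    def __init__(self, whites_per_black: int = 1, max_buffer: int = 1000):
        # whites_per_black = 1 corresponds to a 1:1 ratio, 2 to a 1:2 ratio.
        self.whites_per_black = whites_per_black
        # Oldest white samples are dropped once the buffer is full.
        self._whites: Deque[Any] = deque(maxlen=max_buffer)

    def push(self, sample: Any, is_black: bool) -> List[Tuple[Any, int]]:
        """Return the (sample, label) pairs released to the learner, if any."""
        if not is_black:
            self._whites.append(sample)  # delay the white sample
            return []
        released: List[Tuple[Any, int]] = [(sample, 1)]
        for _ in range(min(self.whites_per_black, len(self._whites))):
            released.append((self._whites.popleft(), 0))
        return released
```

With this policy the released stream contains at most `whites_per_black` white samples per black sample; a stricter policy could also hold black samples until enough white samples are buffered.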
In addition, the proportion of white sample flow data in the streaming sample data can also be ensured in another way, specifically as follows: undersampling is performed on the above white sample flow data to determine the quantity proportion of the white sample flow data.
The quantity proportion of white sample flow data can refer to the proportion of white sample flow data in the streaming sample data (which includes white sample flow data and black sample flow data), for example 50% or 30%. Undersampling can be sampling in which the sampling frequency is less than twice the highest frequency of the signal.
Undersampling the white sample flow data as described above ensures that the concentration of the streaming sample data fed to the online learning algorithm for learning is not too low.
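In a classification setting, undersampling is commonly read as randomly discarding part of the majority class. Under that reading, which is an assumption here, the white samples can be thinned until they make up the desired share of the streaming sample data. The function name and parameters below are hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def undersample_whites(whites: Sequence[T], n_black: int, white_share: float,
                       rng: random.Random) -> List[T]:
    """Randomly keep just enough white samples to reach the target share."""
    if not 0.0 < white_share < 1.0:
        raise ValueError("white_share must be strictly between 0 and 1")
    # Solve n_white / (n_white + n_black) = white_share for n_white.
    target = round(n_black * white_share / (1.0 - white_share))
    keep = min(len(whites), target)
    return rng.sample(list(whites), keep)
```

For example, with 10 black samples a 50% share keeps 10 white samples, and a 30% share keeps 4.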
In this embodiment of the specification, the establishment of the decision-tree model and the update of the risk prevention and control rules can be realized by an online learning algorithm. In practice there are many online learning algorithms; this embodiment is described using the online random forests algorithm (Online Random Forests) as an example. For the specific processing when other online learning algorithms are used, reference may be made to the related content below, which is not repeated here.
In step S310, first black sample flow data and first white sample flow data are chosen from the above black sample flow data and white sample flow data, respectively.
In implementation, the above black sample flow data can be divided into two parts: one part can be used to build the decision-tree model, and the other part can be used to evaluate the accuracy of the decision-tree model. Likewise, the above white sample flow data can be divided into two parts, one used to build the decision-tree model and the other used to evaluate its accuracy. To this end, the first black sample flow data can be chosen from the black sample flow data, and the first white sample flow data from the white sample flow data, either by a predetermined division ratio (such as 5:1 or 7:3) or by random selection.
In practice, step S310 can be processed in many ways. One optional implementation is provided below; see the processing of step 1 and step 2.
Step 1: a random number drawn from a Poisson distribution is assigned to each black sample flow data item and each white sample flow data item.
In implementation, the online random forests algorithm requires a Bagging operation on the streaming sample data (including black sample flow data and white sample flow data); this is also called a bagging operation and is in essence sampling with replacement. However, streaming transaction data only becomes available at particular points in time, and it is not possible to traverse all sample data at an arbitrary point in time. Therefore, to realize the Bagging operation on streaming transaction data, each time an item of streaming sample data is obtained and is about to be used to build a decision-tree model, a random number drawn from a Poisson distribution can first be assigned to it; that is, each black sample flow data item and each white sample flow data item is assigned a Poisson-distributed random number. The assigned random number is a non-negative integer.
Step 2: the first black sample flow data and the first white sample flow data are determined according to the values of the above Poisson-distributed random numbers.
In implementation, if the Poisson-distributed random number assigned to an item of streaming sample data is 0, that item can be set as evaluation sample data, so that indicators such as the accuracy of the decision-tree model can be evaluated later. If the random number is a positive integer, its value can be used as the number of times the item is "repeatedly sampled", thereby realizing the Bagging operation on streaming transaction data. Through this processing, the streaming sample data composed of black sample flow data and white sample flow data is divided into two parts: one part is the evaluation sample data, and the other part is the streaming sample data composed of the first black sample flow data and the first white sample flow data.
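The Poisson-based partition of step 1 and step 2 can be sketched as follows. The rate parameter `lam` (set to 1.0 by default) is an assumption, since the text does not state the rate of the Poisson distribution; the sampler uses Knuth's multiplication method so that only the standard library is needed. The function names are hypothetical.

```python
import math
import random
from typing import Any, Iterable, List, Tuple


def poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson-distributed integer using Knuth's method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def partition(samples: Iterable[Any], rng: random.Random, lam: float = 1.0
              ) -> Tuple[List[Tuple[Any, int]], List[Any]]:
    """Split a stream into (training items with repeat counts, evaluation items)."""
    training: List[Tuple[Any, int]] = []
    evaluation: List[Any] = []
    for sample in samples:
        k = poisson(lam, rng)
        if k == 0:
            evaluation.append(sample)     # never sampled: used for evaluation
        else:
            training.append((sample, k))  # used k times when building the tree
    return training, evaluation
```

Each item is handled once, as it arrives, so the whole data set never has to be held or traversed.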
It should be noted that the amount of data in the first black sample flow data can be less than or equal to the amount of data in the black sample flow data; likewise, the amount of data in the first white sample flow data can be less than or equal to the amount of data in the white sample flow data.
In step S312, the number of times the above black sample flow data and white sample flow data are repeatedly sampled is determined.
Step S312 can be processed in many ways. One optional processing mode is provided below, which may specifically include the following step 1 and step 2.
Step 1: a random number drawn from a Poisson distribution is assigned to each black sample flow data item and each white sample flow data item.
It should be noted that, based on the above, the Poisson-distributed random numbers may be assigned only to each first black sample flow data item and each first white sample flow data item.
Step 2: the number of times the black sample flow data and the white sample flow data are repeatedly sampled is determined according to the values of the Poisson-distributed random numbers.
In implementation, if the Poisson-distributed random number assigned to an item of streaming sample data is a positive integer, its value can be used as the number of times the item is "repeatedly sampled" (that is, the number of times the streaming sample data is repeatedly sampled), thereby realizing the Bagging operation on streaming transaction data. In this way, the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled is obtained.
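A short derivation may help explain why a Poisson-distributed count can stand in for sampling with replacement on a stream. It assumes the usual batch setting of N samples drawn N times with replacement, and a rate of 1, which the text itself does not fix.

```latex
% Batch bagging: each of N samples is drawn N times with replacement, so the
% number of copies K of any one sample follows Binomial(N, 1/N).
P(K = k) = \binom{N}{k}\left(\frac{1}{N}\right)^{k}\left(1 - \frac{1}{N}\right)^{N-k}
\;\xrightarrow[N \to \infty]{}\; \frac{e^{-1}}{k!}, \qquad k = 0, 1, 2, \dots

% With rate \lambda = 1 (an assumption), the share of items that are never
% sampled, and can therefore serve as evaluation data, is
P(K = 0) = e^{-1} \approx 0.368
```

Under these assumptions roughly 37% of the streaming sample data would fall into the evaluation part, and the remainder would be used one or more times for training.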
In step S314, the first decision-tree model is established according to the above first black sample flow data and first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.
In implementation, as described in embodiment one above, streaming risk transaction data related to the target service can be obtained in several ways, for example through active reporting by users while using the target service, or by purchasing it from users or obtaining it through exchange. The obtained streaming risk transaction data can be reviewed to determine which streaming transaction data actually involves transaction risk. After the streaming risk transaction data has been characterized in this way, the streaming risk transaction data of the target service is obtained and can be used as black sample flow data. The first decision-tree model can then be established by an online learning algorithm according to the above first black sample flow data and first white sample flow data, and the number of times they are repeatedly sampled.
Step S314 can be processed in many ways. One optional processing mode is provided below; see the processing of step 1 to step 5.
Step 1: a decision-tree model is created, and inequality conditions are created at the nodes to be split in the decision-tree model.
In implementation, as shown in Fig. 2, a new decision-tree model can be created. The decision-tree model may include one or more nodes to be split, and a group of inequality conditions can be created at random at each node to be split.
Step 2: child nodes are pre-split for the first black sample flow data and the first white sample flow data according to the above inequality conditions.
In implementation, the decision-tree model created above may include one or more child nodes. The child nodes that may exist in the decision-tree model can be pre-split, with the inequality conditions created above used as the basis for the split.
Step 3: the gain of the inequality condition corresponding to the pre-split of each child node is calculated.
In implementation, the gain of the inequality condition corresponding to the pre-split of each child node can be calculated by formula (1):
Here, R_j denotes the j-th node to be split, s denotes the s-th inequality condition, R_jls denotes the left child node after the j-th node to be split is pre-split using the s-th inequality condition, and |·| denotes the number of streaming sample data items in a node. R_jrs is analogous to R_jls and denotes the right child node after the j-th node to be split is pre-split using the s-th inequality condition.
In addition,
In practice, L(R_j) can also be obtained by other calculation formulas. L(R_j) is mainly used to measure the purity of the first black sample flow data and the first white sample flow data in the streaming sample data.
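Formula (1) and the expression for L(R_j) do not survive in this text. A form that is consistent with the symbol definitions above, and with the split gain commonly used in online random forests, would be the following. It is an assumed reconstruction, and the entropy-based L(R_j) is one common purity measure rather than a confirmed choice.

```latex
% Assumed form of the gain of inequality condition s at node R_j:
\Delta L(R_j, s) = L(R_j)
  - \frac{|R_{jls}|}{|R_j|}\, L(R_{jls})
  - \frac{|R_{jrs}|}{|R_j|}\, L(R_{jrs})

% One common purity measure (entropy), where p_i^j is the fraction of class i
% in node R_j and K = 2 classes (black and white) are assumed:
L(R_j) = -\sum_{i=1}^{K} p_i^j \log p_i^j
```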
Step 4: the node to be split is split into child nodes according to the above gain and the corresponding inequality condition, and the child nodes after the split are obtained.
In implementation, after the corresponding gains have been calculated as above, if the number of streaming sample data items received by the node to be split has reached a predetermined threshold (that is, there are enough samples), and there is an inequality condition whose gain exceeds a certain threshold, the inequality condition with the largest gain can be used to split the node to be split into child nodes. The same algorithm can then continue to iterate over the streaming sample data to obtain further child nodes, as shown in Fig. 2.
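The split decision of step 3 and step 4 can be sketched as follows. The sketch assumes that an inequality condition has the form `features[index] > threshold` and uses Gini impurity as the purity measure; both are illustrative choices, and names such as `choose_split`, `min_samples` and `min_gain` are hypothetical.

```python
from typing import List, Optional, Tuple

Sample = Tuple[List[float], int]  # (features, label); label 1 = black, 0 = white
Test = Tuple[int, float]          # hypothetical condition: features[index] > threshold


def gini(samples: List[Sample]) -> float:
    """Gini impurity of a node for two classes; 0.0 means the node is pure."""
    if not samples:
        return 0.0
    p = sum(label for _, label in samples) / len(samples)
    return 2.0 * p * (1.0 - p)


def gain(samples: List[Sample], test: Test) -> float:
    """Impurity reduction obtained by pre-splitting the node with one condition."""
    index, threshold = test
    left = [s for s in samples if s[0][index] <= threshold]
    right = [s for s in samples if s[0][index] > threshold]
    n = len(samples)
    return (gini(samples)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))


def choose_split(samples: List[Sample], tests: List[Test],
                 min_samples: int, min_gain: float) -> Optional[Test]:
    """Return the condition with the largest gain, or None if the node must wait."""
    if not tests or not samples or len(samples) < min_samples:
        return None  # not enough data has reached this node yet
    best = max(tests, key=lambda t: gain(samples, t))
    return best if gain(samples, best) > min_gain else None
```

A node for which `choose_split` returns `None` simply keeps accumulating streaming sample data until both thresholds are met.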
Step 5: the first decision-tree model is established based on the child nodes after the above split and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.
In implementation, for a child node obtained after a split in the decision-tree model, the purity of the first black sample flow data and the first white sample flow data can be evaluated. If the number of streaming sample data items and the purity of the first black sample flow data and the first white sample flow data reach a predetermined requirement, the child node becomes a leaf node and its splitting stops. As shown in Fig. 2, when a decision-tree model no longer has any node to be split, or the number of streaming sample data items reaches a certain amount, construction of the decision-tree model is complete and the first decision-tree model is obtained. Multiple training sample flow data sets can then be composed according to the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled, and a first decision-tree model can be established for each, yielding multiple first decision-tree models.
In step S316, part of the data is chosen from the above black sample flow data and/or white sample flow data as evaluation sample data.
For the specific processing of step S316, see the related content of step S310 above, which is not repeated here.
In step S318, the first decision-tree model and the decision-tree model corresponding to the current risk prevention and control rule of the target service are each evaluated according to the above evaluation sample data.
In implementation, one decision-tree model can be obtained through the processing of steps S302 to S312 above, and as time passes, more and more decision-tree models are newly generated. The newly generated decision-tree models and the currently existing decision-tree models together constitute a random forest model, which may contain a large number of decision-tree models, for example 100 or 500. Since the random forest model contains old decision-tree models, the generation of new decision-tree models necessarily involves a turnover between new and old decision-tree models. In this embodiment of the specification, the evaluation sample data determined above (that is, the evaluation sample data from the most recent period) can be used to evaluate the accuracy of both the currently existing decision-tree models and the newly generated decision-tree models, yielding an accuracy evaluation value for each.
In step S320, if the first decision-tree model meets a predetermined evaluation condition, and the decision-tree models corresponding to the current risk prevention and control rule of the target service include a second decision-tree model that does not meet the predetermined evaluation condition, the first decision-tree model is used to replace the second decision-tree model.
The evaluation condition can be set according to the actual situation. For example, a threshold can be set, and the evaluation condition can be that the threshold is exceeded; this embodiment of the specification does not limit this.
In implementation, the replacement of currently existing decision-tree models by newly generated ones can be carried out by jointly considering the existence time of each currently existing decision-tree model (in general, the longer a decision-tree model has existed, the less accurate the risk prevention and control rule it finally produces) and the accuracy evaluation values obtained above; other relevant factors and/or random factors can also be added according to the actual situation. Replacing currently existing decision-tree models with newly generated ones realizes the update from the current risk prevention and control rule, which corresponds to the currently existing decision-tree models, to the first risk prevention and control rule, which corresponds to the newly generated decision-tree models.
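The turnover between old and new decision-tree models can be sketched as follows. The scoring rule, which subtracts an age penalty from the accuracy evaluation value, is only one way of combining existence time and accuracy and is an assumption; `TreeRecord`, `replace_trees` and `age_penalty` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TreeRecord:
    name: str
    accuracy: float  # accuracy evaluation value on the latest evaluation samples
    age: int         # hypothetical: number of update rounds the tree has existed


def replace_trees(forest: List[TreeRecord], new_trees: List[TreeRecord],
                  min_accuracy: float, age_penalty: float = 0.0) -> List[TreeRecord]:
    """Swap the weakest existing trees for new trees that pass the condition."""
    def score(tree: TreeRecord) -> float:
        # Older trees are penalized; age_penalty = 0.0 ignores existence time.
        return tree.accuracy - age_penalty * tree.age

    qualified = sorted((t for t in new_trees if t.accuracy >= min_accuracy),
                       key=score, reverse=True)
    result = list(forest)
    # Indices of existing trees that fail the condition, weakest first.
    weak = sorted((i for i, t in enumerate(result) if score(t) < min_accuracy),
                  key=lambda i: score(result[i]))
    for index, new_tree in zip(weak, qualified):
        result[index] = new_tree
    return result
```

The size of the forest stays constant in this sketch; a variant could also append qualified new trees when no existing tree fails.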
Replacing currently existing decision-tree models with newly generated ones yields a new random forest model. The new random forest model and the previous random forest model can then be evaluated on the above evaluation sample data in terms of overall accuracy, AUC (area under the curve), and so on. If the improvement in these indicators (accuracy, AUC, etc.) exceeds a predetermined threshold, the new random forest model can be used to replace the previous one. After the replacement, grayscale scoring can be performed on online data. If the distribution of the grayscale scoring results is normal, the fraud risk prevention and control model of the business system of the target service can be switched entirely to the new random forest model; if the distribution is abnormal, the random forest model is rolled back.
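The forest-level decision can be sketched as follows. AUC is computed here with the pairwise ranking definition, which is standard; how the grayscale scoring distribution is judged "normal" is not specified in the text, so it is passed in as a boolean, and the names `decide` and `min_lift` are hypothetical.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (black, white) pairs that are ranked correctly."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute AUC")
    # A tie between a black and a white score counts as half a correct pair.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def decide(old_auc: float, new_auc: float, min_lift: float,
           gray_distribution_ok: bool) -> str:
    """Return 'keep', 'switch' or 'rollback' for the new random forest model."""
    if new_auc - old_auc <= min_lift:
        return "keep"      # improvement too small: stay with the previous forest
    if gray_distribution_ok:
        return "switch"    # grayscale scoring looks normal: switch fully
    return "rollback"      # grayscale scoring looks abnormal: roll back
```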
Regarding the above scoring process, fraud prevention and control is ultimately embodied as individual risk prevention and control rules, such as "if the model score is higher than a certain value, fail this transaction". The stability of the risk prevention and control rules needs to be ensured, because adjusting them is time-consuming and laborious (for example, the rule should always remain "fail the transaction if the model score is higher than 0.6"), yet the online random forests algorithm changes in real time online, so a standardized scoring operation is needed. To standardize scoring, a threshold is first chosen on the evaluation sample data according to a target indicator (including but not limited to accuracy), and the parameters needed for normalization are then determined so that this threshold is mapped to 0.6 or another value, thereby maintaining the stability of the risk prevention and control rules.
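One simple way to map a chosen threshold to a fixed rule value is a piecewise-linear rescaling of the raw model score. The text does not specify the normalization, so the mapping below is an assumption; `standardize`, `threshold` and `anchor` are hypothetical names, and raw scores are assumed to lie in [0, 1].

```python
def standardize(raw: float, threshold: float, anchor: float = 0.6) -> float:
    """Rescale a raw score in [0, 1] so that `threshold` maps exactly to `anchor`."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    raw = min(max(raw, 0.0), 1.0)  # clamp to the assumed score range
    if raw <= threshold:
        # [0, threshold] is mapped linearly onto [0, anchor].
        return anchor * raw / threshold
    # (threshold, 1] is mapped linearly onto (anchor, 1].
    return anchor + (1.0 - anchor) * (raw - threshold) / (1.0 - threshold)
```

With this mapping the rule "fail the transaction if the score is higher than 0.6" can stay unchanged while `threshold` is re-estimated each time the forest is updated.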
In step S322, the current risk prevention and control rule of the target service is updated based on the updated decision-tree model.
This embodiment of the specification provides a risk prevention and control processing method. Recent streaming risk transaction data of a target service is obtained as black sample flow data, and streaming transaction data of the target service without transaction risk is obtained as white sample flow data. The number of times the black sample flow data and the white sample flow data are repeatedly sampled is then determined, and a first decision-tree model is established according to the black sample flow data, the white sample flow data, and the number of times each is repeatedly sampled. Finally, based on the black sample flow data and/or the white sample flow data, the first decision-tree model is used to update the decision-tree model corresponding to the current risk prevention and control rule of the target service. In this way, the first decision-tree model is generated from recently obtained black and white sample flow data and is used to update, in real time, the decision-tree model corresponding to the current rule. Risk prevention and control rules for fraud and similar risks can thus be updated without manual participation, and because the update is driven by recent data, subsequent fraud and similar risks can be responded to quickly. This can greatly reduce the financial loss caused by new risks, improve the efficiency of generating risk prevention and control rules for fraud and similar risks, and thereby improve the security of the target service.
Embodiment three
The above is the risk prevention and control processing method provided by the embodiments of this specification. Based on the same idea, an embodiment of this specification also provides a risk prevention and control processing device, as shown in Fig. 4.
The risk prevention and control processing device includes a sample acquisition module 401, a sampling number determining module 402, a model building module 403, and a model modification module 404, wherein:
the sample acquisition module 401 is configured to obtain recent streaming risk transaction data of a target service as black sample flow data, and to obtain streaming transaction data of the target service without transaction risk as white sample flow data;
the sampling number determining module 402 is configured to determine the number of times the black sample flow data and the white sample flow data are repeatedly sampled;
the model building module 403 is configured to establish a first decision-tree model according to the black sample flow data, the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled;
the model modification module 404 is configured to use the first decision-tree model, based on the black sample flow data and/or the white sample flow data, to update the decision-tree model corresponding to the current risk prevention and control rule of the target service.
In this embodiment of the specification, the device further includes:
a rule update module, configured to update the current risk prevention and control rule of the target service based on the updated decision-tree model.
In this embodiment of the specification, the black sample flow data includes a label indicating that the streaming transaction data is streaming risk transaction data, and the sample acquisition module 401 includes:
a feature acquisition unit, configured to obtain the characteristic variables corresponding to the streaming transaction data of the target service;
a first selection unit, configured to obtain, from the characteristic variables, the characteristic variables corresponding to streaming transaction data without transaction risk as white sample flow data;
a matching unit, configured to match the information of the label against the characteristic variables and use the matched characteristic variables as black sample flow data.
In this embodiment of the specification, the device further includes:
a delay module, configured to perform delay processing on the black sample flow data and the white sample flow data to determine the quantity ratio of the black sample flow data to the white sample flow data.
In this embodiment of the specification, the device further includes:
an undersampling module, configured to perform undersampling on the white sample flow data to determine the quantity proportion of the white sample flow data.
In an embodiment of this specification, the sampling number determining module 402 includes:
An allocation unit, configured to assign a Poisson-distributed random number to each piece of black sample flow data and each piece of white sample flow data;
A sampling number determining unit, configured to determine, according to the value of the Poisson-distributed random number, the number of times the black sample flow data and the white sample flow data are repeatedly sampled.
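The assignment of Poisson-distributed random numbers can be sketched in Python as follows. The generator uses Knuth's multiplication method so that only the standard library is needed. The rate of 1.0 is a common choice in online bagging and is an assumption here, as this specification does not state the rate; the function names are likewise hypothetical.

```python
import math
import random
from typing import List, Optional


def poisson_random(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) random number using Knuth's method."""
    limit = math.exp(-lam)
    k = 0
    product = rng.random()
    # Multiply uniform draws until the product falls to exp(-lam) or below;
    # the number of extra draws needed is Poisson-distributed.
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def sampling_counts(
    n_samples: int,
    lam: float = 1.0,  # assumed rate, not given in this specification
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return, for each sample, the number of times it is repeatedly sampled."""
    rng = rng or random.Random()
    return [poisson_random(lam, rng) for _ in range(n_samples)]
```

A count of 0 means the sample is left out of the model being built, while a count of k means it is used as if it had been drawn k times, which approximates sampling with replacement without storing the whole stream.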
In an embodiment of this specification, the model building module 403 includes:
A second selection unit, configured to select first black sample flow data and first white sample flow data from the black sample flow data and the white sample flow data, respectively;
A model establishing unit, configured to establish the first decision-tree model according to the first black sample flow data and the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.
In an embodiment of this specification, the model establishing unit is configured to:
Create a decision-tree model, and create inequality conditions at a node to be split in the decision-tree model;
Pre-split the first black sample flow data and the first white sample flow data into child nodes according to the inequality conditions;
Calculate the gain of the inequality condition corresponding to each pre-split of child nodes;
Split the node to be split into child nodes according to the gain and the corresponding inequality condition, to obtain the split child nodes;
Establish the first decision-tree model based on the split child nodes and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.
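The pre-split and gain steps above can be illustrated with a small Python sketch. It assumes that each inequality condition has the form "feature value <= threshold", that the gain is the reduction in Gini impurity, and that each sample is weighted by the number of times it is repeatedly sampled. None of these choices is stated in this specification, and all names are hypothetical.

```python
from typing import List, Tuple

# (characteristic variables, label: 1 = black / 0 = white, repeat-sampling count)
Sample = Tuple[List[float], int, int]
# (feature index, threshold) for the assumed condition x[feature] <= threshold
Condition = Tuple[int, float]


def weighted_gini(samples: List[Sample]) -> float:
    """Gini impurity where each sample counts as many times as it was sampled."""
    total = sum(w for _, _, w in samples)
    if total == 0:
        return 0.0
    p_black = sum(w for _, y, w in samples if y == 1) / total
    return 2.0 * p_black * (1.0 - p_black)


def split_gain(samples: List[Sample], feature: int, threshold: float) -> float:
    """Gain of pre-splitting a node with x[feature] <= threshold."""
    left = [s for s in samples if s[0][feature] <= threshold]
    right = [s for s in samples if s[0][feature] > threshold]
    total = sum(w for _, _, w in samples)
    if total == 0:
        return 0.0
    w_left = sum(w for _, _, w in left)
    w_right = total - w_left
    return (
        weighted_gini(samples)
        - (w_left / total) * weighted_gini(left)
        - (w_right / total) * weighted_gini(right)
    )


def best_condition(samples: List[Sample], candidates: List[Condition]) -> Condition:
    """Pick the candidate inequality condition with the largest gain."""
    return max(candidates, key=lambda c: split_gain(samples, c[0], c[1]))
```

Applying `best_condition` to the node to be split, and then repeating the procedure on each resulting child node, would grow the tree one level at a time.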
In an embodiment of this specification, the model update module 404 includes:
A third selection unit, configured to select part of the data from the black sample flow data and/or the white sample flow data, respectively, as evaluation sample data;
An evaluation unit, configured to evaluate, according to the evaluation sample data, the first decision-tree model and the decision-tree model corresponding to the current risk prevention and control rule of the target service, respectively;
A model update unit, configured to replace a second decision-tree model with the first decision-tree model if the first decision-tree model meets a predetermined evaluation condition and the decision-tree model corresponding to the current risk prevention and control rule of the target service contains the second decision-tree model, which does not meet the predetermined evaluation condition.
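The evaluate-and-replace logic can be sketched in Python as follows. The sketch assumes that the predetermined evaluation condition is "accuracy on the evaluation sample data is at least a threshold" and that only the first failing model found is replaced. This specification fixes neither the metric nor which failing model is chosen, so both are assumptions, as are the names used.

```python
from typing import Callable, List, Sequence, Tuple

# A model maps characteristic variables to a label (1 = risk, 0 = no risk).
Model = Callable[[Sequence[float]], int]
EvalSample = Tuple[Sequence[float], int]


def accuracy(model: Model, samples: List[EvalSample]) -> float:
    """Fraction of evaluation samples the model labels correctly."""
    if not samples:
        return 0.0
    return sum(1 for x, y in samples if model(x) == y) / len(samples)


def update_models(
    current: List[Model],
    candidate: Model,
    samples: List[EvalSample],
    threshold: float,  # assumed form of the predetermined evaluation condition
) -> List[Model]:
    """Replace one failing current model with the candidate, if it qualifies."""
    if accuracy(candidate, samples) < threshold:
        # The new (first) model fails the condition: keep the current models.
        return list(current)
    updated = list(current)
    for i, model in enumerate(updated):
        if accuracy(model, samples) < threshold:
            # A current (second) model fails the condition: swap it out.
            updated[i] = candidate
            break
    return updated
```

If every current model already meets the condition, the list is returned unchanged, which matches the requirement that a replacement happens only when a failing second decision-tree model exists.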
An embodiment of this specification provides a risk prevention and control processing device. The device obtains recent streaming risk transaction data of a target service as black sample flow data, and obtains streaming transaction data of the target service without transaction risk as white sample flow data. It then determines the number of times the black sample flow data and the white sample flow data are repeatedly sampled, and can establish a first decision-tree model according to the black sample flow data, the white sample flow data, and those numbers of times. Finally, based on the black sample flow data and/or the white sample flow data, it can use the first decision-tree model to update the decision-tree model corresponding to the current risk prevention and control rule of the target service. In this way, the first decision-tree model is generated from the recently obtained black sample flow data and white sample flow data, and the decision-tree model corresponding to the current risk prevention and control rule of the target service is updated in real time by means of the first decision-tree model. Prevention and control rules for risks such as fraud can thus be updated without manual participation, and because the update is completed with recent data, subsequent risks such as fraud can be responded to quickly. This can greatly reduce the financial losses caused by new risks, improves the efficiency of generating prevention and control rules for risks such as fraud, and thereby improves the security of the target service.
Embodiment five
The risk prevention and control processing device provided by the embodiments of this specification is described above. Based on the same idea, an embodiment of this specification further provides risk prevention and control processing equipment, as shown in Figure 5.
The risk prevention and control processing equipment may be the server or terminal device provided in the above embodiments.
The risk prevention and control processing equipment may vary considerably depending on configuration or performance, and may include one or more processors 501 and a memory 502. The memory 502 may store one or more application programs or data. The memory 502 may be transient storage or persistent storage. An application program stored in the memory 502 may include one or more modules (not shown in the figure), and each module may include a series of computer-executable instructions for the risk prevention and control processing equipment. Further, the processor 501 may be configured to communicate with the memory 502 and to execute, on the risk prevention and control processing equipment, the series of computer-executable instructions in the memory 502. The risk prevention and control processing equipment may also include one or more power supplies 503, one or more wired or wireless network interfaces 504, one or more input/output interfaces 505, and one or more keyboards 506.
Specifically, in this embodiment, the risk prevention and control processing equipment includes a memory and one or more programs. The one or more programs are stored in the memory and may include one or more modules, and each module may include a series of computer-executable instructions for the risk prevention and control processing equipment. The one or more programs are configured to be executed by one or more processors and include computer-executable instructions for performing the following:
Obtaining recent streaming risk transaction data of a target service as black sample flow data, and obtaining streaming transaction data of the target service without transaction risk as white sample flow data;
Determining the number of times the black sample flow data and the white sample flow data are repeatedly sampled;
Establishing a first decision-tree model according to the black sample flow data and the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled;
Updating, based on the black sample flow data and/or the white sample flow data, the decision-tree model corresponding to the current risk prevention and control rule of the target service by using the first decision-tree model.
Optionally, after the decision-tree model corresponding to the current risk prevention and control rule of the target service is updated by using the first decision-tree model, the following is further included:
Updating the current risk prevention and control rule of the target service based on the updated decision-tree model.
Optionally, the black sample flow data includes a label indicating that streaming transaction data is streaming risk transaction data,
Obtaining the recent streaming risk transaction data of the target service as black sample flow data, and obtaining the streaming transaction data of the target service without transaction risk as white sample flow data, includes:
Obtaining the characteristic variables corresponding to the streaming transaction data of the target service;
Obtaining, from the characteristic variables, the characteristic variables corresponding to streaming transaction data without transaction risk as the white sample flow data;
Matching the information of the label with the characteristic variables, and taking the matched characteristic variables as the black sample flow data.
Optionally, the following is further included:
Performing delay processing on the black sample flow data and the white sample flow data, so as to determine the quantity ratio of the black sample flow data to the white sample flow data.
Optionally, the following is further included:
Undersampling the white sample flow data, so as to determine the quantity ratio of the white sample flow data.
Optionally, determining the number of times the black sample flow data and the white sample flow data are repeatedly sampled includes:
Assigning a Poisson-distributed random number to each piece of black sample flow data and each piece of white sample flow data;
Determining, according to the value of the Poisson-distributed random number, the number of times the black sample flow data and the white sample flow data are repeatedly sampled.
Optionally, establishing the first decision-tree model according to the black sample flow data and the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled, includes:
Selecting first black sample flow data and first white sample flow data from the black sample flow data and the white sample flow data, respectively;
Establishing the first decision-tree model according to the first black sample flow data and the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.
Optionally, establishing the first decision-tree model according to the first black sample flow data and the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled, includes:
Creating a decision-tree model, and creating inequality conditions at a node to be split in the decision-tree model;
Pre-splitting the first black sample flow data and the first white sample flow data into child nodes according to the inequality conditions;
Calculating the gain of the inequality condition corresponding to each pre-split of child nodes;
Splitting the node to be split into child nodes according to the gain and the corresponding inequality condition, to obtain the split child nodes;
Establishing the first decision-tree model based on the split child nodes and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.
Optionally, updating, based on the black sample flow data and/or the white sample flow data, the decision-tree model corresponding to the current risk prevention and control rule of the target service by using the first decision-tree model includes:
Selecting part of the data from the black sample flow data and/or the white sample flow data, respectively, as evaluation sample data;
Evaluating, according to the evaluation sample data, the first decision-tree model and the decision-tree model corresponding to the current risk prevention and control rule of the target service, respectively;
If the first decision-tree model meets a predetermined evaluation condition, and the decision-tree model corresponding to the current risk prevention and control rule of the target service contains a second decision-tree model that does not meet the predetermined evaluation condition, replacing the second decision-tree model with the first decision-tree model.
An embodiment of this specification provides risk prevention and control processing equipment. The equipment obtains recent streaming risk transaction data of a target service as black sample flow data, and obtains streaming transaction data of the target service without transaction risk as white sample flow data. It then determines the number of times the black sample flow data and the white sample flow data are repeatedly sampled, and can establish a first decision-tree model according to the black sample flow data, the white sample flow data, and those numbers of times. Finally, based on the black sample flow data and/or the white sample flow data, it can use the first decision-tree model to update the decision-tree model corresponding to the current risk prevention and control rule of the target service. In this way, the first decision-tree model is generated from the recently obtained black sample flow data and white sample flow data, and the decision-tree model corresponding to the current risk prevention and control rule of the target service is updated in real time by means of the first decision-tree model. Prevention and control rules for risks such as fraud can thus be updated without manual participation, and because the update is completed with recent data, subsequent risks such as fraud can be responded to quickly. This can greatly reduce the financial losses caused by new risks, improves the efficiency of generating prevention and control rules for risks such as fraud, and thereby improves the security of the target service.
The foregoing describes specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In the 1990s, an improvement to a technology could be clearly distinguished as either a hardware improvement (for example, an improvement to circuit structures such as diodes, transistors, and switches) or a software improvement (an improvement to a method flow). However, with the development of technology, improvements to many of today's method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (such as a field programmable gate array (Field Programmable Gate Array, FPGA)) is such an integrated circuit, whose logic function is determined by the user's programming of the device. Designers program on their own to "integrate" a digital system onto a single PLD, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, nowadays, instead of manually fabricating integrated circuit chips, this programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development, and the source code before compilation must also be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can be readily obtained simply by describing the method flow in one of the above hardware description languages and programming it into an integrated circuit.
A controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller implements the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the devices included in it for implementing various functions may also be regarded as structures within the hardware component. The devices for implementing various functions may even be regarded both as software modules implementing a method and as structures within the hardware component.
The systems, devices, modules, or units described in the above embodiments may be implemented by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as divided into various units by function. Of course, when implementing one or more embodiments of this specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
Those skilled in the art should understand that the embodiments of this specification may be provided as a method, a system, or a computer program product. Therefore, one or more embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of this specification may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The embodiments of this specification are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of this specification. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, and the instruction device implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include computer-readable media in the form of volatile memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.
One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiment is described relatively briefly because it is substantially similar to the method embodiment; for relevant details, see the description of the method embodiment.
The foregoing is merely embodiments of this specification and is not intended to limit this specification. For those skilled in the art, this specification may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of this specification shall fall within the scope of the claims of this specification.