CN108805416A

CN108805416A - Risk prevention system processing method, device and equipment

Info

Publication number: CN108805416A
Application number: CN201810493880.9A
Authority: CN
Inventors: 郦润华; 金宏; 王维强; 赵闻飙
Original assignee: Alibaba Group Holding Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2018-05-22
Filing date: 2018-05-22
Publication date: 2018-11-13

Abstract

The embodiments of this specification disclose a risk prevention system processing method, device and equipment. The method includes: obtaining recent streaming risk transaction data of a target service as black sample flow data, and obtaining streaming transaction data of the target service without transaction risk as white sample flow data; determining the number of times the black sample flow data and the white sample flow data are repeatedly sampled; establishing a first decision-tree model according to the black sample flow data, the white sample flow data, and the number of times each is repeatedly sampled; and, based on the black sample flow data and/or the white sample flow data, using the first decision-tree model to update the decision-tree model corresponding to the current risk prevention system rule of the target service.

Description

A risk prevention system processing method, device and equipment

Technical field

This specification relates to the field of computer technology, and in particular to a risk prevention system processing method, device and equipment.

Background technology

As network technology and terminal technology become more widespread, the risks present in network transactions are also increasing. Although business systems such as network transaction systems contain risk control rules, network transaction risk has not decreased as a result, and risk control in business systems still faces huge challenges.

A business system usually has a risk control system based on risk data, which is implemented by continuously adjusting business rules manually. Strategy operation is an indispensable link in the risk control system and is also the main way of countering external risks. At present, strategy operation is carried out manually: after a risk transaction is found, the transaction is reported or complained about to the risk control system; the features of the risk transaction data are extracted through manual analysis; an offline data assessment of the risk transaction data is carried out manually; and finally the corresponding risk prevention system rule is configured manually.

However, when risk prevention system rules are obtained manually, people take on too much repetitive, mechanical work, which wastes a large amount of human and material resources and also consumes a great deal of time. As the pace of risk attack and defense keeps accelerating, risk prevention carried out in this way cannot meet user needs. Therefore, in the field of prevention and control of fraud and similar risks, a more timely and more responsive solution is needed.

Invention content

The purpose of the embodiments of this specification is to provide a risk prevention system processing method, device and equipment, so as to provide a more timely and more responsive risk prevention system processing solution.

To achieve the above technical solution, the embodiments of this specification are implemented as follows:

A risk prevention system processing method provided by the embodiments of this specification includes:

Obtaining recent streaming risk transaction data of a target service as black sample flow data, and obtaining streaming transaction data of the target service without transaction risk as white sample flow data;

Determining the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

Establishing a first decision-tree model according to the black sample flow data, the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

Based on the black sample flow data and/or the white sample flow data, using the first decision-tree model to update the decision-tree model corresponding to the current risk prevention system rule of the target service.

Optionally, after the decision-tree model corresponding to the current risk prevention system rule of the target service is updated using the first decision-tree model, the method further includes:

Updating the current risk prevention system rule of the target service based on the updated decision-tree model.

Optionally, the black sample flow data includes a label indicating that the streaming transaction data is streaming risk transaction data,

and obtaining the recent streaming risk transaction data of the target service as black sample flow data and obtaining the streaming transaction data of the target service without transaction risk as white sample flow data includes:

Obtaining the characteristic variables corresponding to the streaming transaction data of the target service;

Obtaining, from the characteristic variables, the characteristic variables corresponding to streaming transaction data without transaction risk as the white sample flow data;

Matching the information of the label against the characteristic variables, and taking the matched characteristic variables as the black sample flow data.

Optionally, the method further includes:

Performing delay processing on the black sample flow data and the white sample flow data to determine the quantitative ratio of the black sample flow data to the white sample flow data.

Optionally, the method further includes:

Performing undersampling on the white sample flow data to determine the quantitative proportion of the white sample flow data.

Optionally, determining the number of times the black sample flow data and the white sample flow data are repeatedly sampled includes:

Assigning a Poisson-distributed random number to each item of black sample flow data and each item of white sample flow data;

Determining, according to the value of the Poisson-distributed random number, the number of times the black sample flow data and the white sample flow data are repeatedly sampled.

Optionally, establishing the first decision-tree model according to the black sample flow data, the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled includes:

Selecting first black sample flow data and first white sample flow data from the black sample flow data and the white sample flow data respectively;

Establishing the first decision-tree model according to the first black sample flow data, the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.

Optionally, establishing the first decision-tree model according to the first black sample flow data, the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled includes:

Creating a decision-tree model, and creating inequality conditions at a node to be split in the decision-tree model;

Pre-splitting the first black sample flow data and the first white sample flow data into child nodes according to the inequality conditions;

Calculating the gain of the inequality condition corresponding to each pre-split into child nodes;

Splitting the node to be split into child nodes according to the gains and the corresponding inequality conditions, to obtain the child nodes after the split;

Establishing the first decision-tree model based on the child nodes after the split and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.

Optionally, using the first decision-tree model to update the decision-tree model corresponding to the current risk prevention system rule of the target service, based on the black sample flow data and/or the white sample flow data, includes:

Selecting part of the data from the black sample flow data and/or the white sample flow data respectively as assessment sample data;

Assessing, according to the assessment sample data, the first decision-tree model and the decision-tree model corresponding to the current risk prevention system rule of the target service respectively;

If the first decision-tree model meets a predetermined evaluation condition, and the decision-tree model corresponding to the current risk prevention system rule of the target service contains a second decision-tree model that does not meet the predetermined evaluation condition, replacing the second decision-tree model with the first decision-tree model.

A risk prevention system processing device provided by the embodiments of this specification includes:

A sample acquisition module, configured to obtain recent streaming risk transaction data of a target service as black sample flow data, and to obtain streaming transaction data of the target service without transaction risk as white sample flow data;

A sampling number determining module, configured to determine the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

A model building module, configured to establish a first decision-tree model according to the black sample flow data, the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

A model update module, configured to use the first decision-tree model, based on the black sample flow data and/or the white sample flow data, to update the decision-tree model corresponding to the current risk prevention system rule of the target service.

Optionally, the device further includes:

A rule update module, configured to update the current risk prevention system rule of the target service based on the updated decision-tree model.

The sample acquisition module includes:

A feature acquiring unit, configured to obtain the characteristic variables corresponding to the streaming transaction data of the target service;

A first selection unit, configured to obtain, from the characteristic variables, the characteristic variables corresponding to streaming transaction data without transaction risk as the white sample flow data;

A matching unit, configured to match the information of the label against the characteristic variables and to take the matched characteristic variables as the black sample flow data.

Optionally, the device further includes:

A delay module, configured to perform delay processing on the black sample flow data and the white sample flow data to determine the quantitative ratio of the black sample flow data to the white sample flow data.

Optionally, the device further includes:

An undersampling module, configured to perform undersampling on the white sample flow data to determine the quantitative proportion of the white sample flow data.

Optionally, the sampling number determining module includes:

An allocation unit, configured to assign a Poisson-distributed random number to each item of black sample flow data and each item of white sample flow data;

A sampling number determination unit, configured to determine, according to the value of the Poisson-distributed random number, the number of times the black sample flow data and the white sample flow data are repeatedly sampled.

Optionally, the model building module includes:

A second selection unit, configured to select first black sample flow data and first white sample flow data from the black sample flow data and the white sample flow data respectively;

A model establishing unit, configured to establish the first decision-tree model according to the first black sample flow data, the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.

Optionally, the model foundation unit, is used for：

Optionally, the model update module includes:

A third selection unit, configured to select part of the data from the black sample flow data and/or the white sample flow data respectively as assessment sample data;

An assessment unit, configured to assess, according to the assessment sample data, the first decision-tree model and the decision-tree model corresponding to the current risk prevention system rule of the target service respectively;

A model update unit, configured to replace a second decision-tree model with the first decision-tree model if the first decision-tree model meets a predetermined evaluation condition and the decision-tree model corresponding to the current risk prevention system rule of the target service contains the second decision-tree model, which does not meet the predetermined evaluation condition.

A risk prevention system processing equipment provided by the embodiments of this specification includes:

A processor; and

A memory arranged to store computer-executable instructions which, when executed, cause the processor to:

As can be seen from the technical solutions provided by the above embodiments of this specification, the embodiments obtain recent streaming risk transaction data of a target service as black sample flow data and obtain streaming transaction data of the target service without transaction risk as white sample flow data. The number of times the black sample flow data and the white sample flow data are repeatedly sampled is then determined, and a first decision-tree model can be established according to the black sample flow data, the white sample flow data, and the number of times each is repeatedly sampled. Finally, based on the black sample flow data and/or the white sample flow data, the first decision-tree model can be used to update the decision-tree model corresponding to the current risk prevention system rule of the target service. In this way, the first decision-tree model is generated from recently obtained black and white sample flow data and is used to update, in real time, the decision-tree model corresponding to the current risk prevention system rule of the target service. Rules for preventing and controlling fraud and similar risks can therefore be updated without manual participation and on the basis of recent data, which allows a fast response to subsequent fraud and similar risks, can greatly reduce the financial loss caused by new risks, improves the efficiency with which such rules are generated, and in turn improves the security of the target service.

Description of the drawings

To describe the technical solutions in the embodiments of this specification or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some of the embodiments described in this specification, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 shows an embodiment of a risk prevention system processing method of this specification;

Fig. 2 is a schematic diagram of a decision-tree model of this specification;

Fig. 3 shows another embodiment of a risk prevention system processing method of this specification;

Fig. 4 shows an embodiment of a risk prevention system processing device of this specification;

Fig. 5 shows an embodiment of a risk prevention system processing equipment of this specification.

Specific implementation mode

The embodiments of this specification provide a risk prevention system processing method, device and equipment.

To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below with reference to the drawings in those embodiments. Obviously, the described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.

Embodiment one

As shown in Fig. 1, the embodiments of this specification provide a risk prevention system processing method. The method can be executed by a terminal device or a server. The terminal device can be, for example, a personal computer, or a mobile terminal device such as a mobile phone or a tablet computer, and can be a terminal device used by a user. The server can be an independent server or a server cluster made up of multiple servers, and can be the background server of a certain service or the background server of a certain website or application (such as a payment application). The method can be used in processing such as updating the business rules in a business system. To improve efficiency, this embodiment is described with a server as the executing entity; the case of a terminal device can be handled according to the related content below and is not described again here. The method can specifically include the following steps:

In step S102, recent streaming risk transaction data of a target service is obtained as black sample flow data, and streaming transaction data of the target service without transaction risk is obtained as white sample flow data.

The target service can be any service, such as an online payment service or an online shopping service. Streaming risk transaction data can be streaming transaction data with transaction risk obtained at a certain moment during a network transaction, where transaction risk may include causing loss to a user's resources (such as the user's funds) or causing loss to the target service. Recent streaming risk transaction data can be the streaming risk transaction data within a predetermined duration before the current moment, for example within the 24 hours or the 5 days before the current moment.

In implementation, to ensure the security of data in a business system, risk transactions that may exist in the business system can be intercepted by means of a blacklist or a whitelist. With the continuous development of network technology, network viruses, network Trojan programs and the like are becoming more numerous, and a blacklist or whitelist can hardly cover all possible network viruses, Trojan programs, or other risky transactions in time. As a result, in practice a user may encounter transaction risks that the blacklist or whitelist cannot cover. In that case, the user needs to judge whether to proceed with the transaction of the target service. If the user determines that the transaction is risky, the user can terminate it, and can store the relevant information of the transaction or send that information to the business system of the target service. If the user has carried out the transaction and determines after completing it that it was a risk transaction, the user can likewise store the relevant information of the transaction or send it to the business system of the target service.

A business system generally has a risk control system based on risk data, which is implemented by continuously adjusting business rules manually. Strategy operation is an indispensable link in the risk control system and is also the main way of countering external risks. At present, strategy operation is carried out manually, and the process can be as follows. After transaction risk or risk transaction data is found, the transaction is reported or complained about to the risk control system, and the reported or complained transaction is analyzed manually in order to classify it. The features of the risk transaction data can then be extracted manually, and after an offline data assessment of the risk transaction data has been carried out manually, a corresponding risk control strategy can be configured to supplement or replace the risk prevention system rules in the risk control system. The risk prevention system rule can then be put into trial operation, and if the result meets expectations, the rule can be added to the business system of the target service to prevent and control risk transactions.

However, when risk prevention system rules are obtained manually, people take on too much repetitive, mechanical work, which costs time and wastes a large amount of human and material resources. Moreover, even counting only from the start of manual transaction data analysis, the whole process takes a long time, generally at least a week. Taking a new fraud technique or risk transaction as an example, a week is enough for it to spread, cause substantial financial loss, and possibly harm thousands of users. At present the risk prevention situation is very severe, especially in fraud scenarios. Because the amounts of money involved in fraud scenarios are very high and the profit margin of the fraud black market is very large, new fraud methods and tools emerge one after another, and the pace of attack and defense in the anti-fraud security field changes very quickly. Under the above risk control system architecture, manual analysis of fraudulent transaction data takes time, the response of the risk prevention system model takes time, and deployment of the risk control strategy also takes time, so the response to fraud and similar risks is not timely enough and a certain amount of financial loss results. Therefore, in the field of prevention and control of fraud and similar risks, a more timely and more responsive solution is needed. To this end, the embodiments of this specification provide a risk prevention system processing approach based on the latest streaming data, which can specifically include the following:

For a given service (i.e. the target service), some risk transactions may be missed or newly emerge in practice. The related data of these risk transactions is not intercepted by the business rules currently used by the target service, so they may cause loss to users' resources. To improve the security of the target service, its risk prevention system rules can be supplemented or updated in time, and to that end streaming risk transaction data related to the target service can be obtained in several ways. For example, if a user finds a risky transaction while using the target service, the user can report it or complain about it to the corresponding business system in real time, and the business system can receive the streaming data of the risk transaction at the time of the report or complaint. In practice, risk transaction data can also be obtained in other ways; for example, the operator of the target service can obtain, by purchase or exchange, the related streaming data of risky transactions that users have stored or collected. Because streaming risk transaction data relies on active feedback from users, it takes a certain amount of time to arrive. In practice, the streaming risk transaction data within a predetermined duration before the current moment can be obtained. For example, if the current time is 10 o'clock, the streaming risk transaction data for the 24 hours from 10 o'clock yesterday to the current time, or for a longer period, can be obtained.

The risk transactions reported or complained about by users can be reviewed to determine which streaming data actually carries transaction risk. Once the risk transactions have been classified in this way, the streaming risk transaction data of the target service is obtained and can be used as the black sample flow data in the streaming sample data. In addition, to make the risk prevention system rules obtained later more accurate, the streaming sample data can also include a certain amount of white sample flow data. The white sample flow data can be selected over the same duration as the black sample flow data or over a different one, and its selection duration can be longer than that of the black sample flow data. For example, streaming transaction data without transaction risk (that is, transactions that have not been complained about or reported) from before a certain number of days ago can be selected, which keeps the amount of black sample flow data mixed into the white sample flow data as small as possible.
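
The selection of black and white sample flow data described above can be sketched as follows. This is only an illustrative sketch: the `Transaction` type, the `select_samples` function, and the parameters `black_window` and `white_min_age` are hypothetical names, and the 24-hour and 5-day values are assumptions chosen to mirror the examples in the text, not values fixed by this specification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transaction:
    tx_id: str            # hypothetical transaction identifier
    timestamp: datetime   # time at which the streaming record was received
    features: list        # characteristic variables of the transaction


def select_samples(transactions, confirmed_risk_ids, now,
                   black_window=timedelta(hours=24),
                   white_min_age=timedelta(days=5)):
    """Split transactions into black and white sample flow data.

    Black: confirmed risk transactions received within `black_window`.
    White: transactions never reported as risky and at least
    `white_min_age` old, so late complaints have had time to arrive.
    """
    black, white = [], []
    for tx in transactions:
        age = now - tx.timestamp
        if tx.tx_id in confirmed_risk_ids:
            if age <= black_window:
                black.append(tx)
        elif age >= white_min_age:
            white.append(tx)
    return black, white
```

Requiring a minimum age for white samples is one possible way to reduce the chance that a not-yet-reported risk transaction is treated as a white sample; other delay mechanisms would serve the same purpose.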

In step S104, the number of times the black sample flow data and the white sample flow data are repeatedly sampled is determined.

In implementation, new risk prevention system rules can be generated from the obtained streaming risk transaction data of the target service (i.e. the black sample flow data) and the streaming transaction data without transaction risk (i.e. the white sample flow data) by a single algorithm or a combination of algorithms; the embodiments of this specification may use an online random forest algorithm. The online random forest algorithm requires sampling with replacement, or repeatable sampling, of the sample data (that is, a given sample may be drawn and used multiple times). Because streaming transaction data can only be obtained at a particular point in time, it is not possible to traverse all sample data at an arbitrary point in time. Therefore, to achieve repeatable sampling of streaming transaction data, a mechanism can be set that assigns to the black sample flow data and the white sample flow data, respectively, the number of times each is repeatedly sampled. The mechanism can be set according to the actual situation, for example random allocation; the embodiments of this specification do not limit this.
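
One possible allocation mechanism, consistent with the optional Poisson-distributed random numbers mentioned in the summary, is sketched below. The function names `poisson_draw` and `assign_repeat_counts`, the rate `lam=1.0`, and the choice of Knuth's multiplication method for generating the variate are assumptions made for illustration; the specification does not prescribe them.

```python
import math
import random


def poisson_draw(rng, lam=1.0):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def assign_repeat_counts(stream, num_trees, lam=1.0, seed=None):
    """For each arriving sample, yield (sample, counts).

    counts[t] is the number of times tree t should reuse the sample,
    which emulates sampling with replacement without storing the stream.
    """
    rng = random.Random(seed)
    for sample in stream:
        yield sample, [poisson_draw(rng, lam) for _ in range(num_trees)]
```

A count of 0 means the tree skips the sample, and a count of 2 means the sample is used twice, so each tree sees a different resampling of the same stream even though every record is visited only once.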

In step S106, a first decision-tree model is established according to the black sample flow data, the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled.

A decision-tree model can be a model that, given the known probabilities of various situations occurring, constructs a decision tree to find the probability that the expected net present value is greater than or equal to zero. It can be used to assess project risk or business risk and is a decision analysis method for judging feasibility. As shown in Fig. 2, the first decision-tree model can be a mapping between service attributes and service attribute values: each node in the decision tree represents a service attribute, each branch represents a possible value of that attribute, and each leaf node corresponds to the attribute values along the path from the root node to that leaf. The aim in building the first decision-tree model can be to find an optimal feature in a data set, then find the best candidate value among the values of that feature, split the data set into two subsets according to that candidate value, and repeat this process recursively until a specified condition is met.
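
The search for an optimal feature and candidate value can be sketched as follows, using inequality conditions of the form "feature <= threshold" and weighting each sample by its repeat count. The use of Gini impurity as the gain measure and the names `gini` and `best_split` are assumptions for illustration; the specification only requires that a gain be computed for each candidate inequality condition.

```python
def gini(black_weight, white_weight):
    """Gini impurity of a node holding the given black and white weights."""
    total = black_weight + white_weight
    if total == 0:
        return 0.0
    p = black_weight / total
    return 2.0 * p * (1.0 - p)


def best_split(samples):
    """samples: list of (features, is_black, repeat_count) tuples.

    Returns (gain, feature_index, threshold) for the best condition
    'features[feature_index] <= threshold', or None if no split is valid.
    """
    def weights(rows):
        black = sum(c for _, is_black, c in rows if is_black)
        white = sum(c for _, is_black, c in rows if not is_black)
        return black, white

    parent_b, parent_w = weights(samples)
    parent_total = parent_b + parent_w
    if parent_total == 0:
        return None
    parent_impurity = gini(parent_b, parent_w)
    best = None
    for j in range(len(samples[0][0])):
        # Every observed value of feature j is a candidate threshold.
        for threshold in sorted({f[j] for f, _, _ in samples}):
            left = [s for s in samples if s[0][j] <= threshold]
            right = [s for s in samples if s[0][j] > threshold]
            lb, lw = weights(left)
            rb, rw = weights(right)
            if lb + lw == 0 or rb + rw == 0:
                continue  # the pre-split leaves one child empty
            child = ((lb + lw) * gini(lb, lw)
                     + (rb + rw) * gini(rb, rw)) / parent_total
            gain = parent_impurity - child
            if best is None or gain > best[0]:
                best = (gain, j, threshold)
    return best
```

Applying `best_split` recursively to the two resulting subsets, until a stopping condition such as a maximum depth is reached, would yield a tree of the kind described above.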

In implementation, a decision-tree model (i.e. the first decision-tree model) can be established from the obtained black sample flow data and white sample flow data. Then, according to the number of times the black sample flow data and the white sample flow data are repeatedly sampled, multiple sets of training sample flow data can be formed and a decision-tree model established for each, yielding multiple first decision-tree models.

It should be noted that, as shown in Fig. 2, risk prevention system rules based on the above risk transaction data can be exported from the decision-tree model, and the exported risk prevention system rules may include one or more specific business rules.
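
Exporting rules from a decision tree can be sketched as collecting the conditions along each root-to-leaf path that ends in a risk leaf. The nested-dictionary tree layout, the `extract_rules` function, and the feature names used in the usage example are hypothetical and serve only to illustrate the idea.

```python
def extract_rules(node, path=()):
    """Return one rule (a list of condition strings) per leaf labelled 'black'.

    Internal nodes are dicts with 'feature', 'threshold', 'left', 'right';
    leaves are dicts with a single 'label' key.
    """
    if "label" in node:
        return [list(path)] if node["label"] == "black" else []
    name, t = node["feature"], node["threshold"]
    return (extract_rules(node["left"], path + (f"{name} <= {t}",))
            + extract_rules(node["right"], path + (f"{name} > {t}",)))


# Usage with a hypothetical two-level tree.
tree = {
    "feature": "amount", "threshold": 5000,
    "left": {"label": "white"},
    "right": {
        "feature": "account_age_days", "threshold": 7,
        "left": {"label": "black"},
        "right": {"label": "white"},
    },
}
rules = extract_rules(tree)
```

Each returned rule is a conjunction of inequality conditions, which is a form that can be deployed as a specific business rule.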

In step S108, based on the black sample flow data and/or the white sample flow data, the first decision-tree model is used to update the decision-tree model corresponding to the current risk prevention system rule of the target service.

In implementation, after the first decision-tree model has been obtained through step S106, the black sample flow data and/or the white sample flow data can be used to assess each first decision-tree model and the decision-tree model corresponding to the current risk prevention system rule of the target service. The accuracy of the decision-tree model corresponding to the current risk prevention system rule is determined, and if it meets a predetermined accuracy condition, that decision-tree model can be retained. Likewise, the accuracy of the first decision-tree model can be determined, and if it meets the predetermined accuracy condition, the first decision-tree model can be combined with the decision-tree model corresponding to the current risk prevention system rule to obtain a new decision-tree model for the target service, from which the risk prevention system rule of the target service can be determined. If the accuracy of the decision-tree model corresponding to the current risk prevention system rule does not meet the predetermined accuracy condition, the first decision-tree model can be used to replace it, and the risk prevention system rule of the target service can be determined from the decision-tree model obtained after the replacement.

The embodiments of this specification provide a risk prevention system processing method that obtains recent streaming risk transaction data of a target service as black sample flow data and obtains streaming transaction data of the target service without transaction risk as white sample flow data. The number of times the black sample flow data and the white sample flow data are repeatedly sampled is then determined, and a first decision-tree model can be established according to the black sample flow data, the white sample flow data, and the number of times each is repeatedly sampled. Finally, based on the black sample flow data and/or the white sample flow data, the first decision-tree model can be used to update the decision-tree model corresponding to the current risk prevention system rule of the target service. In this way, the first decision-tree model is generated from recently obtained black and white sample flow data and is used to update, in real time, the decision-tree model corresponding to the current risk prevention system rule of the target service. Rules for preventing and controlling fraud and similar risks can therefore be updated without manual participation and on the basis of recent data, which allows a fast response to subsequent fraud and similar risks, can greatly reduce the financial loss caused by new risks, improves the efficiency with which such rules are generated, and in turn improves the security of the target service.
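
The replacement branch of this assessment can be sketched as follows. Trees are modelled here as plain callables that map a feature list to `True` (risk) or `False`; the names `accuracy` and `update_forest` and the threshold `min_accuracy=0.9` are assumptions for illustration, since the specification leaves the evaluation condition open.

```python
def accuracy(tree_predict, eval_samples):
    """Fraction of (features, is_black) assessment samples predicted correctly."""
    if not eval_samples:
        return 0.0
    hits = sum(1 for features, is_black in eval_samples
               if tree_predict(features) == is_black)
    return hits / len(eval_samples)


def update_forest(current_trees, candidate_trees, eval_samples,
                  min_accuracy=0.9):
    """Replace current trees that fail the condition with passing candidates.

    Current trees that still meet the condition are kept unchanged, and a
    failing tree is only replaced while a passing candidate is available.
    """
    passing = [t for t in candidate_trees
               if accuracy(t, eval_samples) >= min_accuracy]
    updated = []
    for tree in current_trees:
        if accuracy(tree, eval_samples) < min_accuracy and passing:
            updated.append(passing.pop(0))
        else:
            updated.append(tree)
    return updated
```

This sketch covers only replacement; appending passing candidates to a forest whose current trees all meet the condition would correspond to the combining branch described above.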

Embodiment two

As shown in Fig. 3, this embodiment of the specification provides a risk prevention and control processing method. The method may be executed by a terminal device or a server. The terminal device may be a personal computer, or a mobile terminal device such as a mobile phone or tablet computer, and may be a terminal device used by a user. The server may be a standalone server or a server cluster composed of multiple servers, and may be the background server of a particular service or of a website (such as websites or payment applications). The method can be used to update business rules in a business system so as to improve efficiency. This embodiment is described with a server as the executing entity; the case of a terminal device can be handled according to the related content below and is not described again here.

This embodiment mainly optimizes several parts of the related technology described above, namely manual data analysis, manual iteration of the risk prevention and control model, and manual deployment of risk prevention and control rules, each of which is described in detail below. The method may specifically include the following steps:

In step S302, the characteristic variables corresponding to the streaming transaction data of the target service are obtained.

The streaming transaction data may include streaming risk transaction data, in which a transaction risk exists, and streaming transaction data without transaction risk; the streaming risk transaction data may include streaming transaction data arising from fraud. A characteristic variable may be a real-valued, single-valued function of the streaming transaction data. It may take various forms, for example a vector or a mathematical expression, which this embodiment does not limit.

In implementation, the streaming risk transaction data of the target service depends on users actively reporting it, and reporting takes a certain amount of time. In addition, streaming transaction data is often large and carries much content that is irrelevant to determining whether it is streaming risk transaction data. To reduce the processing load on the system, feature extraction can be performed on the streaming transaction data so that it is represented by characteristic variables. Therefore, in this embodiment, the characteristic variables of the corresponding streaming transaction data can be calculated whenever a transaction event occurs. To obtain enough characteristic variables, the transaction events that occurred within a certain period can be collected, for example within 3 months or half a year, so that the streaming transaction data corresponding to each transaction event is obtained. In this way, the characteristic variables corresponding to the streaming transaction data within a predetermined period can be obtained.

In step S304, the characteristic variables corresponding to streaming transaction data without transaction risk are obtained from the above characteristic variables as white sample flow data.

In implementation, a sample variable pool can be built from the characteristic variables obtained above. Because the characteristic variables are derived from individual transaction events, and those events include both events with transaction risk and events without it, the characteristic variables likewise include those corresponding to streaming risk transaction data and those corresponding to streaming transaction data without transaction risk. The characteristic variables corresponding to streaming transaction data without transaction risk can therefore be extracted from the sample variable pool and used as white sample flow data.

A user who reports streaming risk transaction data is likely to have a clearer understanding of what transaction risk the data involves and what consequences the transaction may produce. Therefore, each time a user reports streaming risk transaction data, the user can be asked to create a label for the uploaded data, and the label can record the risk attribute of the data (such as payment fraud or telecom fraud). That is, the streaming risk transaction data (the black sample flow data below) includes a label indicating that the streaming transaction data is streaming risk transaction data. The streaming risk transaction data can be processed as described in step S306 below.

In step S306, the recent streaming risk transaction data of the target service is obtained, the information in the label of the streaming risk transaction data is matched against the above characteristic variables, and the matched characteristic variables are used as black sample flow data.

In implementation, after a user uploads streaming risk transaction data, the data can be reviewed in order to qualify it. If the review determines that the data does in fact involve a transaction risk, the label in the streaming risk transaction data can be extracted and the information in the label analyzed to obtain an analysis result. The label information is then matched in real time against the sample variable pool built from the above characteristic variables, and the matched characteristic variables in the pool can be used as black sample flow data.

It should be noted that steps S304 and S306 are described above in sequence. In practice, step S304 may be performed first and step S306 afterwards. In another embodiment of this specification, step S306 may be performed first and step S304 afterwards, or steps S304 and S306 may be performed simultaneously, which this embodiment does not limit.

In this embodiment, to obtain risk prevention and control rules that are more timely and more responsive, the rules can be obtained with an online learning algorithm. To ensure the performance of the risk prevention and control model obtained by the online learning algorithm, the black sample flow data and the white sample flow data can be processed so that the streaming sample data has a certain concentration; see step S308 below.

In step S308, delay processing is performed on the black sample flow data and the white sample flow data to determine the quantitative ratio of black sample flow data to white sample flow data.

The delay strategy may be a data control mechanism for determining whether the quantitative ratio of black sample flow data to white sample flow data meets a predetermined ratio threshold. The predetermined ratio threshold can be set according to the actual situation, for example 1:1 or 1:2.

In implementation, to ensure that black and white sample flow data are input to the risk prevention and control model in an appropriate ratio, a delay strategy can be set in advance. The delay strategy effectively controls the input ratio between black sample flow data and white sample flow data, ensuring that their quantitative ratio meets the predetermined ratio threshold.

In addition, the proportion of white sample flow data in the streaming sample data can also be ensured in another way, specifically as follows: undersampling is performed on the white sample flow data to determine the quantitative proportion of the white sample flow data.

The quantitative proportion of the white sample flow data may refer to the share of white sample flow data in the streaming sample data (which comprises white sample flow data and black sample flow data), for example 50% or 30%. Undersampling may be sampling in which the sampling frequency is less than twice the highest frequency of the signal.

Undersampling the white sample flow data in this way ensures that the concentration of the streaming sample data fed to the online learning algorithm is not too low.
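The ratio control described above can be illustrated with a minimal sketch. It assumes that undersampling here means randomly discarding surplus white samples until a target black-to-white ratio such as 1:2 is met; the function name `undersample_white` and the parameter `white_per_black` are hypothetical and do not come from the specification.

```python
import random


def undersample_white(black, white, white_per_black=2.0, seed=0):
    """Keep at most white_per_black * len(black) white samples, chosen at random.

    Assumption: a black:white ratio of 1:white_per_black is the target,
    mirroring the 1:1 or 1:2 examples given for the ratio threshold.
    """
    rng = random.Random(seed)
    limit = int(len(black) * white_per_black)
    if len(white) <= limit:
        # Already within the target ratio, nothing to discard.
        return list(white)
    return rng.sample(white, limit)


black_samples = list(range(10))
white_samples = list(range(100, 200))
kept_white = undersample_white(black_samples, white_samples, white_per_black=2.0)
```

With 10 black samples and a 1:2 target, 20 of the 100 white samples are kept, so white samples make up two thirds of the stream rather than over 90%.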

In this embodiment, the establishment of decision-tree models and the update of risk prevention and control rules can be implemented with an online learning algorithm. In practice there are many online learning algorithms; this embodiment is described using the online random forests algorithm (Online Random Forests) as an example. For implementations based on other online learning algorithms, refer to the related content below, which is not repeated here.

In step S310, first black sample flow data and first white sample flow data are selected from the black sample flow data and the white sample flow data, respectively.

In implementation, the black sample flow data can be divided into two parts: one part is used to build decision-tree models, and the other is used to assess their accuracy. The white sample flow data can be divided in the same way. To this end, the first black sample flow data can be selected from the black sample flow data, and the first white sample flow data from the white sample flow data, either by a predetermined division ratio (such as 5:1 or 7:3) or by random selection.

In practice, step S310 can be carried out in various ways. One optional implementation is given below; see Step 1 and Step 2.

Step 1: assign a random number drawn from a Poisson distribution to each black sample flow data item and each white sample flow data item.

In implementation, the online random forests algorithm requires a bagging operation on the streaming sample data (including black and white sample flow data), which is essentially sampling with replacement. However, streaming transaction data arrives only at particular points in time, and it is not possible to traverse all sample data at an arbitrary point in time. Therefore, to realize bagging on streaming transaction data, whenever a streaming sample data item is obtained and is to be used to build decision-tree models, it is first assigned a random number drawn from a Poisson distribution; this is the Poisson random number assigned to each black and each white sample flow data item. The assigned Poisson random number is a non-negative integer.

Step 2: determine the first black sample flow data and the first white sample flow data according to the values of the Poisson random numbers.

In implementation, if the Poisson random number assigned to a streaming sample data item is 0, the item can be set aside as assessment sample data, so that indicators such as the accuracy of decision-tree models can be assessed later. If the Poisson random number is a positive integer, its value can be taken as the number of times the item is repeatedly sampled, which realizes the bagging operation on streaming transaction data. Through this process, the streaming sample data composed of black and white sample flow data is divided into two parts: assessment sample data, and streaming sample data composed of the first black sample flow data and the first white sample flow data.
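The Poisson-based division above can be sketched as follows. This is an illustration under stated assumptions rather than the specification's implementation: the rate parameter `lam=1.0` is an assumption (the text does not give one), the Poisson draw uses Knuth's multiplication method because the Python standard library has no Poisson sampler, and the names `poisson` and `split_stream` are hypothetical.

```python
import math
import random


def poisson(rng, lam=1.0):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def split_stream(samples, lam=1.0, seed=0):
    """Route each sample by its Poisson number.

    A draw of 0 sends the sample to the assessment set; a positive draw k
    keeps it for training together with its repeat-sampling count k.
    """
    rng = random.Random(seed)
    assessment, training = [], []
    for sample in samples:
        k = poisson(rng, lam)
        if k == 0:
            assessment.append(sample)
        else:
            training.append((sample, k))
    return assessment, training


stream = list(range(1000))
assessment_set, training_set = split_stream(stream)
```

With a rate of 1, a draw of 0 has probability e^-1, so a little over a third of the stream would be expected to land in the assessment set.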

It should be noted that the amount of data in the first black sample flow data may be less than or equal to the amount in the black sample flow data; likewise, the amount of data in the first white sample flow data may be less than or equal to the amount in the white sample flow data.

In step S312, the number of times the above black sample flow data and white sample flow data are repeatedly sampled is determined.

Step S312 can be carried out in various ways. One optional processing mode is given below, which may include Step 1 and Step 2.

It should be noted that, based on the above, Poisson random numbers may be assigned only to each first black sample flow data item and each first white sample flow data item.

Step 2: determine the number of times the black sample flow data and the white sample flow data are repeatedly sampled according to the values of the Poisson random numbers.

In implementation, if the Poisson random number assigned to a streaming sample data item is a positive integer, its value can be taken as the number of times that item is repeatedly sampled. This realizes the bagging operation on streaming transaction data and yields the repeat-sampling counts of the first black sample flow data and the first white sample flow data.
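One way to read "repeatedly sampled k times" is that a sample with count k is presented to the tree builder k times. The short sketch below shows that reading; the generator name `replay` and the `(sample, count)` pair layout are hypothetical.

```python
def replay(weighted_samples):
    """Yield each sample as many times as its repeat-sampling count."""
    for sample, count in weighted_samples:
        for _ in range(count):
            yield sample


training_stream = list(replay([("a", 2), ("b", 1), ("c", 3)]))
```

An equivalent design would pass the count to the learner as a sample weight instead of physically repeating the sample.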

In step S314, the first decision-tree model is established according to the first black sample flow data, the first white sample flow data, and the number of times each is repeatedly sampled.

In implementation, as described in Embodiment One, streaming risk transaction data related to the target service can be obtained in several ways, for example reported actively by users while using the target service, or obtained from users by purchase or exchange. The obtained streaming risk transaction data can be reviewed to identify the streaming transaction data that actually involves transaction risk. Once qualified in this way, the streaming risk transaction data of the target service can be used as black sample flow data. The first decision-tree model can then be established by an online learning algorithm according to the first black sample flow data, the first white sample flow data, and their repeat-sampling counts.

Step S314 can be carried out in various ways. One optional processing mode is given below; see Step 1 to Step 5.

Step 1: create a decision-tree model, and create inequality conditions at the nodes to be split in the decision-tree model.

In implementation, as shown in Fig. 2, a new decision-tree model can be created. The model may contain one or more nodes to be split, and a group of inequality conditions can be created at random at each node to be split.

Step 2: pre-split the first black sample flow data and the first white sample flow data into child nodes according to the inequality conditions.

In implementation, the decision-tree model created above may contain one or more child nodes. The possible child nodes in the model can be pre-split, using the inequality conditions created above as the basis for the split.
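Creating random inequality conditions and pre-splitting with them can be sketched as below. The sketch assumes that each condition has the form "feature value greater than threshold" and that features are scaled to the range 0 to 1; neither detail is stated in the specification, and the names `random_tests` and `pre_split` are hypothetical.

```python
import random


def random_tests(n_features, n_tests, seed=0):
    """Create n_tests random inequality conditions as (feature index, threshold) pairs."""
    rng = random.Random(seed)
    return [(rng.randrange(n_features), rng.random()) for _ in range(n_tests)]


def pre_split(samples, test):
    """Partition (features, label) pairs: left if x[f] <= threshold, right otherwise."""
    feature, threshold = test
    left = [(x, y) for x, y in samples if x[feature] <= threshold]
    right = [(x, y) for x, y in samples if x[feature] > threshold]
    return left, right


# Label 1 marks a black sample, label 0 a white sample.
node_samples = [((0.1, 0.9), 1), ((0.8, 0.2), 0), ((0.4, 0.5), 0)]
left_child, right_child = pre_split(node_samples, (0, 0.5))
candidate_tests = random_tests(n_features=2, n_tests=5)
```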

Step 3: calculate the gain of the inequality condition corresponding to the pre-split of each child node.

In implementation, the gain of the inequality condition corresponding to the pre-split of each child node can be calculated by the following formula (1)

where R_j denotes the j-th node to be split; s denotes the s-th inequality condition; R_jls denotes the left child node obtained by pre-splitting the j-th node to be split with the s-th inequality condition; |·| denotes the number of streaming sample data items in a node; and R_jrs, analogously to R_jls, denotes the right child node obtained by pre-splitting the j-th node to be split with the s-th inequality condition. In addition,

In practice, L(R_j) can also be obtained by other calculation formulas. L(R_j) mainly measures the purity of the first black sample flow data and the first white sample flow data in the streaming sample data.
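Since formula (1) is not reproduced in this text, the sketch below shows only one gain computation that is consistent with the symbols defined above. It assumes the gain is the parent impurity minus the size-weighted impurities of the left and right child nodes, and it assumes Gini impurity for L; both are assumptions, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = black sample, 0 = white sample)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_black = sum(labels) / n
    return 1.0 - p_black ** 2 - (1.0 - p_black) ** 2


def gain(parent, left, right):
    """Assumed gain: L(R_j) - |R_jls|/|R_j| * L(R_jls) - |R_jrs|/|R_j| * L(R_jrs)."""
    n = len(parent)
    if n == 0:
        return 0.0
    return (gini(parent)
            - len(left) / n * gini(left)
            - len(right) / n * gini(right))


parent_labels = [1, 1, 0, 0]
perfect_gain = gain(parent_labels, [1, 1], [0, 0])
useless_gain = gain(parent_labels, [1, 0], [1, 0])
```

A condition that separates black from white samples completely recovers the full parent impurity as gain, while a condition that leaves both children as mixed as the parent yields a gain of zero.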

Step 4: split the node to be split into child nodes according to the gain and the corresponding inequality condition, obtaining the child nodes after the split.

In implementation, after the gains are calculated as above, if the number of streaming sample data items received by the node to be split has reached a predetermined threshold (that is, there are enough samples) and there is an inequality condition whose gain exceeds a certain threshold, the inequality condition with the largest gain can be used to split the node to be split, yielding the child nodes after the split. The same algorithm can then continue iterating over the streaming sample data to obtain further child nodes, as shown in Fig. 2.
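The split decision described above combines two thresholds. A minimal sketch, in which the default values `min_samples=50` and `min_gain=0.1` and the function name `choose_split` are illustrative assumptions:

```python
def choose_split(n_samples, gains, min_samples=50, min_gain=0.1):
    """Return the index of the inequality condition to split on, or None.

    The node is split only if it has received at least min_samples samples
    and the largest gain exceeds min_gain.
    """
    if n_samples < min_samples or not gains:
        return None
    best = max(range(len(gains)), key=gains.__getitem__)
    return best if gains[best] > min_gain else None


decision = choose_split(100, [0.05, 0.30, 0.20])
```

Here the node has seen enough samples and the second condition has the largest gain, so `decision` is index 1; with too few samples or uniformly small gains the node keeps waiting.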

Step 5: establish the first decision-tree model based on the child nodes after the split and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.

In implementation, for each child node produced by a split, the purity of the first black sample flow data and the first white sample flow data in it can be assessed. If the number of streaming sample data items and the purity of the first black and first white sample flow data reach predetermined requirements, the child node becomes a leaf node and splitting stops there. As shown in Fig. 2, when a decision-tree model has no remaining nodes to be split, or the number of streaming sample data items reaches a certain amount, construction of the decision-tree model is complete and the first decision-tree model is obtained. Multiple training sample streams can then be composed according to the repeat-sampling counts of the first black and first white sample flow data, and a first decision-tree model established for each, yielding multiple first decision-tree models.

In step S316, part of the data is selected from the black sample flow data and/or the white sample flow data as assessment sample data.

For the specific processing of step S316, refer to the related content in step S310 above, which is not repeated here.

In step S318, the first decision-tree model and the decision-tree model corresponding to the current risk prevention and control rule of the target service are each assessed according to the assessment sample data.

In implementation, a decision-tree model can be obtained through steps S302 to S312 above, and more and more new decision-tree models are generated over time. The newly generated models and the existing models together make up a random forest model, which may contain a large number of decision-tree models, for example 100 or 500. Because the forest contains old decision-tree models, generating new ones inevitably involves a turnover between new and old models. In this embodiment, the assessment sample data determined above (that is, assessment sample data from the most recent period) can be used to assess the accuracy of both the existing and the newly generated decision-tree models, yielding an accuracy assessment value for each.
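The accuracy assessment can be sketched as follows, treating each decision-tree model as a callable that maps a feature tuple to a label (1 for black, 0 for white). That representation and the name `accuracy` are assumptions made for illustration.

```python
def accuracy(model, assessment_samples):
    """Fraction of assessment samples whose label the model predicts correctly."""
    if not assessment_samples:
        return 0.0
    hits = sum(1 for features, label in assessment_samples if model(features) == label)
    return hits / len(assessment_samples)


def toy_tree(features):
    # Stand-in for a trained tree: a single inequality condition.
    return 1 if features[0] > 0.5 else 0


recent_assessment = [((0.9,), 1), ((0.1,), 0), ((0.7,), 0), ((0.2,), 0)]
toy_accuracy = accuracy(toy_tree, recent_assessment)
```

The toy tree misclassifies only the third sample, so `toy_accuracy` is 0.75.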

In step S320, if the first decision-tree model meets a predetermined assessment condition and the decision-tree models corresponding to the current risk prevention and control rule of the target service include a second decision-tree model that does not meet the predetermined assessment condition, the first decision-tree model replaces the second decision-tree model.

The assessment condition can be set according to the actual situation. For example, a threshold can be set and the assessment condition can be that the threshold is exceeded, which this embodiment does not limit.

In implementation, the replacement of existing decision-tree models by newly generated ones can take into account both the age of each existing decision-tree model (in general, the longer a decision-tree model has existed, the less accurate the risk prevention and control rule it ultimately produces) and the accuracy assessment values obtained above; other relevant elements and/or random factors can also be added according to the actual situation. Replacing existing decision-tree models with newly generated ones updates the current risk prevention and control rule, which corresponds to the existing models, to the first risk prevention and control rule, which corresponds to the newly generated models.
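A simplified sketch of the tree replacement follows. It considers only the accuracy assessment values and leaves out tree age and random factors, which the text says may also be taken into account. The `(tree_id, accuracy)` pair layout, the threshold value, and the name `replace_weak_trees` are hypothetical.

```python
def replace_weak_trees(current, candidates, threshold=0.8):
    """Swap failing current trees for qualifying new trees.

    current and candidates are lists of (tree_id, accuracy) pairs. A tree
    meets the assessment condition when its accuracy exceeds threshold.
    The weakest current trees are replaced first, best candidates first.
    """
    qualified = sorted((c for c in candidates if c[1] > threshold),
                       key=lambda c: c[1], reverse=True)
    result = list(current)
    weak = sorted((i for i, tree in enumerate(result) if tree[1] <= threshold),
                  key=lambda i: result[i][1])
    for index, new_tree in zip(weak, qualified):
        result[index] = new_tree
    return result


forest = [("old_a", 0.90), ("old_b", 0.60), ("old_c", 0.70)]
new_trees = [("new_x", 0.85), ("new_y", 0.50)]
updated_forest = replace_weak_trees(forest, new_trees)
```

Only `new_x` qualifies, and it takes the place of `old_b`, the weakest failing tree; `old_c` stays because no further qualifying candidate is available.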

Replacing existing decision-tree models with newly generated ones yields a new random forest model. The new random forest model and the previous one can each be assessed as a whole on the assessment sample data for accuracy, AUC (area under the curve), and similar indicators. If the improvement in these indicators exceeds a predetermined threshold, the new random forest model can replace the previous one. After replacement, grayscale scoring can be performed on online data. If the distribution of the grayscale scoring results is normal, the fraud risk prevention and control model of the business system of the target service can be switched entirely to the new random forest model; if the distribution is abnormal, the random forest model is rolled back.
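The forest-level comparison can be sketched with a direct pairwise AUC computation. The improvement threshold `min_lift=0.01` and the function names are illustrative assumptions, and the grayscale scoring and rollback steps are not modeled.

```python
def auc(scores, labels):
    """AUC as the probability that a black sample outscores a white one (ties count half)."""
    black = [s for s, y in zip(scores, labels) if y == 1]
    white = [s for s, y in zip(scores, labels) if y == 0]
    if not black or not white:
        raise ValueError("AUC needs at least one black and one white sample")
    wins = sum(1.0 if b > w else 0.5 if b == w else 0.0
               for b in black for w in white)
    return wins / (len(black) * len(white))


def should_switch(old_auc, new_auc, min_lift=0.01):
    """Switch to the new forest only if the AUC improvement exceeds min_lift."""
    return new_auc - old_auc > min_lift


labels = [1, 1, 0, 0]
old_forest_auc = auc([0.9, 0.2, 0.3, 0.1], labels)
new_forest_auc = auc([0.9, 0.8, 0.3, 0.1], labels)
switch = should_switch(old_forest_auc, new_forest_auc)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small assessment set; a rank-based computation would be preferable for large ones.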

Regarding the scoring process above, fraud prevention and control is ultimately expressed as individual risk prevention and control rules, such as "fail the transaction if the model score is above a given value". Risk prevention and control rules need to be stable, because adjusting them is time-consuming and laborious (for example, a rule may always remain "fail the transaction if the model score is above 0.6"), yet the online random forests algorithm changes in real time online, so a standardized scoring operation is needed. To solve the standardized scoring problem, a threshold is first chosen on the assessment sample data according to a target indicator (including but not limited to accuracy), and the parameters required for normalization are then computed so that this threshold is mapped to 0.6 or another value, which keeps the risk prevention and control rules stable.
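The specification does not say which normalization is used. One possibility, shown here purely as an assumption, is a piecewise-linear map on raw scores in the range 0 to 1 that sends 0 to 0, the chosen raw threshold to 0.6, and 1 to 1. The name `normalize_score` is hypothetical.

```python
def normalize_score(raw, raw_threshold, target=0.6):
    """Map a raw score in [0, 1] so that raw_threshold lands exactly on target.

    Assumed piecewise-linear form: [0, raw_threshold] is stretched onto
    [0, target] and [raw_threshold, 1] onto [target, 1].
    """
    if not 0.0 < raw_threshold < 1.0:
        raise ValueError("raw_threshold must lie strictly between 0 and 1")
    if raw <= raw_threshold:
        return raw / raw_threshold * target
    return target + (raw - raw_threshold) / (1.0 - raw_threshold) * (1.0 - target)


# Suppose the current forest's chosen raw threshold is 0.35.
at_threshold = normalize_score(0.35, 0.35)
below = normalize_score(0.20, 0.35)
above = normalize_score(0.50, 0.35)
```

Because the map is monotonic, a rule of the form "fail the transaction if the normalized score is above 0.6" keeps the same meaning even when the raw threshold changes after a forest update.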

In step S322, the current risk prevention and control rule of the target service is updated based on the updated decision-tree model.

Embodiment three

The above is the risk prevention and control processing method provided by this embodiment of the specification. Based on the same idea, this embodiment also provides a risk prevention and control processing device, as shown in Fig. 4.

The risk prevention and control processing device includes a sample acquisition module 401, a sampling number determining module 402, a model building module 403, and a model modification module 404, wherein:

the sample acquisition module 401 is configured to obtain the recent streaming risk transaction data of the target service as black sample flow data, and to obtain the streaming transaction data of the target service without transaction risk as white sample flow data;

the sampling number determining module 402 is configured to determine the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

the model building module 403 is configured to establish the first decision-tree model according to the black sample flow data, the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

the model modification module 404 is configured to update, based on the black sample flow data and/or the white sample flow data, the decision-tree model corresponding to the current risk prevention and control rule of the target service using the first decision-tree model.

In this specification embodiment, described device further includes：

In this specification embodiment, the black sample flow data includes being used to indicate streaming transaction data as streaming risk The label of transaction data,

The sample acquisition module 401, including：

In this specification embodiment, described device further includes：

In this specification embodiment, the sampling number determining module 402, including：

In this specification embodiment, the model building module 403, including：

In this specification embodiment, the model foundation unit is used for：

In this specification embodiment, the model modification module 404, including：

This embodiment of the specification provides a risk prevention and control processing device. Recent streaming risk transaction data of a target service is obtained as black sample flow data, and streaming transaction data of the target service without transaction risk is obtained as white sample flow data. The number of times the black sample flow data and the white sample flow data are repeatedly sampled is then determined, and a first decision-tree model is established from the black sample flow data, the white sample flow data, and those repeat-sampling counts. Finally, based on the black sample flow data and/or the white sample flow data, the first decision-tree model is used to update the decision-tree model corresponding to the current risk prevention and control rule of the target service. Because the first decision-tree model is generated from recently obtained black and white sample flow data and is used to update the current model in real time, rules for risks such as fraud can be updated without manual participation. Updating the rules from recent data allows a quick response to subsequent fraud and similar risks, can substantially reduce the financial loss caused by new risks, improves the efficiency of generating risk prevention and control rules, and thereby improves the security of the target service.

Embodiment five

The above is the risk prevention and control processing device provided by this embodiment of the specification. Based on the same idea, this embodiment also provides risk prevention and control processing equipment, as shown in Fig. 5.

The risk prevention and control processing equipment may be the server or terminal device provided in the above embodiments.

Risk prevention and control processing equipment may vary considerably depending on configuration or performance. It may include one or more processors 501 and a memory 502, and the memory 502 may store one or more application programs or data. The memory 502 may provide transient or persistent storage. An application program stored in the memory 502 may include one or more modules (not shown), and each module may include a series of computer-executable instructions for the risk prevention and control processing equipment. Further, the processor 501 may be configured to communicate with the memory 502 and to execute, on the risk prevention and control processing equipment, the series of computer-executable instructions in the memory 502. The equipment may also include one or more power supplies 503, one or more wired or wireless network interfaces 504, one or more input/output interfaces 505, and one or more keyboards 506.

Specifically, in this embodiment the risk prevention and control processing equipment includes a memory and one or more programs. The one or more programs are stored in the memory and may include one or more modules, each of which may include a series of computer-executable instructions for the risk prevention and control processing equipment. The one or more programs are configured to be executed by one or more processors and include computer-executable instructions for the following:

Optionally, after the decision-tree model corresponding to the current risk prevention and control rule of the target service is updated using the first decision-tree model, the following is further included:

Optionally, further include：

This specification embodiment provides a kind of risk prevention system processing equipment, the streaming risk recent by obtaining target service Transaction data as black sample flow data, and obtain target service no deal risk streaming transaction data as white sample flow Data, then, it is determined that the number that black sample flow data and white sample flow data are repetitively sampled, it can be according to black sample flow data It is repeated the number of sampling with white sample flow data and black sample flow data and white sample flow data, establishes the first decision tree Model finally can be based on black sample flow data and/or white sample flow data, using the first decision-tree model to target service The corresponding decision-tree model of current risk prevention system rule is updated, in this way, by the recent black sample flow data of acquisition and White sample flow data, generates the first decision-tree model, and pass through the first decision-tree model risk prevention system current to target service The corresponding decision-tree model of rule carries out real-time update, without manually participating in that fraud equivalent risk prevention and control rule can be completed Update, and update to risk prevention system rule is completed by Recent data, realization is for subsequently cheating the quick of equivalent risk Reply can be greatly decreased the money damage that new risk is brought, improve the formation efficiency of fraud equivalent risk prevention and control rule, and then improve The safety of target service.

Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the accompanying drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

In the 1990s, an improvement to a technology could be clearly distinguished as either an improvement in hardware (for example, an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement to a method flow). With the development of technology, however, improvements to many of today's method flows can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an integrated circuit whose logic function is determined by the user's programming of the device. Designers program the device themselves to "integrate" a digital system onto a single PLD, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Moreover, instead of manually making integrated circuit chips, this programming is nowadays mostly done with "logic compiler" software, which is similar to the software compiler used in program development, and the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used. Those skilled in the art should also understand that a hardware circuit implementing a logical method flow can be readily obtained merely by logic-programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.

A controller can be implemented in any suitable manner. For example, a controller can take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller can also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, the method steps can be logic-programmed so that the controller realizes the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller can therefore be regarded as a hardware component, and the devices included in it for realizing various functions can also be regarded as structures within the hardware component. Alternatively, the devices for realizing various functions can even be regarded both as software modules implementing the method and as structures within the hardware component.

The systems, devices, modules, or units described in the above embodiments can be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above device is described by dividing it into various units according to function. Of course, when one or more embodiments of this specification are implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.

Those skilled in the art should understand that the embodiments of this specification can be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of this specification can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.

The embodiments of this specification are described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the embodiments of this specification. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

These computer program instructions can also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

These computer program instructions can also be loaded onto a computer or another programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.

Memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.

Those skilled in the art will understand that the embodiments of this specification can be provided as a method, a system, or a computer program product. Accordingly, one or more embodiments of this specification can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, one or more embodiments of this specification can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.

One or more embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. One or more embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and the relevant parts of the description of the method embodiment may be consulted.

The foregoing is merely embodiments of this specification and is not intended to limit this specification. For those skilled in the art, this specification may have various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principle of this specification shall fall within the scope of the claims of this specification.

Claims

1. A risk prevention system processing method, the method comprising:

Obtaining recent streaming risk transaction data of a target service as black sample flow data, and obtaining streaming transaction data of the target service that carries no transaction risk as white sample flow data;

Determining the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

Establishing a first decision-tree model according to the black sample flow data and the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

Based on the black sample flow data and/or the white sample flow data, updating, using the first decision-tree model, the decision-tree model corresponding to the current risk prevention system rule of the target service.

2. The method according to claim 1, wherein after the decision-tree model corresponding to the current risk prevention system rule of the target service is updated using the first decision-tree model, the method further comprises:

3. The method according to claim 1, wherein the black sample flow data includes a label indicating that streaming transaction data is streaming risk transaction data,

The obtaining recent streaming risk transaction data of a target service as black sample flow data, and obtaining streaming transaction data of the target service that carries no transaction risk as white sample flow data, comprises:

Obtaining, from the characteristic variables, the characteristic variables corresponding to the streaming transaction data that carries no transaction risk as the white sample flow data;

Matching the information of the label with the characteristic variables, and using the matched characteristic variables as the black sample flow data.

4. The method according to claim 1, the method further comprising:

Performing delay processing on the black sample flow data and the white sample flow data, to determine the quantitative proportion of the black sample flow data and the white sample flow data.

5. The method according to claim 4, the method further comprising:

6. The method according to claim 1, wherein the determining the number of times the black sample flow data and the white sample flow data are repeatedly sampled comprises:

Determining, according to the numerical value of the random number of the Poisson distribution, the number of times the black sample flow data and the white sample flow data are repeatedly sampled.

7. The method according to claim 6, wherein the establishing a first decision-tree model according to the black sample flow data and the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled, comprises:

Selecting first black sample flow data and first white sample flow data from the black sample flow data and the white sample flow data, respectively;

Establishing the first decision-tree model according to the first black sample flow data and the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.

8. The method according to claim 7, wherein the establishing the first decision-tree model according to the first black sample flow data and the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled, comprises:

Performing, according to the inequality condition, pre-splitting of child nodes on the first black sample flow data and the first white sample flow data;

Splitting the node to be split into child nodes according to the gain and the corresponding inequality condition, to obtain the child nodes after splitting;

Establishing the first decision-tree model based on the child nodes after splitting and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.

9. The method according to claim 1, wherein the updating, based on the black sample flow data and/or the white sample flow data and using the first decision-tree model, the decision-tree model corresponding to the current risk prevention system rule of the target service comprises:

Selecting part of the data from the black sample flow data and/or the white sample flow data, respectively, as assessment sample data;

Assessing, according to the assessment sample data, the first decision-tree model and the decision-tree model corresponding to the current risk prevention system rule of the target service, respectively;

If the first decision-tree model meets the scheduled evaluation condition, and the decision-tree model corresponding to the current risk prevention system rule of the target service contains a second decision-tree model that does not meet the scheduled evaluation condition, replacing the second decision-tree model with the first decision-tree model.

10. A risk prevention system processing device, the device comprising:

A sample acquisition module, configured to obtain recent streaming risk transaction data of a target service as black sample flow data, and to obtain streaming transaction data of the target service that carries no transaction risk as white sample flow data;

A sampling number determining module, configured to determine the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

A model building module, configured to establish a first decision-tree model according to the black sample flow data and the white sample flow data, and the number of times the black sample flow data and the white sample flow data are repeatedly sampled;

A model modification module, configured to update, based on the black sample flow data and/or the white sample flow data and using the first decision-tree model, the decision-tree model corresponding to the current risk prevention system rule of the target service.

11. The device according to claim 10, the device further comprising:

A rule update module, configured to update the current risk prevention system rule of the target service based on the updated decision-tree model.

12. The device according to claim 10, wherein the black sample flow data includes a label indicating that streaming transaction data is streaming risk transaction data,

The sample acquisition module comprises:

A first selection unit, configured to obtain, from the characteristic variables, the characteristic variables corresponding to the streaming transaction data that carries no transaction risk as the white sample flow data;

A matching unit, configured to match the information of the label with the characteristic variables, and to use the matched characteristic variables as the black sample flow data.

13. The device according to claim 10, the device further comprising:

A time delay module, configured to perform delay processing on the black sample flow data and the white sample flow data, to determine the quantitative proportion of the black sample flow data and the white sample flow data.

14. The device according to claim 13, the device further comprising:

An undersampling module, configured to perform undersampling on the white sample flow data, to determine the quantitative proportion of the white sample flow data.

15. The device according to claim 10, wherein the sampling number determining module comprises:

An allocation unit, configured to allocate a Poisson-distributed random number to each black sample flow data and each white sample flow data, respectively;

A sampling number determination unit, configured to determine, according to the numerical value of the random number of the Poisson distribution, the number of times the black sample flow data and the white sample flow data are repeatedly sampled.

16. The device according to claim 15, wherein the model building module comprises:

A second selection unit, configured to select first black sample flow data and first white sample flow data from the black sample flow data and the white sample flow data, respectively;

A model foundation unit, configured to establish the first decision-tree model according to the first black sample flow data and the first white sample flow data, and the number of times the first black sample flow data and the first white sample flow data are repeatedly sampled.

17. The device according to claim 16, wherein the model foundation unit is configured to:

18. The device according to claim 10, wherein the model modification module comprises:

A third selection unit, configured to select part of the data from the black sample flow data and/or the white sample flow data, respectively, as assessment sample data;

An assessment unit, configured to assess, according to the assessment sample data, the first decision-tree model and the decision-tree model corresponding to the current risk prevention system rule of the target service, respectively;

A model modification unit, configured to replace a second decision-tree model with the first decision-tree model if the first decision-tree model meets the scheduled evaluation condition and the decision-tree model corresponding to the current risk prevention system rule of the target service contains the second decision-tree model, which does not meet the scheduled evaluation condition.

19. A risk prevention system processing equipment, the risk prevention system processing equipment comprising:

A processor; and

A memory arranged to store computer-executable instructions, the executable instructions, when executed, causing the processor to: