Detailed Description of Embodiments
Example embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings indicate the same or similar elements unless otherwise indicated. The implementations described in the following example embodiments do not represent all implementations consistent with this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this specification, as detailed in the appended claims.
The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this specification. The singular forms "a", "said" and "the" used in this specification and in the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
An embodiment of the conflict rule generation method of this specification is introduced below with reference to the example shown in Fig. 1. The method may be applied to a device that generates conflict rules (such as a server), and may include the following steps:
Step 110: Obtain the variables and variable values in risk samples.
In one embodiment, the risk samples may include black samples. In general, a sample that has been determined to carry risk is called a black sample.
In one embodiment, to make risk coverage as comprehensive as possible, the risk samples may include both black samples and white samples. In general, a sample that has been determined to carry risk is called a black sample, and a sample that has been determined to carry no risk is called a white sample.
In one embodiment, step 110 of obtaining the variables and variable values in the risk samples may specifically include:
Obtaining the variables and variable values in the risk samples from the most recent preset duration.
In practical applications, to improve the timeliness of the generated conflict rules, the risk samples identified within the most recent preset duration may be obtained. Risk samples from the most recent preset duration are closest to the risk trend of the current and upcoming period, and also reflect the hot-spot risk events of that period, so the conflict rules ultimately generated from these samples are more timely. The preset duration may be a manually preset empirical value, usually set according to actual business needs. For example, it may be set to 7 days, in which case the risk samples from the most recent 7 days are obtained.
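The time-window selection described above can be sketched as follows. This is a minimal illustration, not the method's actual implementation; the field name `identified_at`, the sample layout and the function name are assumptions made for the example.

```python
from datetime import datetime, timedelta

def recent_risk_samples(samples, now, window_days=7):
    """Keep only the samples identified within the last `window_days` days."""
    cutoff = now - timedelta(days=window_days)
    return [s for s in samples if cutoff <= s["identified_at"] <= now]

# Hypothetical samples; only the identification time matters here.
samples = [
    {"id": 1, "identified_at": datetime(2024, 1, 9)},
    {"id": 2, "identified_at": datetime(2024, 1, 1)},
    {"id": 3, "identified_at": datetime(2024, 1, 4)},
]
recent = recent_risk_samples(samples, now=datetime(2024, 1, 10))
```

With a 7-day window ending on 10 January, the sample from 1 January falls outside the window and is dropped.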
Step 120: Randomly combine the variables to obtain combined variables.
In one implementation, before step 120, the method further includes:
Discretizing the variable values in the risk samples.
The discretization may include splitting a categorical variable by category. For example, suppose there is a variable named positioning province whose value is the name of one of 34 province-level regions, such as Zhejiang or Hebei. 34 variables are generated from it, named positioning province-Zhejiang, positioning province-Hebei, and so on. Each generated variable takes the value 0 or 1. For example, if the original variable value is Zhejiang, the variable positioning province-Zhejiang takes the value 1 and the other 33 variables take the value 0.
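The category split above amounts to what is commonly called one-hot encoding. A minimal sketch follows; the function name is hypothetical and the province list is truncated to three entries for brevity, whereas the text assumes 34.

```python
def one_hot(name, value, categories):
    """Expand one categorical variable into a 0/1 variable per category."""
    return {f"{name}-{c}": int(value == c) for c in categories}

# Truncated category list, for illustration only.
provinces = ["Zhejiang", "Hebei", "Hainan"]
dummies = one_hot("positioning province", "Zhejiang", provinces)
```

Exactly one of the generated variables is 1 for any given sample, and the rest are 0.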
The discretization may also include binning a continuous variable. As with splitting a categorical variable by category, the continuous variable is first divided into bins. For example, an amount variable may be divided into 0-100, 100-500, 500-10000, and above 10000, which generates 4 variables. The value of the original amount variable in a sample determines which of the 4 new variables takes the value 1; the rest take the value 0.
In one embodiment, the random combination includes, but is not limited to, pairwise combination.
This specification proposes a data fragmentation (Dummy Conjunction) scheme: the variable values in the risk samples are first discretized, and the individual variables in the risk samples are then randomly combined. The resulting combined variables include both combined variables in which a conflict exists and combined variables in which no conflict exists, which maximizes the coverage of conflict points. In general, the larger the number of combined variables after combination, the higher the coverage of conflicting variables, and the more comprehensive the conflict rules that are ultimately generated.
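As an illustration of pairwise combination, the sketch below pairs every two variables of one sample. It enumerates all pairs rather than drawing a random subset, which is a simplification; the "and" marker follows the example used later in this description, and the variable names and function name are assumptions.

```python
from itertools import combinations

def pairwise_combine(sample):
    """Pair up every two variables of a sample.

    Keys are combined variables; values are the combined variable values.
    """
    combined = {}
    for (a, value_a), (b, value_b) in combinations(sample.items(), 2):
        combined[f"{a} and {b}"] = f"{a}-{value_a} and {b}-{value_b}"
    return combined

# Hypothetical sample with three single variables.
sample = {
    "positioning province": "Zhejiang",
    "mobile number resolved location": "Hebei",
    "wifi resolved location": "Hainan",
}
combined = pairwise_combine(sample)
```

Three single variables yield three combined variables; with n single variables the count grows as n(n-1)/2, which is why a later screening step is needed.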
In one embodiment, the risk samples may be output by an online risk control model. As mentioned above, the risk control model may run on conflict rules. To improve the recognition performance of the risk control model, in one embodiment the method may further include:
Adding the combined variables to the risk samples as new variables;
Updating the risk control model, based on a model training algorithm, according to the risk samples that contain the combined variables.
The model training algorithm may include an online model training algorithm, such as FTRL, online Random Forest, LR, or XGBoost.
It is worth noting that the risk samples used for model training generally include both black samples and white samples.
For business scenarios with high timeliness requirements, this embodiment can effectively improve the timeliness of model training through online model training, and because the online model is updated in place, the online business keeps running normally.
Of course, some application scenarios do not require a real-time model, in which case an offline model training algorithm may be used instead.
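To make the idea of online updating concrete, the sketch below updates a logistic regression one sample at a time with plain stochastic gradient descent. This is a stand-in chosen for brevity, not FTRL or any of the other algorithms named above; the class name, feature names and learning rate are all assumptions.

```python
import math

class OnlineLogisticRegression:
    """Logistic regression updated one sample at a time (plain SGD)."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate
        self.weights = {}  # feature name -> weight, created on first use

    def predict(self, features):
        z = sum(self.weights.get(n, 0.0) * x for n, x in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        # Gradient of the log loss with respect to the linear score.
        error = self.predict(features) - label
        for name, x in features.items():
            old = self.weights.get(name, 0.0)
            self.weights[name] = old - self.learning_rate * error * x

# Hypothetical stream: a combined variable flags a conflict on black samples (label 1).
black = {"bias": 1, "province_vs_mobile_conflict": 1}
white = {"bias": 1, "province_vs_mobile_conflict": 0}
model = OnlineLogisticRegression()
for _ in range(50):
    model.update(black, 1)
    model.update(white, 0)
```

Each incoming sample adjusts the weights immediately, so the model can be refreshed without taking it offline for a full retraining pass.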
Step 130: Screen out effective combined variables from the combined variables.
In general, not all of the combined variables obtained in step 120 can be used to generate conflict rules, so effective combined variables need to be screened out from the many combined variables.
Specifically, step 130 may include:
Screening out effective combined variables from the combined variables based on a variable generation algorithm.
In this specification, the variable generation algorithm can be used to screen out effective combined variables automatically.
In one implementation, the effective combined variables may be determined using an L1 regularization penalty. The L1 regularization penalty is an algorithm commonly used in the art. It drives the coefficients of some variables to 0, which means those variables contribute nothing to the model and can be understood as having been eliminated; the remaining variables are the effective variables. In general, after the user sets a screening rule, the effective combined variables can be determined automatically based on the L1 regularization penalty algorithm.
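The zeroing effect of the L1 penalty can be illustrated with a small L1-penalized logistic regression fitted by proximal gradient descent. This is only a sketch under assumed settings (penalty strength, step size, iteration count) on a toy data set; the variable names `signal` and `noise` are hypothetical, and a production system would more likely rely on an existing library.

```python
import math

def soft_threshold(value, threshold):
    """Proximal step of the L1 penalty: shrink toward 0, clip small values to 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def l1_logistic(rows, labels, names, l1=0.05, lr=1.0, steps=200):
    """Fit L1-penalized logistic regression; the bias term is not penalized."""
    weights = {n: 0.0 for n in names}
    bias = 0.0
    for _ in range(steps):
        grad = {n: 0.0 for n in names}
        grad_bias = 0.0
        for row, y in zip(rows, labels):
            z = bias + sum(weights[n] * row[n] for n in names)
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad_bias += err / len(rows)
            for n in names:
                grad[n] += err * row[n] / len(rows)
        bias -= lr * grad_bias
        for n in names:
            weights[n] = soft_threshold(weights[n] - lr * grad[n], lr * l1)
    return weights

# Toy data: "signal" matches the label exactly, "noise" is unrelated to it.
rows = [
    {"signal": 1, "noise": 1},
    {"signal": 1, "noise": 0},
    {"signal": 0, "noise": 1},
    {"signal": 0, "noise": 0},
]
labels = [1, 1, 0, 0]
weights = l1_logistic(rows, labels, ["signal", "noise"])
effective = [n for n, w in weights.items() if w != 0.0]
```

The uninformative variable ends with a coefficient of exactly 0 and is dropped, while the informative one keeps a non-zero coefficient and is retained as effective.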
In one implementation, the effective combined variables may be determined according to the importance of the combined variables; that is, combined variables of high importance are determined to be effective combined variables. Specifically, step 130 may include:
Calculating the importance of the variables in the combined variable;
When the importance reaches a threshold, determining that the combined variable is an effective combined variable.
Calculating the importance of the variables in the combined variable specifically includes:
Obtaining the IV value of each variable in the combined variable;
Accumulating the IV values of the variables to obtain the importance of the variables in the combined variable.
This embodiment may refer to the IV value (Information Value) of a variable. The IV value is a feature that reflects the importance of a variable in a model. In general, the user may set a threshold, which may be an empirical value. When the IV value of a combined variable is greater than the threshold, the combined variable is important and is determined to be an effective combined variable; conversely, when the IV value of a combined variable is not greater than the threshold, the combined variable is less important and is not determined to be an effective combined variable.
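The IV-based screening can be sketched as follows, using the usual definition of Information Value: the sum over bins of (share of black samples minus share of white samples) times the natural logarithm of their ratio. The additive smoothing that avoids division by zero, the threshold of 0.5 and all names are assumptions of this sketch, not part of the described method.

```python
import math

def information_value(values, labels, smoothing=0.5):
    """IV of one discrete variable; labels are 1 for black, 0 for white samples."""
    bins = sorted(set(values))
    black_total = sum(labels)
    white_total = len(labels) - black_total
    iv = 0.0
    for b in bins:
        black = sum(1 for v, y in zip(values, labels) if v == b and y == 1)
        white = sum(1 for v, y in zip(values, labels) if v == b and y == 0)
        # Assumption: additive smoothing keeps both shares strictly positive.
        p_black = (black + smoothing) / (black_total + smoothing * len(bins))
        p_white = (white + smoothing) / (white_total + smoothing * len(bins))
        iv += (p_black - p_white) * math.log(p_black / p_white)
    return iv

def combined_importance(columns, labels, members):
    """Importance of a combined variable: the sum of its members' IV values."""
    return sum(information_value(columns[m], labels) for m in members)

# Hypothetical discretized columns for four samples (two black, two white).
labels = [1, 1, 0, 0]
columns = {
    "positioning province-Zhejiang": [1, 1, 0, 0],
    "mobile number resolved location-Hebei": [1, 0, 1, 0],
}
THRESHOLD = 0.5  # hypothetical empirical value
importance = combined_importance(columns, labels, list(columns))
is_effective = importance > THRESHOLD
```

A variable that is distributed identically among black and white samples contributes an IV of 0, so the importance of the pair here comes entirely from the first member.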
In one implementation, when the model is trained with the online Random Forest or Random Forest algorithm, the effective combined variables may also be determined based on the variable importance parameter. The online Random Forest and Random Forest algorithms themselves output a variable importance result, and variable importance is also a feature that reflects the importance of a variable in a model, so combined variables with a large variable importance value can be determined to be effective combined variables. In general, the user may set a threshold. When the variable importance value of a combined variable is greater than the threshold, the combined variable is important and is determined to be an effective combined variable; conversely, when the variable importance value of a combined variable is not greater than the threshold, the combined variable is less important and is not determined to be an effective combined variable.
Other implementations may also use random forest variable importance, Stepwise, and so on, which are not described one by one here.
This embodiment can screen out effective combined variables from many combined variables quickly and automatically, which greatly improves the timeliness of the conflict rules.
Step 140: Screen out, from the effective combined variables, the combined variables whose variable values conflict.
Not all of the effective combined variables that were screened out contain a conflict, so the combined variables whose variable values conflict still need to be screened out from the effective combined variables.
For example, taking combined variables formed by pairwise combination, suppose the following effective combined variables exist:
A: positioning province and mobile number resolved location;
B: wifi resolved location and mobile number resolved location;
The word "and" here may be a combination marker; for example, it indicates that the combined variable is formed from positioning province and mobile number resolved location.
The variable values of these two combined variables are as follows:
1. positioning province-Zhejiang and mobile number resolved location-Hebei;
2. wifi resolved location-Hainan and mobile number resolved location-Zhejiang;
In 1, the positioning province is Zhejiang and the mobile number resolved location is Hebei. Zhejiang and Hebei are different provinces, so the variable values of the combined variable positioning province and mobile number resolved location conflict, and this combined variable is therefore a conflicting combined variable.
Similarly, in 2, the wifi resolved location is Hainan and the mobile number resolved location is Zhejiang. Hainan and Zhejiang are different provinces, so the variable values of the combined variable wifi resolved location and mobile number resolved location conflict, and this combined variable is therefore also a conflicting combined variable.
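The conflict check in this example can be sketched as below. It assumes that both members of a combined variable describe the same kind of fact (here, a province), so that differing values constitute a conflict. The third entry is a hypothetical non-conflicting case added for contrast, and the function name is an assumption.

```python
def has_conflict(combined_value):
    """Values conflict when the two members of the combined variable disagree."""
    left, right = combined_value
    return left != right

observed = {
    "positioning province and mobile number resolved location": ("Zhejiang", "Hebei"),
    "wifi resolved location and mobile number resolved location": ("Hainan", "Zhejiang"),
    # Hypothetical extra entry whose members agree.
    "positioning province and wifi resolved location": ("Zhejiang", "Zhejiang"),
}
conflicting = [name for name, value in observed.items() if has_conflict(value)]
```

Only the first two combined variables are kept as conflicting combined variables.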
Step 150: Generate conflict rules according to the conflicting combined variables.
Continuing the example above, after the conflicting combined variable positioning province and mobile number resolved location is screened out, the corresponding conflict rule can be generated: positioning province vs mobile number resolved location.
Here, "vs" may be a marker of a conflict rule. The specific conflict rule is determined by the positioning province and the mobile number resolved location. For example, the conflict rule may be that when the positioning province of an event is inconsistent with its mobile number resolved location, the event is at risk.
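One way to represent such a rule in code is as a named predicate over an event. The sketch below is illustrative only; the function name, the event layout and the tuple return shape are assumptions.

```python
def make_conflict_rule(left, right):
    """Build a rule that flags an event when the two variables disagree."""
    name = f"{left} vs {right}"

    def rule(event):
        return event[left] != event[right]

    return name, rule

name, rule = make_conflict_rule("positioning province", "mobile number resolved location")
risky_event = {
    "positioning province": "Zhejiang",
    "mobile number resolved location": "Hebei",
}
normal_event = {
    "positioning province": "Zhejiang",
    "mobile number resolved location": "Zhejiang",
}
flags = (rule(risky_event), rule(normal_event))
```

The rule flags the first event as risky and lets the second pass.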
In one embodiment, the validity of the conflict rules is verified according to the risk samples.
Specifically, the risk samples are identified using the conflict rules and the recognition rate of each conflict rule is counted. The recognition rate may be obtained as the number of black samples identified divided by the number of samples. When the recognition rate of a conflict rule reaches a threshold, the conflict rule is valid and can be applied.
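The verification step can be sketched as follows. The text gives the recognition rate as identified black samples over the sample count; this sketch assumes the denominator is the total number of samples, and the threshold of 0.4, the sample layout and all names are hypothetical.

```python
def recognition_rate(rule, samples):
    """Black samples flagged by the rule, divided by the total number of samples."""
    hits = sum(1 for s in samples if s["label"] == "black" and rule(s))
    return hits / len(samples)

def province_vs_mobile(sample):
    return sample["positioning province"] != sample["mobile number resolved location"]

samples = [
    {"label": "black", "positioning province": "Zhejiang", "mobile number resolved location": "Hebei"},
    {"label": "black", "positioning province": "Hainan", "mobile number resolved location": "Zhejiang"},
    {"label": "black", "positioning province": "Hebei", "mobile number resolved location": "Hebei"},
    {"label": "white", "positioning province": "Zhejiang", "mobile number resolved location": "Zhejiang"},
]
THRESHOLD = 0.4  # hypothetical empirical value
rate = recognition_rate(province_vs_mobile, samples)
is_valid = rate >= THRESHOLD
```

The rule flags two of the three black samples, giving a rate of 0.5 over four samples, which reaches the assumed threshold.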
In summary, the embodiments of this specification provide a conflict rule generation scheme. The variables and variable values in risk samples are obtained, and the variables are randomly combined so that the resulting combined variables include those in which a conflict exists, which maximizes the coverage of conflict points. The combined variables whose variable values conflict are then screened out. These conflicting combined variables are the real risks, so conflict rules can be generated from them. Conflict rules are thus generated automatically, and because they are generated from real risks, their validity is ensured.
Another embodiment of the conflict rule generation method of this specification is introduced below. The method may be applied to a device that generates conflict rules (such as a server), and may include the following steps:
A1: Obtain the variables and variable values in risk samples;
A2: Randomly combine variables of the same type to obtain combined variables;
A3: Screen out, from the combined variables, the combined variables whose variable values conflict;
A4: Generate conflict rules according to the conflicting combined variables.
Preferably, the method may further include:
A5: Verify the validity of the conflict rules according to the risk samples.
Unlike the above embodiment, this embodiment randomly combines variables of the same type. As a result, compared with the above embodiment, this embodiment does not need to screen out effective combined variables and can screen out the combined variables whose variable values conflict directly from the combined variables. Accordingly, the load on the server is reduced and overall processing efficiency is improved. It is worth noting that the variables of the same type may be preset, including manually: for example, variables that both represent user location information, such as mobile number resolved location and positioning province, are set as variables of the same type. In some embodiments, they may be set automatically by a computer, for example by a server using machine learning or big data analysis techniques, or they may be obtained directly from a third party.
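Restricting combination to variables of the same type can be sketched as grouping variables by a preset type label and pairing only within each group. The type labels ("location", "money"), the variable names and the function name below are assumptions for illustration.

```python
from itertools import combinations

def combine_same_type(variable_types):
    """Pair variables only with other variables that share their preset type."""
    groups = {}
    for variable, vtype in variable_types.items():
        groups.setdefault(vtype, []).append(variable)
    return [
        f"{a} and {b}"
        for members in groups.values()
        for a, b in combinations(members, 2)
    ]

# Hypothetical preset mapping from variable to type.
variable_types = {
    "positioning province": "location",
    "mobile number resolved location": "location",
    "amount": "money",
    "wifi resolved location": "location",
}
combined = combine_same_type(variable_types)
```

The three location variables yield three combined variables, and the amount variable, which has no partner of the same type, is never combined. This keeps the candidate set small enough that a separate effectiveness screening step is unnecessary.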
In this embodiment, steps A1, A3, A4 and A5 and the manner of random combination are the same as in the above embodiment. For specific details, refer to the above embodiment; they are not repeated here.
Corresponding to the foregoing conflict rule generation method embodiments, this specification also provides embodiments of a conflict rule generation apparatus. The apparatus embodiments may be implemented in software, or in hardware or a combination of software and hardware. Taking software implementation as an example, the apparatus in a logical sense is formed when the processor of the device where it resides reads the corresponding computer program instructions from non-volatile memory into memory and runs them. At the hardware level, Fig. 2 shows a hardware structure diagram of the device where the conflict rule generation apparatus of this specification resides. In addition to the processor, network interface, memory and non-volatile memory shown in Fig. 2, the device where the apparatus resides may also include other hardware, depending on the actual function of conflict rule generation, which is not described further.
Refer to Fig. 3, a module diagram of the conflict rule generation apparatus provided by an embodiment of this specification. The apparatus corresponds to the embodiment shown in Fig. 1 and includes:
An acquiring unit 210, which obtains the variables and variable values in risk samples;
A combining unit 220, which randomly combines the variables to obtain combined variables;
A first screening unit 230, which screens out effective combined variables from the combined variables;
A second screening unit 240, which screens out, from the effective combined variables, the combined variables whose variable values conflict;
A generating unit 250, which generates conflict rules according to the conflicting combined variables.
Optionally, the first screening unit 230 specifically includes:
A calculating subunit, which calculates the importance of the variables in the combined variable;
A determining subunit, which, when the importance reaches a threshold, determines that the combined variable is an effective combined variable.
Optionally, the calculating subunit specifically includes:
An obtaining subunit, which obtains the IV value of each variable in the combined variable;
An accumulating subunit, which accumulates the IV values of the variables to obtain the importance of the variables in the combined variable.
Optionally, the first screening unit 230 specifically:
Screens out effective combined variables from the combined variables based on a variable generation algorithm.
Optionally, the random combination includes pairwise combination.
Optionally, the acquiring unit 210 specifically:
Obtains the variables and variable values in the risk samples from the most recent preset duration.
Optionally, the apparatus further includes:
A verification unit, which verifies the validity of the conflict rules according to the risk samples.
Refer to Fig. 4, a module diagram of the conflict rule generation apparatus provided by an embodiment of this specification. The apparatus corresponds to the embodiment described in steps A1-A4 above and includes:
An acquiring unit 310, which obtains the variables and variable values in risk samples;
A combining unit 320, which randomly combines variables of the same type to obtain combined variables;
A screening unit 330, which screens out, from the combined variables, the combined variables whose variable values conflict;
A generating unit 340, which generates conflict rules according to the conflicting combined variables.
The systems, apparatuses, modules or units described in the above embodiments may be implemented by a computer chip or entity, or by a product with a certain function. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular phone, camera phone, smart phone, personal digital assistant, media player, navigation device, e-mail sending and receiving device, game console, tablet computer, wearable device, or a combination of any of these devices.
For the implementation of the functions and roles of each unit in the above apparatus, see the implementation of the corresponding steps in the above method; details are not repeated here.
Since the apparatus embodiments basically correspond to the method embodiments, refer to the relevant parts of the description of the method embodiments. The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this specification. Those of ordinary skill in the art can understand and implement it without creative effort.
Fig. 3 above describes the internal functional modules and structure of the conflict rule generation apparatus. Its actual execution body may be an electronic device, including:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
Obtain the variables and variable values in risk samples;
Randomly combine the variables to obtain combined variables;
Screen out effective combined variables from the combined variables;
Screen out, from the effective combined variables, the combined variables whose variable values conflict;
Generate conflict rules according to the conflicting combined variables.
Optionally, screening out effective combined variables from the combined variables specifically includes:
Calculating the importance of the variables in the combined variable;
When the importance reaches a threshold, determining that the combined variable is an effective combined variable.
Optionally, calculating the importance of the variables in the combined variable specifically includes:
Obtaining the IV value of each variable in the combined variable;
Accumulating the IV values of the variables to obtain the importance of the variables in the combined variable.
Optionally, screening out effective combined variables from the combined variables specifically includes:
Screening out effective combined variables from the combined variables based on a variable generation algorithm.
Optionally, the random combination includes pairwise combination.
Optionally, obtaining the variables and variable values in risk samples specifically includes:
Obtaining the variables and variable values in the risk samples from the most recent preset duration.
Optionally, the processor is further configured to:
Verify the validity of the conflict rules according to the risk samples.
Fig. 4 above describes the internal functional modules and structure of the conflict rule generation apparatus. Its actual execution body may be an electronic device, including:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
Obtain the variables and variable values in risk samples;
Randomly combine variables of the same type to obtain combined variables;
Screen out, from the combined variables, the combined variables whose variable values conflict;
Generate conflict rules according to the conflicting combined variables.
In the above embodiments of the electronic device, it should be understood that the processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), and so on. The general-purpose processor may be a microprocessor or any conventional processor. The aforementioned memory may be a read-only memory (ROM), random access memory (RAM), flash memory, hard disk or solid-state drive. The steps of the method disclosed in the embodiments of the present invention may be executed directly by a hardware processor, or by a combination of hardware and software modules in the processor.
The embodiments in this specification are described in a progressive manner. For the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the electronic device embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
Those skilled in the art will readily conceive of other embodiments of this specification after considering the specification and practicing the invention disclosed herein. This specification is intended to cover any variations, uses or adaptations of this specification that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed in this specification. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of this specification are indicated by the following claims.
It should be understood that this specification is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of this specification is limited only by the appended claims.