Detailed description of the embodiments
To make the purposes, technical solutions and advantages of this specification clearer, the technical solutions of this application are described clearly and completely below in conjunction with specific embodiments of this specification and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments in this specification, without creative effort, shall fall within the protection scope of this application.
The technical solutions provided by the embodiments of this application are described in detail below in conjunction with the drawings.
Fig. 1 shows a risk control rule mining process provided by an embodiment of this specification, which may specifically include the following steps:
S100: For each preset feature type, determine the feature value of each learning sample corresponding to that feature type, as a variable of the feature type.
In one or more embodiments of this specification, risk control rule mining can be used to mine rules for controlling various risks, e.g., rules for preventing fraudulent transactions, risk control rules for loan transactions, and so on. For convenience of description, the following takes as an example a risk control rule mining process used to generate risk rules for identifying money laundering. The process may then be executed by an institution or department involved in anti-money laundering, such as a financial institution, a regulatory authority or a law enforcement agency. For example, the process may be executed by a device such as a server of a bank or a terminal of an economic crime investigation agency.
For convenience of description, the remainder of this specification takes as an example a server of a bank executing the risk control rule mining process.
In addition, since money laundering is mainly carried out through services (e.g., transaction services) executed at financial institutions, the risk control rules generated by the mining process in this specification may be rules that identify money laundering according to business data, where the business data may be the data the server needs in order to execute a received service request.
To make the generated risk control rules more accurate, the server may first determine the learning samples from historical data. Then, for each preset feature type, it determines the feature value of each learning sample corresponding to that feature type, as a variable of the feature type, for use in subsequent steps. Specifically, the server may take the business data of a number of transactions executed in the past as learning samples, with the business data of each transaction serving as one independent learning sample. The business data of a transaction may include: personal information of both parties to the transaction (e.g., name, age, gender, address, contact details), the transaction amount, the Internet Protocol (IP) addresses of both parties when the transaction was executed, the country to which each IP address belongs, the administrative division to which each IP address belongs, and so on.
Further, to meet anti-money laundering requirements, financial institutions are usually required to perform an anti-money laundering review on every transaction (e.g., identifying whether it is money laundering by means of preset risk control rules) and to add the review result to the business data of the transaction. A transaction identified as money laundering usually also has to be reported to an international anti-money laundering organization, so that the information in its business data relating to the persons involved is added to a sanctions list. When determining learning samples from historical data, the server may therefore also use the review result in the business data of each transaction: business data whose review result is money laundering is taken as a positive sample, and business data whose review result is not money laundering is taken as a negative sample, as shown in Table 1.
Table 1 is a schematic of the learning samples determined by the server, as provided by an embodiment of this specification:
Sample ID | Business data | Review result
001 | Initiator: user a; Recipient: user f; IP country: US; …… | P
002 | Initiator: user b; Recipient: user e; IP country: RU; …… | P
003 | Initiator: user c; Recipient: user h; IP country: UA; …… | N
004 | Initiator: user a; Recipient: user i; IP country: CN; …… | P
005 | Initiator: user d; Recipient: user j; IP country: UK; …… | N
006 | Initiator: user e; Recipient: user k; IP country: DE; …… | N
Table 1
As can be seen from Table 1, each learning sample corresponds to the business data of one transaction, and the samples can be divided into positive samples and negative samples according to the review result. A review result of P indicates that the transaction is money laundering, and a review result of N indicates that it is not. That is, samples labeled P are positive samples and samples labeled N are negative samples. For convenience of description, the N and P notation is used in the rest of this description.
Since the business data in the positive samples contains features that can be used to identify money laundering, the server may, for each preset feature type, determine from the learning samples the feature value of each learning sample corresponding to that feature type, as a variable of the feature type, so that subsequent steps can determine the specified feature types, and the specified variables of each specified feature type, that are used to form risk control rules.
In this specification, the business data of each transaction contains multiple items, and different kinds of business data contribute differently to identifying money laundering. Each kind of business data can therefore be taken as a feature type, and each possible value of that business data as a variable of the feature type, so that subsequent steps can determine the specified feature types used to generate risk control rules (e.g., the feature types that contribute more to identifying money laundering, and the variables within each feature type that contribute more).
For example, suppose money laundering usually takes the form of a transfer initiated by a user somewhere in the United States to a user somewhere in the United Kingdom. Then the feature type "country to which the IP address belongs", and its variables "United States" and "United Kingdom", contribute strongly to identifying whether a transaction is money laundering. Or suppose money laundering is usually initiated at 4:58 p.m. local bank time. Then the feature type "transaction time", and its variable "4:58 p.m.", contribute strongly to identifying whether a transaction is money laundering.
The server may then, for each preset feature type, determine the feature value of each learning sample corresponding to that feature type, as a variable of the feature type, thereby determining the variables corresponding to each feature type.
Specifically, for each financial institution the business data of its transactions is private data, so the historical data of a financial institution usually contains only the business data of transactions it executed itself, and the business data of other financial institutions cannot be obtained. If learning samples are determined only from the institution's own historical data, the samples may be insufficiently rich and may cover money laundering inadequately, which in turn lowers the accuracy of the risk control rules generated later.
For example, suppose a device at a given IP address carries out a money laundering transaction through bank a and a normal transaction through bank b. When determining learning samples, bank a can determine that its transaction is a positive sample, and the feature of that IP address is a feature of a positive sample. Bank b, however, has no record of that IP address initiating money laundering, and may record its learning sample as a negative sample.
Therefore, in this specification, when determining the preset feature types, the server may, in addition to taking each kind of business data as a feature type, also determine learning samples (specifically, positive samples) from a preconfigured money laundering sanctions list, and take the results of matching each item of business data against the information in the sanctions list as further feature types, thereby enriching the preset feature types.
Specifically, the money laundering sanctions list may be information, published by an international anti-money laundering organization, about cases that have been determined to be money laundering, and may include the business data of each transaction determined to be money laundering (e.g., personal information of both parties, IP addresses, transaction times). On the one hand, the server may determine positive samples from the business data of the transactions included in the sanctions list. On the other hand, whether each item of business data matches the data in the sanctions list may be taken as a feature type, with each matching result as a variable.
Specifically, for each learning sample, the server may match each item of its business data against the information in the sanctions list, and take the matching result of each item as a variable of the corresponding feature type.
For example, for each learning sample, the initiator's name in the sample is matched against the names in the sanctions list, the initiator's IP address is matched against the IP addresses in the sanctions list, the initiator's date of birth is matched against the dates of birth in the sanctions list, and so on. The variable of each such feature type of the learning sample is then determined from the matching result, e.g., 0 indicates no match and 1 indicates a match.
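The matching step described above can be sketched as follows. This is a minimal illustration, not the specification's implementation: the field names (`initiator_name`, `initiator_ip`), the exact-equality matching and the in-memory sanctions list are all assumptions made for the example.

```python
from typing import Dict, Set


def match_features(sample: Dict[str, str],
                   sanctions: Dict[str, Set[str]]) -> Dict[str, int]:
    """Return one binary variable per sanctioned field: 1 = match, 0 = no match."""
    return {
        f"{field}_match": int(sample.get(field) in listed)
        for field, listed in sanctions.items()
    }


# Hypothetical sanctions list, keyed by business-data field.
sanctions = {
    "initiator_name": {"user a", "user b"},
    "initiator_ip": {"203.0.113.7"},
}
# Hypothetical learning sample (business data of one transaction).
sample = {"initiator_name": "user a", "initiator_ip": "198.51.100.2"}

features = match_features(sample, sanctions)
```

Each resulting key is a feature type of the "matching result" kind, and its 0/1 value is the variable of that feature type for this sample.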
S102: By means of a genetic algorithm, select at least some of the feature types as specified feature types, and, for each feature type, select at least some of the variables of that feature type as the specified variables of the feature type.
As described in step S100, different feature types contribute differently to judging whether a transaction is money laundering. To improve the accuracy of the risk control rules generated later, and to keep unstable feature types from affecting rule generation, the server may use a genetic algorithm to determine the specified feature types used to form risk control rules and the variables of each specified feature type. An unstable feature type is one that occurs with similar probability in positive samples and in negative samples. For example, if a feature is present in most positive samples and also in most negative samples, it contributes little to judging whether a transaction is money laundering and can be filtered out by genetic algorithm optimization. Similarly, the variables of each feature type can also be optimized and filtered by the genetic algorithm.
Specifically, in this specification, the set of learning samples can be regarded as a population, each learning sample as an individual, each feature type in a learning sample as a gene of that individual, and each variable of a feature type as a variable of the gene. The server may determine the specified feature types and the specified variables of each feature type through the genetic algorithm optimization process shown in Fig. 2, which includes the following steps:
S1020: Perform feature encoding on each learning sample.
First, to facilitate optimization, the server may, for each learning sample, perform feature encoding on the feature types of that sample, so that the specified feature types used to form risk control rules can subsequently be determined by the feature selection algorithm.
Specifically, for feature type F_i, the server may divide the variable of F_i into three domains: an operator domain O_i, a value domain V_i and an action domain E_i. Here F_i denotes the i-th feature type. O_i denotes the operator of the i-th feature type; operators may include "in", "=" and "not in", meaning "includes", "equals" and "does not include" respectively. V_i denotes the feature value of the i-th feature type. E_i indicates whether the i-th feature type is present in the learning sample, e.g., 0 indicates absent and 1 indicates present.
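One way to picture the three-domain encoding is as a small record per feature type. The sketch below is an assumption-laden illustration: the class name `Gene`, the use of a set for the value domain, and the rule that an absent feature (E_i = 0) imposes no condition are choices made for the example, not details taken from the specification.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Gene:
    operator: str            # operator domain O_i: "in", "=", or "not in"
    values: FrozenSet[str]   # value domain V_i
    active: int              # action domain E_i: 1 = present, 0 = absent

    def matches(self, observed: str) -> bool:
        """Check an observed feature value against this encoded feature type."""
        if not self.active:
            return True  # assumption: an absent feature type imposes no condition
        if self.operator == "=":
            return self.values == frozenset({observed})
        if self.operator == "in":
            return observed in self.values
        if self.operator == "not in":
            return observed not in self.values
        raise ValueError(f"unknown operator: {self.operator}")


# Hypothetical encoding of the feature type "IP country".
ip_country = Gene(operator="in", values=frozenset({"US", "UK"}), active=1)
sanction_hit = Gene(operator="=", values=frozenset({"1"}), active=1)
```

With this representation, `ip_country.matches("US")` is true and `ip_country.matches("CN")` is false, while setting `active=0` switches the feature type off without discarding its value domain.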
Taking learning sample 001 in Table 1 as an example, the feature structure shown in Table 2 is obtained after feature encoding:
Table 2
Table 2 shows the result of encoding some of the feature types of learning sample 001, namely the feature codes of F1 to F44, where each entry is the variable of the feature code of the corresponding feature type. For each learning sample, a series of feature codes corresponding to that sample can be determined. After feature encoding of each feature type, the population shown in Table 3 is therefore obtained.
Learning sample 1 | F1 | F2 | …… | Fm | P
Learning sample 2 | F1 | F2 | …… | Fm | P
Learning sample 3 | F1 | F2 | …… | Fm | N
…… | …… | …… | …… | …… | ……
Learning sample X | F1 | F2 | …… | Fm | P
Table 3
As can be seen, there are X learning samples in total. Each row of Table 3 is one learning sample, i.e., one individual in the genetic algorithm, and P and N are, as described above, the review results of the learning samples.
S1021: Perform competitive selection on the learning samples, and take the selected learning samples as the individuals of the population.
The server may perform competitive selection on the population and retain the individuals with higher fitness. Specifically, the server may take each learning sample as a candidate sample (i.e., an individual), calculate the fitness of each candidate sample, and then, according to the calculated fitness values, screen out a preset number of candidate samples as the objects of the subsequent steps. For example, the server may sort the candidate samples by fitness from high to low and select a preset number of them, or select the candidate samples whose fitness is higher than a preset fitness threshold, and so on. How candidate samples are selected according to fitness may be set as needed, and this specification does not limit this.
In addition, in this specification, the fitness of each candidate sample may be determined as the sum of the fitness values of the variables of the feature types in that candidate sample, and the fitness of the variable of each feature type may be calculated by a fitness formula. Specifically, the fitness formula may be:

$$\mathrm{fitness}_{ji} = N_{ip}\,\log_2\!\left(\frac{N_{ip}}{N_{ip} + 10\,N_{in}}\right)$$

where $\mathrm{fitness}_{ji}$ denotes the fitness of the variable of the i-th feature type in the j-th candidate sample, $N_{ip}$ denotes the number of times that variable of the i-th feature type occurs in the positive samples, and $N_{in}$ denotes the number of times it occurs in the negative samples. The server may then determine the fitness of each candidate sample according to the formula:

$$\mathrm{fitness}_{j} = \sum_{i} \mathrm{fitness}_{ji}$$

where $\mathrm{fitness}_{j}$ denotes the fitness of the j-th candidate sample.
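The fitness calculation and the "sort and keep the top k" variant of competitive selection can be sketched as below. The treatment of a variable that never occurs in a positive sample (fitness 0, the limiting value of the formula as the positive count goes to zero) is an assumption for the example, as are the function names.

```python
import math
from typing import Dict, List, Tuple


def variable_fitness(n_pos: int, n_neg: int) -> float:
    """fitness_ji = N_ip * log2(N_ip / (N_ip + 10 * N_in))."""
    if n_pos == 0:
        return 0.0  # assumption: use the limiting value when N_ip is 0
    return n_pos * math.log2(n_pos / (n_pos + 10 * n_neg))


def sample_fitness(counts: List[Tuple[int, int]]) -> float:
    """fitness_j: sum over feature types of the variable fitness.

    counts holds one (N_ip, N_in) pair per feature type of the sample.
    """
    return sum(variable_fitness(p, n) for p, n in counts)


def select_top(fitness_by_sample: Dict[str, float], k: int) -> List[str]:
    """Keep the k candidate samples with the highest fitness."""
    return sorted(fitness_by_sample, key=fitness_by_sample.get, reverse=True)[:k]


# Hypothetical occurrence counts for a sample with two feature types.
score = sample_fitness([(4, 0), (10, 1)])
kept = select_top({"001": -10.0, "002": 0.0, "003": -3.5}, k=2)
```

Because the logarithm's argument never exceeds 1, every fitness value under this formula is at most 0, and a variable that occurs only in positive samples scores exactly 0, the best possible value.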
S1022: Adjust the population to determine at least one new population.
The server may then perform at least one of a copy operation, a crossover operation and a mutation operation on the population screened in the previous step, to obtain a new population. In this specification, G_k denotes the k-th generation population, so G_0 denotes the population obtained by the first fitness screening. For ease of understanding, in describing the genetic algorithm optimization this specification uses "individual" for a candidate sample selected in step S1021, "gene" for a feature type, "population" for the set of selected candidate samples, and "variable of a gene" for a variable of a feature type.
Each operation is described below:
In the copy operation, the server randomly copies individuals of the population to obtain a new population G'_0. The copy probability may be set as needed, e.g., 50%; that is, each individual has a 50% probability of being copied into the new population G'_0.
In the crossover operation, the server pairs the individuals of population G_0 two by two and determines, according to a crossover probability, whether each pair performs crossover. The specific crossover operations may include exchange and merging. That is, for each pair of individuals that is to perform crossover, the same genes (i.e., the same feature types) of the two individuals are determined, and the variables of those same genes are exchanged or merged to obtain new individuals. The new individuals obtained after crossover serve as individuals of a new population G''_0. The crossover probability may also be set as needed, e.g., 90%; that is, each pair of individuals has a 90% probability of performing crossover.
It should be noted that, in this specification, the types of variable may include a first type and a second type. The first type may be the binary type, i.e., a variable with only two possible values. The second type may be either the enumerated type or the combined type. Specifically, a binary variable may be a matching result, an enumerated variable may be one of several possible contents (e.g., the initiator's country of residence, the country to which the IP address belongs), and a combined variable may be one or more of several possible contents. In addition, the variable types may also include the discrete type, i.e., discrete data (e.g., date of birth, age, amount). The server may also divide variables into more types as needed, and this specification does not limit this.
For the crossover operation, the server may exchange the variables of those same genes in each pair whose variables are of the binary type, or merge the variables of those same genes in each pair whose variables are of the enumerated or combined type. Which same genes are selected for exchange or merging may be set as needed, and this specification does not limit this.
For example, assume that for the two individuals 002 and 008 of population G_0 shown in Fig. 3a, the server selects these two individuals for crossover and determines new individuals by exchanging same genes. The server may first determine the same genes of the two individuals whose variables are of the binary type, e.g., feature types F_3 and F_5 in Fig. 3a, and exchange the value domains V_3 and V_5 of individuals 002 and 008, thereby determining two new individuals as individuals of the new population G''_0, as shown in Fig. 3b. The value domains of different feature types whose variables are of the binary type may also be exchanged between the individuals, e.g., as shown in Fig. 3c. When the value domains of different feature types are exchanged, however, the variables of different feature types may be represented differently (e.g., variables such as "US", "RU" and "DE" represent countries, whereas variables such as "10:10AM" and "9:15PM" represent times). To prevent an exchange between differently represented variables from producing malformed genes in the new individuals (e.g., an IP country of 10:10AM, or a transaction initiation time of US), the server may restrict exchange to variables with the same representation, according to how the variables of each feature type are represented. How this is set is not limited by this specification.
Alternatively, if the server determines a new individual by merging same genes, it may take the genes of individuals 002 and 008 whose variables are of the combined type, e.g., feature types F_2 and F_4 in Fig. 3a, merge the value domains V_2 of individuals 002 and 008, and merge the value domains V_4 of individuals 002 and 008, thereby determining a new individual as an individual of the new population G''_0, as shown in Fig. 3d.
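The two crossover variants can be sketched on a toy representation in which an individual is a dictionary from gene name to value domain. The gene names, the example values and the choice of Python sets for combined-type value domains are assumptions for illustration only.

```python
import copy
from typing import Dict, List, Tuple

Individual = Dict[str, object]  # gene name -> value domain


def exchange(a: Individual, b: Individual,
             genes: List[str]) -> Tuple[Individual, Individual]:
    """Swap the value domains of the given same genes, giving two new individuals."""
    child_a, child_b = copy.deepcopy(a), copy.deepcopy(b)
    for gene in genes:
        child_a[gene], child_b[gene] = copy.deepcopy(b[gene]), copy.deepcopy(a[gene])
    return child_a, child_b


def merge(a: Individual, b: Individual, genes: List[str]) -> Individual:
    """Union the value domains of the given same genes, giving one new individual."""
    child = copy.deepcopy(a)
    for gene in genes:
        child[gene] = set(a[gene]) | set(b[gene])
    return child


# Hypothetical individuals: F3 and F5 are binary-type, F2 and F4 combined-type.
ind_002 = {"F2": {"US"}, "F3": 1, "F4": {"wire"}, "F5": 0}
ind_008 = {"F2": {"UK"}, "F3": 0, "F4": {"cash"}, "F5": 1}

swapped_a, swapped_b = exchange(ind_002, ind_008, ["F3", "F5"])
merged = merge(ind_002, ind_008, ["F2", "F4"])
```

Both functions copy their inputs, so the parents remain in G_0 unchanged and only the children enter the new population.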
In the mutation operation, the server selects from the individuals of population G_0, according to a mutation probability, the individuals to be mutated, and adjusts the action domain of at least one gene (i.e., feature type) in each selected individual, i.e., adjusts whether the gene is present in the individual, thereby obtaining a new individual as an individual of the new population G''_0. The selection probability of the mutation operation may be set as needed, and this specification does not limit it; e.g., it may be set to 10%. For each gene, when the server adjusts the gene from absent to present in the individual, it may select a variable for the gene of the adjusted individual according to the frequency of occurrence of each variable of that gene, e.g., selecting the variable that occurs most frequently.
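A mutation of a single gene can be sketched as a toggle of its action domain, filling in the most frequent observed variable when the gene is switched on. The dictionary layout (`active`, `value`) and the `observed` list of variables seen for this gene across the population are assumptions for the example.

```python
from collections import Counter
from typing import Dict, List, Optional


def mutate_gene(gene: Dict[str, Optional[object]],
                observed: List[str]) -> Dict[str, Optional[object]]:
    """Flip whether the gene is present; on activation pick the most frequent variable."""
    mutated = dict(gene)
    if gene["active"]:
        mutated["active"] = 0
    else:
        mutated["active"] = 1
        mutated["value"] = Counter(observed).most_common(1)[0][0]
    return mutated


# Hypothetical gene for "IP country", currently absent from the individual.
switched_on = mutate_gene({"active": 0, "value": None}, ["US", "RU", "US"])
switched_off = mutate_gene({"active": 1, "value": "US"}, ["US", "RU", "US"])
```

Which individuals are mutated at all would be decided separately, for instance by comparing a random draw per individual with the configured mutation probability.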
S1023: Determine the next-generation population by competitive selection from the population and the new populations generated from it.
Next, the server may take the samples in at least one of the newly obtained populations G'_0 and G''_0 and the population G_0 as candidate samples and perform competitive selection again, obtaining the next-generation population G_1. The competitive selection process may be as in step S1021 and is not repeated here.
S1024: Judge whether the number of iterations of the population determined in step S1023 has reached a preset value; if so, execute step S1025; if not, execute step S1022.
S1025: Output the determined population as the result.
Finally, the server may repeat the above process of the genetic algorithm until the number of iterations reaches the preset value, where the preset value may be set as needed and is not limited by this specification. For example, if the preset value is 5000, then when G_5000 has been determined, that population is output as the result. At this point, the genes contained in each individual of the population are the genes retained through competitive selection, and the copy, crossover and mutation operations have reduced the number of kinds of variable in each retained gene. The feature types contained in the population are then the specified feature types used to generate risk control rules, and the variables of each specified feature type are the specified variables.
In addition, in this specification, in each population iteration the server may select at least one of the copy, crossover and mutation operations to execute.
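The overall iteration of steps S1021 to S1025 can be sketched as a generic loop. This is a toy under stated assumptions: individuals are plain integers, the fitness function and the two operators are placeholders invented for the example, and exactly one operator is chosen at random per generation, which is only one of the options the text allows.

```python
import random
from typing import Callable, List


def evolve(population: List[int],
           fitness: Callable[[int], float],
           operators: List[Callable[[List[int], random.Random], List[int]]],
           keep: int,
           generations: int,
           rng: random.Random) -> List[int]:
    """Select, apply one operator, reselect; repeat for a preset number of iterations."""
    def select(pool: List[int]) -> List[int]:
        return sorted(pool, key=fitness, reverse=True)[:keep]   # S1021 / S1023

    current = select(list(population))                          # G_0
    for _ in range(generations):                                # S1024
        operator = rng.choice(operators)                        # S1022
        offspring = operator(current, rng)
        current = select(current + offspring)                   # next generation
    return current                                              # S1025


def target_fitness(x: int) -> float:
    return -abs(x - 7)  # placeholder fitness: closer to 7 is better


def copy_op(pop: List[int], rng: random.Random) -> List[int]:
    return [x for x in pop if rng.random() < 0.5]  # 50% copy probability


def mutate_op(pop: List[int], rng: random.Random) -> List[int]:
    return [x + rng.choice([-1, 1]) for x in pop]


start = [1, 5, 9, 12]
result = evolve(start, target_fitness, [copy_op, mutate_op],
                keep=3, generations=20, rng=random.Random(0))
```

Because each reselection pool contains the current generation as well as the offspring, the best fitness in the population never decreases from one generation to the next.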
S104: Generate risk control rules using a first-order rule learning algorithm, according to the learning samples, the selected specified feature types and the specified variables of the specified feature types.
In the embodiments of this specification, after determining the specified feature types by a feature learning algorithm such as the genetic algorithm, the server may generate risk control rules by a first-order rule learning algorithm, according to the positive and negative labels of the learning samples.
Specifically, the server may first redetermine the positive samples and negative samples according to the specified feature types; that is, it redetermines the feature types of each learning sample according to the business data of the sample and the money laundering sanctions list.
The server may then use a first-order rule learning algorithm (First Order Inductive Learner, FOIL) to generate a number of risk control rules.
When generating risk control rules by the FOIL algorithm, the server may determine a training set from the positive samples and negative samples, and execute the following steps, as shown in Fig. 4:
S1040: Initialize the risk control rule set, take the positive samples as the positive sample set and the negative samples as the negative sample set;
S1041: Judge whether the positive sample set is empty; if so, execute step S1048; if not, execute step S1042;
S1042: According to the FOIL algorithm, determine one specified variable from the specified variables, as the specified variable contained in a newly created risk control rule;
S1043: Judge whether the matching degree between the newly created risk control rule and the negative sample set is below a preset threshold; if so, execute step S1046; if not, execute step S1044;
S1044: According to the FOIL algorithm, determine one specified variable from the specified variables and add it to the newly created risk control rule;
S1045: According to the newly created risk control rule after the specified variable has been added, update the negative sample set, and repeat step S1043;
S1046: Add the newly created risk control rule to the risk control rule set;
S1047: According to the risk control rule set, delete from the positive sample set the positive samples that match a risk control rule in the set, and repeat step S1041;
S1048: Once the positive sample set is judged to be empty, take the risk control rules in the risk control rule set as the generated risk control rules.
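The loop of steps S1040 to S1048 can be sketched as a small sequential-covering learner. Several things here are assumptions made for the example rather than details from the specification: a rule is a list of (feature type, variable) equality conditions, the gain is the standard FOIL information gain, the stopping test uses "less than or equal to" so that a threshold of 0 means "matches no negative sample", and extra guards stop the loops when no candidate helps.

```python
import math
from typing import Dict, List, Tuple

Sample = Dict[str, str]
Condition = Tuple[str, str]   # (specified feature type, specified variable)
Rule = List[Condition]


def covers(rule: Rule, sample: Sample) -> bool:
    return all(sample.get(feature) == value for feature, value in rule)


def foil_gain(rule: Rule, cond: Condition,
              pos: List[Sample], neg: List[Sample]) -> float:
    """Standard FOIL gain of adding cond to rule (assumed form)."""
    p_old = sum(covers(rule, s) for s in pos)
    n_old = sum(covers(rule, s) for s in neg)
    extended = rule + [cond]
    p_new = sum(covers(extended, s) for s in pos)
    n_new = sum(covers(extended, s) for s in neg)
    if p_old == 0 or p_new == 0:
        return float("-inf")
    return p_new * (math.log2(p_new / (p_new + n_new))
                    - math.log2(p_old / (p_old + n_old)))


def learn_rules(pos: List[Sample], neg: List[Sample],
                candidates: List[Condition], threshold: float = 0.0) -> List[Rule]:
    rules: List[Rule] = []                                   # S1040
    remaining = list(pos)
    while remaining:                                         # S1041
        rule: Rule = []
        matched_neg = list(neg)
        while True:
            gains = {c: foil_gain(rule, c, remaining, matched_neg)
                     for c in candidates if c not in rule}
            if not gains:
                break                                        # guard: nothing left to add
            best = max(gains, key=gains.get)
            if gains[best] == float("-inf"):
                break                                        # guard: no useful condition
            rule.append(best)                                # S1042 / S1044
            matched_neg = [s for s in matched_neg if covers(rule, s)]   # S1045
            matched_pos = sum(covers(rule, s) for s in remaining)
            if len(matched_neg) / matched_pos <= threshold:  # S1043
                break
        if not rule:
            break                                            # guard: avoid looping forever
        rules.append(rule)                                   # S1046
        remaining = [s for s in remaining if not covers(rule, s)]       # S1047
    return rules                                             # S1048


# Hypothetical training set.
pos = [{"ip_country": "US", "match": "1"},
       {"ip_country": "US", "match": "1"},
       {"ip_country": "RU", "match": "0"}]
neg = [{"ip_country": "US", "match": "0"},
       {"ip_country": "DE", "match": "0"}]
candidates = [("ip_country", "US"), ("ip_country", "RU"), ("match", "1")]

rules = learn_rules(pos, neg, candidates)
```

On this toy data the first rule is the sanctions-match condition, which covers two positive samples and no negative sample, and a second rule on the IP country covers the remaining positive sample.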
When selecting a specified variable by the FOIL algorithm (steps S1042 and S1044), the server may determine a gain value for each specified variable by formula, and take the specified variable with the highest gain value as the specified variable to be added to the risk control rule. Here P_new denotes the number of positive samples matching the newly created risk control rule after the specified variable is added to it, N_new denotes the number of negative samples matching the newly created risk control rule after the specified variable is added to it, P_old denotes the number of positive samples matching the newly created risk control rule before the specified variable is added to it, and N_old denotes the number of negative samples matching the newly created risk control rule before the specified variable is added to it.
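For reference, the gain commonly used by the FOIL algorithm, written with the four quantities defined above, is shown below. This is the standard textbook form and is offered only as an assumption about the formula intended here, since the formula itself is not reproduced in this text.

```latex
\mathrm{Gain} = P_{new}\left(\log_2\frac{P_{new}}{P_{new}+N_{new}} - \log_2\frac{P_{old}}{P_{old}+N_{old}}\right)
```

As a worked example under that assumption, with P_old = 3, N_old = 2, P_new = 2 and N_new = 0, the gain is 2 × (log2(1) − log2(0.6)) ≈ 2 × 0.737 ≈ 1.47. A variable that keeps most positive samples while excluding the negative samples therefore receives a high gain.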
It should be noted that, in this specification, adding a specified variable to a risk control rule may specifically mean adding the specified variable together with its specified feature type. For example, if the specified feature type is the country to which the IP belongs and its specified variables include "US", "CN", "UK" and so on, then when the server selects the specified variable "US" and adds it to a risk control rule, what is added to the rule is specifically "the country to which the IP belongs is US". Each risk control rule generated according to this specification can thus be regarded as a rule composed of different conditions, and when business data matches every condition of any risk control rule, the transaction corresponding to that business data can be determined to be money laundering.
For example, assume a generated risk control rule is that the country to which the IP initiating the transaction belongs is US and the initiator's name is user a. Then when user a initiates a transaction from the United States, the transaction can be determined to be money laundering.
In addition, in this specification, the matching degree between a newly created risk control rule and the negative sample set may be determined as the ratio of the number of negative samples matching the rule to the number of positive samples matching the rule. For example, if a newly created risk control rule matches 1000 positive samples and 1 negative sample, its matching degree with the negative sample set is 0.1%.
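The ratio in the example above can be checked with a few lines. The function name and the acceptance threshold of 0.5% are hypothetical values chosen for illustration.

```python
def matching_degree(pos_matched: int, neg_matched: int) -> float:
    """Ratio of matched negative samples to matched positive samples."""
    if pos_matched <= 0:
        raise ValueError("the rule must match at least one positive sample")
    return neg_matched / pos_matched


degree = matching_degree(pos_matched=1000, neg_matched=1)  # 0.001, i.e. 0.1%
accept = degree < 0.005  # hypothetical preset threshold of 0.5%
```

A lower matching degree means the rule flags fewer legitimate transactions for each money laundering transaction it catches.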
In this specification the preset threshold may be set as needed. Alternatively, instead of comparing the matching degree between the newly created risk control rule and the negative sample set with the preset threshold, the server may judge whether the newly created risk control rule matches no negative sample at all. This specification does not limit this.
With the risk control rule mining process shown in Fig. 1, the specified feature types that make up the risk control rules, and their corresponding specified variables, are determined by screening and optimization with a feature selection algorithm, so the risk control rules generated from the specified feature types identify money laundering more effectively. This avoids the drawbacks of manually configured rules in the prior art, and improves both the efficiency and the identification accuracy of anti-money laundering based on risk control rules.
In addition, in this specification, after generating a number of risk control rules, the server may also prune the generated rules to further improve their identification accuracy.
Specifically, the server can determine detection samples that are different from the learning samples;
each detection sample can be determined from historical data. The server then identifies each detection
sample according to the generated risk control rules and determines the recognition results. Finally,
according to the known examination result of each detection sample, the server determines the accuracy of
the recognition results of the generated risk control rules.
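A minimal sketch of this evaluation step is given below. It assumes that a detection sample is identified
as risky when at least one rule matches it, and that accuracy is the fraction of detection samples whose
identification agrees with the known examination result; neither definition is spelled out in the
specification, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, str]
Rule = Dict[str, str]


def matches(rule: Rule, sample: Sample) -> bool:
    """A sample matches when it carries every (feature type, variable) pair of the rule."""
    return all(sample.get(feature) == value for feature, value in rule.items())


def identify(rules: List[Rule], sample: Sample) -> bool:
    """Assumption: a sample is identified as risky when any rule matches it."""
    return any(matches(rule, sample) for rule in rules)


def rule_set_accuracy(rules: List[Rule], labeled: List[Tuple[Sample, bool]]) -> float:
    """Fraction of detection samples whose identification equals the known result."""
    correct = sum(1 for sample, is_risky in labeled if identify(rules, sample) == is_risky)
    return correct / len(labeled)


rules = [{"ip_country": "US", "initiator_name": "user a"}, {"channel": "atm"}]
detection = [
    ({"ip_country": "US", "initiator_name": "user a", "channel": "web"}, True),   # first rule matches
    ({"ip_country": "DE", "initiator_name": "user b", "channel": "atm"}, True),   # second rule matches
    ({"ip_country": "DE", "initiator_name": "user b", "channel": "web"}, False),  # no rule matches
    ({"ip_country": "FR", "initiator_name": "user c", "channel": "atm"}, False),  # false alarm
]

print(rule_set_accuracy(rules, detection))  # 0.75
```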
Specifically, the server can first determine the accuracy of each risk control rule from that rule's
recognition results. Then, for each risk control rule and for the specified variable of each specified
feature type contained in the rule, the server judges whether the accuracy of the rule's recognition
results after deleting the specified variable is higher than the accuracy before deleting it; if so, the
server deletes the specified variable of that specified feature type, and otherwise keeps it. Finally,
after pruning all risk control rules, the server recalculates the accuracy of the recognition results of
each pruned rule and, according to these accuracies, selects a specified number of risk control rules as
the rules for identifying money laundering transactions.
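The pruning procedure can be sketched as a greedy loop over the conditions of each rule. The sketch below
is an illustration under stated assumptions: accuracy is taken as the fraction of detection samples for
which "rule matches" equals the known result, and a rule is never pruned down to zero conditions. The
specification states neither point, and the names are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Dict[str, str]
Rule = Dict[str, str]
Labeled = List[Tuple[Sample, bool]]


def matches(rule: Rule, sample: Sample) -> bool:
    return all(sample.get(feature) == value for feature, value in rule.items())


def accuracy(rule: Rule, labeled: Labeled) -> float:
    """Assumed accuracy: share of samples where the rule's verdict equals the known result."""
    return sum(1 for s, risky in labeled if matches(rule, s) == risky) / len(labeled)


def prune(rule: Rule, labeled: Labeled) -> Rule:
    """Delete a condition whenever doing so strictly raises the rule's accuracy."""
    pruned = dict(rule)
    for feature in list(rule):
        if len(pruned) == 1:
            break  # Assumption: keep at least one condition so the rule stays meaningful.
        candidate = {k: v for k, v in pruned.items() if k != feature}
        if accuracy(candidate, labeled) > accuracy(pruned, labeled):
            pruned = candidate
    return pruned


def select_rules(rules: List[Rule], labeled: Labeled, count: int) -> List[Rule]:
    """Prune every rule, then keep the `count` most accurate pruned rules."""
    pruned = [prune(rule, labeled) for rule in rules]
    return sorted(pruned, key=lambda r: accuracy(r, labeled), reverse=True)[:count]


detection = [
    ({"ip_country": "US", "hour": "night"}, True),
    ({"ip_country": "US", "hour": "day"}, True),
    ({"ip_country": "DE", "hour": "night"}, False),
    ({"ip_country": "DE", "hour": "day"}, False),
]
rule = {"ip_country": "US", "hour": "night"}

print(accuracy(rule, detection))  # 0.75
print(prune(rule, detection))     # {'ip_country': 'US'}
```

In this toy data set, dropping the hour condition raises the accuracy from 0.75 to 1.0, so it is deleted,
while dropping the ip_country condition would lower the accuracy and is therefore rejected.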
Further, the above embodiments are described only by taking the generation of risk control rules for
identifying money laundering as an example. Similarly, the risk control rule mining method provided in
this specification can also determine risk control rules for other business types, as described in step
S100. This specification does not limit this.
It should be noted that every step of the method provided in the embodiments of this specification may be
executed by the same device, or the method may be executed by different devices. For example, steps S100
and S102 may be executed by device 1 and step S104 by device 2; alternatively, step S100 may be executed
by device 1 and steps S102 and S104 by device 2; and so on. The foregoing describes specific embodiments
of this specification. Other embodiments fall within the scope of the appended claims. In some cases, the
actions or steps recited in the claims can be performed in an order different from that in the
embodiments and still achieve the desired results. In addition, the processes depicted in the accompanying
drawings do not necessarily require the particular order shown, or a sequential order, to achieve the
desired results. In some embodiments, multitasking and parallel processing are also possible or may be
advantageous.
Based on the risk control rule mining method shown in Fig. 1, the embodiments of this specification also
provide a risk control rule mining apparatus, as shown in Fig. 5.
Fig. 5 is a structural schematic diagram of a risk control rule mining apparatus provided in the
embodiments of this specification. The apparatus includes:
a determining module 200, which, for each preset feature type, determines the feature value of each
learning sample for that feature type, as a variable of that feature type;
a selecting module 202, which, by means of a genetic algorithm, selects at least some of the feature types
as specified feature types and, for each feature type, selects at least some of the variables of that
feature type as specified variables of that feature type;
a generating module 204, which generates risk control rules using a first-order rule learning algorithm,
according to the learning samples, the selected specified feature types and the specified variables of
each specified feature type.
The selecting module 202 selects learning samples from the learning samples according to the replication
probability of the genetic algorithm and replicates them to obtain replicated samples, and selects the
specified feature types and specified variables according to the variables of each feature type in the
learning samples and the variables of each feature type in the replicated samples.
The selecting module 202 determines several pairs of learning samples from the learning samples according
to the crossover probability of the genetic algorithm; for each pair of learning samples, it swaps the
variables of the same feature type between the two learning samples and/or merges the variables of the
same feature type of the two learning samples, obtaining crossover samples; and it selects the specified
feature types and specified variables according to the variables of each feature type in the learning
samples and the variables of each feature type in the crossover samples.
The types of the variables include a first type and a second type. For a pair of learning samples, the
selecting module 202 swaps the variables of the same feature type whose variables are of the first type,
and merges the variables of the same feature type whose variables are of the second type.
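The crossover of one pair of learning samples can be sketched as follows. The sketch assumes that
first-type variables are single values (modeled as strings, swapped between the pair) and that second-type
variables are value sets (modeled as frozensets, merged by union). This part of the text does not define
the two types, so this modeling, like the feature names, is an assumption.

```python
from typing import Dict, FrozenSet, Set, Tuple, Union

Value = Union[str, FrozenSet[str]]
Sample = Dict[str, Value]


def crossover(a: Sample, b: Sample, swap_features: Set[str]) -> Tuple[Sample, Sample]:
    """Return two crossover samples derived from the pair (a, b); the inputs are not modified."""
    child_a, child_b = dict(a), dict(b)
    for feature in a.keys() & b.keys():
        va, vb = a[feature], b[feature]
        if isinstance(va, frozenset) and isinstance(vb, frozenset):
            # Assumed second type: merge the two value sets.
            merged = va | vb
            child_a[feature] = merged
            child_b[feature] = merged
        elif feature in swap_features:
            # Assumed first type: exchange the single values.
            child_a[feature], child_b[feature] = vb, va
    return child_a, child_b


a = {"initiator_name": "user a", "ip_countries": frozenset({"US"})}
b = {"initiator_name": "user b", "ip_countries": frozenset({"DE"})}
child_a, child_b = crossover(a, b, swap_features={"initiator_name"})

print(child_a["initiator_name"], sorted(child_a["ip_countries"]))  # user b ['DE', 'US']
```

Choosing which pairs undergo crossover according to the crossover probability is left out of the sketch.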
The selecting module 202 selects, according to the mutation probability of the genetic algorithm, the
learning samples to be mutated from the learning samples; for each learning sample selected for mutation,
it adjusts whether the variable of at least one feature type is present in that learning sample, obtaining
mutated samples; and it selects the specified feature types and specified variables according to the
variables of each feature type in the learning samples and the variables of each feature type in the
mutated samples.
For each feature type, when the variable of that feature type is adjusted from being absent in the
learning sample to being present in the learning sample, the selecting module 202 selects one variable
according to the frequency of occurrence of each variable of that feature type, as the variable of that
feature type in the adjusted learning sample.
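A minimal sketch of this mutation step is shown below. It assumes that "according to the frequency of
occurrence" means sampling a variable with probability proportional to how often it occurs for that
feature type among the learning samples; picking the most frequent variable would be another reading. The
function and feature names are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List

Sample = Dict[str, str]


def mutate(sample: Sample, feature: str, population: List[Sample], rng: random.Random) -> Sample:
    """Toggle the presence of `feature` in a copy of `sample`."""
    mutant = dict(sample)
    if feature in mutant:
        # Present -> absent: drop the variable of this feature type.
        del mutant[feature]
    else:
        # Absent -> present: draw a variable weighted by its frequency in the population.
        counts = Counter(s[feature] for s in population if feature in s)
        if counts:
            values = list(counts)
            weights = [counts[v] for v in values]
            mutant[feature] = rng.choices(values, weights=weights, k=1)[0]
    return mutant


population = [{"ip_country": "US"}, {"ip_country": "US"}, {"initiator_name": "user a"}]
rng = random.Random(0)

added = mutate({"initiator_name": "user a"}, "ip_country", population, rng)
removed = mutate({"ip_country": "US"}, "ip_country", population, rng)
print(added)    # {'initiator_name': 'user a', 'ip_country': 'US'}
print(removed)  # {}
```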
The selecting module 202 determines the fitness of each candidate sample according to the fitness formula
of the genetic algorithm [formula not reproduced in this text], selects a specified number of candidate
samples in descending order of fitness, takes the feature types of the selected candidate samples as the
specified feature types, and takes the variables of those feature types in the selected candidate samples
as the specified variables;
Wherein, the candidate samples include at least one of the learning samples, the replicated samples, the
crossover samples and the mutated samples; fitness_j denotes the fitness of the j-th candidate sample;
fitness_ji denotes the fitness of the variable of the i-th feature type in the j-th candidate sample; N_ip
denotes the number of times the variable of the i-th feature type occurs in the positive samples; N_in
denotes the number of times the variable of the i-th feature type occurs in the negative samples; a
positive sample is a candidate sample whose risk control result is risky, and a negative sample is a
candidate sample whose risk control result is not risky.
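The ranking-and-selection step can be sketched as follows. Because the fitness formula itself is not
legible in this text, the sketch substitutes a placeholder: the fitness of a variable is assumed to be
N_ip / (N_ip + N_in), and the fitness of a candidate sample is assumed to be the mean over its variables.
Both are stand-ins chosen only so that the example runs and should not be read as the formula of the
specification; all function names are hypothetical.

```python
from typing import Dict, List, Set

Sample = Dict[str, str]


def variable_fitness(n_ip: int, n_in: int) -> float:
    """PLACEHOLDER score, not the formula of the specification."""
    total = n_ip + n_in
    return n_ip / total if total else 0.0


def sample_fitness(candidate: Sample, positives: List[Sample], negatives: List[Sample]) -> float:
    """PLACEHOLDER aggregation: mean of the per-variable scores."""
    scores = []
    for feature, value in candidate.items():
        n_ip = sum(1 for s in positives if s.get(feature) == value)  # occurrences in positive samples
        n_in = sum(1 for s in negatives if s.get(feature) == value)  # occurrences in negative samples
        scores.append(variable_fitness(n_ip, n_in))
    return sum(scores) / len(scores) if scores else 0.0


def select_specified(candidates: List[Sample], positives: List[Sample],
                     negatives: List[Sample], count: int) -> Dict[str, Set[str]]:
    """Keep the `count` fittest candidates and collect their feature types and variables."""
    ranked = sorted(candidates, key=lambda c: sample_fitness(c, positives, negatives), reverse=True)
    specified: Dict[str, Set[str]] = {}
    for candidate in ranked[:count]:
        for feature, value in candidate.items():
            specified.setdefault(feature, set()).add(value)
    return specified


positives = [{"ip_country": "US"}, {"ip_country": "US"}]
negatives = [{"ip_country": "DE"}]
candidates = [{"ip_country": "DE"}, {"ip_country": "US"}]

print(select_specified(candidates, positives, negatives, count=1))  # {'ip_country': {'US'}}
```

Only the sorting and top-N selection follow the text directly; swapping in the actual formula would only
require replacing variable_fitness and sample_fitness.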
The apparatus further includes:
a rule pruning module 206, which determines the recognition accuracy of each risk control rule, adjusts
the feature types contained in each risk control rule according to the recognition accuracy of that rule,
redetermines the recognition accuracy of each adjusted risk control rule, and selects at least one risk
control rule according to the recognition accuracy of each rule after adjustment and the recognition
accuracy of each rule before adjustment.
The rule pruning module 206 determines, according to historical data, detection samples different from the
learning samples, and determines the recognition accuracy of each risk control rule according to that
rule's recognition results for the detection samples.
For each risk control rule, the rule pruning module 206 determines, according to the recognition accuracy
of the rule and from the feature types that make up the rule, the feature types whose deletion improves
the recognition accuracy of the rule, and deletes the determined feature types from the feature types that
make up the rule.
Based on the risk control rule mining method shown in Fig. 1, this specification correspondingly provides a
server, as shown in Fig. 6. The server includes one or more processors and a memory; the memory stores a
program that is configured to be executed by the one or more processors to perform the following steps:
for each preset feature type, determining the feature value of each learning sample for that feature type,
as a variable of that feature type;
by means of a genetic algorithm, selecting at least some of the feature types as specified feature types
and, for each feature type, selecting at least some of the variables of that feature type as specified
variables of that feature type;
generating risk control rules using a first-order rule learning algorithm, according to the learning
samples, the selected specified feature types and the specified variables of each specified feature type.
In the 1990s, an improvement to a technology could be clearly distinguished as a hardware improvement (for
example, an improvement to a circuit structure such as a diode, a transistor or a switch) or a software
improvement (an improvement to a method flow). However, with the development of technology, many
improvements to method flows today can be regarded as direct improvements to hardware circuit structures.
Designers almost always obtain the corresponding hardware circuit structure by programming the improved
method flow into a hardware circuit. Therefore, it cannot be said that an improvement to a method flow
cannot be implemented by a hardware entity module. For example, a programmable logic device (Programmable
Logic Device, PLD), such as a field programmable gate array (Field Programmable Gate Array, FPGA), is an
integrated circuit whose logic function is determined by the user programming the device. Designers
program a digital system onto a single PLD themselves, without asking a chip manufacturer to design and
fabricate a dedicated integrated circuit chip. Moreover, instead of fabricating integrated circuit chips
by hand, this programming is nowadays mostly implemented with "logic compiler" software, which is similar
to the software compiler used in program development; the source code to be compiled must also be written
in a specific programming language, called a hardware description language (Hardware Description
Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language),
AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language),
HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware
Description Language); the most commonly used at present are VHDL (Very-High-Speed Integrated Circuit
Hardware Description Language) and Verilog. Those skilled in the art should also understand that a
hardware circuit implementing a logical method flow can be obtained easily by describing the method flow
in one of the above hardware description languages and programming it into an integrated circuit.
A controller can be implemented in any suitable manner. For example, a controller can take the form of a
microprocessor or processor together with a computer-readable medium storing computer-readable program
code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an
application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable
logic controller or an embedded microcontroller. Examples of controllers include, but are not limited to,
the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320.
A memory controller can also be implemented as part of the control logic of a memory. Those skilled in the
art also know that, in addition to implementing a controller purely as computer-readable program code, the
method steps can be logic-programmed so that the controller achieves the same functions in the form of
logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded
microcontrollers and the like. Such a controller can therefore be regarded as a hardware component, and
the means included in it for implementing various functions can also be regarded as structures within the
hardware component. The means for implementing various functions can even be regarded both as software
modules implementing a method and as structures within a hardware component.
The systems, apparatuses, modules or units described in the above embodiments can be implemented by a
computer chip or an entity, or by a product having a certain function. A typical implementation device is
a computer. Specifically, the computer can be, for example, a personal computer, a laptop computer, a
cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation
device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of
these devices.
For convenience of description, the above apparatus is described by dividing it into units according to
function. Of course, when implementing this application, the functions of the units can be implemented in
the same one or more pieces of software and/or hardware.
Those skilled in the art should understand that the embodiments of the present invention can be provided
as a method, a system or a computer program product. Therefore, the present invention can take the form of
an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and
hardware. Moreover, the present invention can take the form of a computer program product implemented on
one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and
optical storage) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the
device (system) and the computer program product according to the embodiments of the present invention. It
should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations
of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program
instructions. These computer program instructions can be provided to the processor of a general-purpose
computer, a special-purpose computer, an embedded processor or another programmable data processing device
to produce a machine, so that the instructions executed by the processor of the computer or other
programmable data processing device produce means for implementing the functions specified in one or more
flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory capable of directing
a computer or other programmable data processing device to work in a particular manner, so that the
instructions stored in the computer-readable memory produce an article of manufacture including
instruction means that implement the functions specified in one or more flows of the flowcharts and/or one
or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or other programmable data
processing device, so that a series of operation steps is executed on the computer or other programmable
device to produce computer-implemented processing; the instructions executed on the computer or other
programmable device thus provide steps for implementing the functions specified in one or more flows of
the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output
interface, a network interface and a memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or
non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example
of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can
store information by any method or technology. The information can be computer-readable instructions, data
structures, program modules or other data. Examples of computer storage media include, but are not limited
to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM),
other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable
read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory
(CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk
storage or other magnetic storage devices, or any other non-transmission medium that can be used to store
information accessible by a computing device. As defined herein, computer-readable media do not include
transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any other variants are intended to cover
non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements
includes not only those elements but also other elements not explicitly listed, or elements inherent to
such a process, method, commodity or device. Without further limitation, an element defined by the phrase
"including a ..." does not exclude the existence of other identical elements in the process, method,
commodity or device that includes the element.
Those skilled in the art should understand that the embodiments of this application can be provided as a
method, a system or a computer program product. Therefore, this application can take the form of an
entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and
hardware. Moreover, this application can take the form of a computer program product implemented on one or
more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical
storage) containing computer-usable program code.
This application can be described in the general context of computer-executable instructions executed by a
computer, such as program modules. Generally, program modules include routines, programs, objects,
components, data structures and the like that perform specific tasks or implement specific abstract data
types. This application can also be practiced in distributed computing environments, in which tasks are
performed by remote processing devices connected through a communication network. In a distributed
computing environment, program modules can be located in both local and remote computer storage media,
including storage devices.
The embodiments in this specification are described in a progressive manner; for identical or similar
parts, the embodiments can be referred to one another, and each embodiment focuses on its differences from
the other embodiments. In particular, since the system embodiment is basically similar to the method
embodiment, it is described relatively briefly; for the relevant parts, refer to the description of the
method embodiment.
The above are merely embodiments of this application and are not intended to limit this application. For
those skilled in the art, this application can have various modifications and variations. Any
modification, equivalent replacement or improvement made within the spirit and principles of this
application shall fall within the scope of the claims of this application.