CN110348669B - Intelligent rule generation method, intelligent rule generation device, computer equipment and storage medium - Google Patents

Intelligent rule generation method, intelligent rule generation device, computer equipment and storage medium

Publication number
CN110348669B
Authority
CN
China
Prior art keywords
factors
factor
rule
preset
available
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910433971.8A
Other languages
Chinese (zh)
Other versions
CN110348669A (en)
Inventor
杨添坤
尹钏
刘金萍
钱建
王鸿
郑永耀
林峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN201910433971.8A priority Critical patent/CN110348669B/en
Publication of CN110348669A publication Critical patent/CN110348669A/en
Application granted granted Critical
Publication of CN110348669B publication Critical patent/CN110348669B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/252 Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/103 Workflow collaboration or project management

Abstract

The embodiment of the application belongs to the technical field of big data and relates to an intelligent rule generation method, which comprises: performing field slicing processing on collected service data to extract slice fields, wherein the slice fields conform to the field definitions of a field query library of the flow link corresponding to the service data; constructing candidate factors according to the values of the extracted slice fields, and writing the candidate factors into a factor feature library; evaluating the candidate factors in the factor feature library according to a preset evaluation function, and selecting available factors from the candidate factors according to the evaluation result; and combining the selected available factors according to a preset combination algorithm to generate a new rule. The application also provides an intelligent rule generating device, computer equipment and a storage medium. The application automates the generation of business rules, which reduces workload, improves working efficiency, removes the dependence on human experience, and improves the accuracy of the rules.

Description

Intelligent rule generation method, intelligent rule generation device, computer equipment and storage medium
Technical Field
The present application relates to the field of big data technologies, and in particular, to an intelligent rule generating method, an intelligent rule generating device, a computer device, and a storage medium.
Background
As enterprises become increasingly informatized, many business rules arising in their business can be executed by a rule engine to meet flexible and fast-changing business requirements. The writing, management and application of rule factors have therefore become a major concern for enterprises.
In the prior art, rule factors are mostly produced as follows: business personnel first propose business rules, technicians then write the rules, and the rules are finally imported into a rule engine for execution. As the business develops, business rules keep changing and growing in number; because the prior art relies on human experience to produce them, it suffers from high subjectivity, poor adaptability to risk, a heavy workload and low efficiency.
Disclosure of Invention
The embodiment of the application aims to provide an intelligent rule generation method, an intelligent rule generation device, computer equipment and a storage medium, so as to solve the problems that business rules in the prior art depend too heavily on human experience and that their production is highly subjective, adapts poorly to risk, and involves a heavy workload and low efficiency.
In order to solve the above technical problems, an embodiment of the present application provides an intelligent rule generating method, including the following steps:
performing field slicing processing on the acquired service data to extract slice fields, wherein the slice fields conform to field definitions of a field query library of a flow link corresponding to the service data;
constructing candidate factors according to the values of the extracted slice fields, and writing the candidate factors into a factor feature library;
evaluating candidate factors in the factor feature library according to a preset evaluation function, and selecting available factors from the candidate factors according to an evaluation result;
and combining the selected available factors according to a preset combination algorithm to generate a new generation rule.
Further, the step of constructing candidate factors according to the values of the extracted slice fields and writing the candidate factors into a factor feature library includes:
storing the value of the slice field as an initial factor into a designated data platform;
preprocessing the initial factors in the data platform according to a preset preprocessing rule;
calculating the preprocessed initial factors according to a preset factor construction algorithm to obtain candidate factors;
and writing the candidate factors into a factor feature library according to a preset character format.
Further, the step of evaluating the candidate factors in the factor feature library according to a preset evaluation function, and selecting the available factors from the candidate factors according to the evaluation result includes:
evaluating the candidate factors by using a preset evaluation function (the formula of the evaluation function is not reproduced in this text);
selecting candidate factors with the evaluation score reaching a preset value as available factors;
wherein Q is the evaluation score, p is the number of covered positive examples of the evaluated factor, n is the number of covered negative examples of the evaluated factor, P is the number of covered positive examples of the whole sample, N is the number of covered negative examples of the whole sample, and W is the coverage weight, whose value is a preset value with 0 < W < 1.
Further, the method is characterized in that after the step of evaluating the candidate factors in the factor feature library according to a preset evaluation function and selecting the available factors from the candidate factors according to the evaluation result, the method further comprises:
judging whether an available factor with the evaluation score reaching the preset value is selected or not;
if yes, triggering the step of combining the selected available factors according to a preset combination algorithm to generate a new generation rule, otherwise, reducing the number of covered positive examples of the samples or the evaluated factors;
and judging whether the number of the covered positive examples after the reduction is lower than a preset threshold value, if so, determining that the generation of the rule fails, exiting the flow of the generation of the rule, otherwise triggering the step of selecting the available factors from the candidate factors according to the evaluation result.
Further, the step of combining the selected available factors according to a preset combination algorithm to generate a new generation rule includes:
forming the selected available factors into an available factor list;
combining the available factors in the available factor list by using a preset greedy search and a distributed algorithm to obtain a plurality of rules;
pruning the rules by utilizing a pruning function, and extracting the rule with the maximum pruning function value as a new generation rule.
Further, the step of pruning the plurality of rules with a pruning function includes:
calculating the pruning function of the original rule and the rule with each factor deleted by using the pruning function F= (p-n)/(p+n);
where p is the number of covered positive examples of the evaluated factor, and n is the number of covered negative examples of the evaluated factor.
Further, the step of performing field slicing processing on the collected service data to extract a slice field includes:
acquiring service data;
creating an asynchronous slicing task of the service data;
calling the asynchronous slicing task, and determining the field query library to be queried according to the business link in which the business data related to the asynchronous slicing task is located;
and extracting slice fields which logically conform to field definitions in the field query library from the service data by using a Java reflection mechanism.
In order to solve the above technical problems, the embodiment of the present application further provides an intelligent rule generating device, which adopts the following technical scheme:
the intelligent rule generating apparatus includes:
the feature library module is used for constructing candidate factors according to the values of the extracted slice fields and writing the candidate factors into a factor feature library;
the factor selection module is used for evaluating candidate factors in the factor feature library according to a preset evaluation function and selecting available factors from the candidate factors according to an evaluation result;
and the rule generation module is used for generating rules by combining the selected available factors.
In order to solve the above technical problems, the embodiment of the present application further provides a computer device, which adopts the following technical schemes:
comprising a memory in which a computer program is stored, and a processor which, when executing the computer program, implements the steps of the intelligent rule generation method as described above.
In order to solve the above technical problems, an embodiment of the present application further provides a computer readable storage medium, which adopts the following technical schemes:
the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the intelligent rule generation method as described above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
the method and the device automate the generation of business rules, which reduces workload, improves working efficiency, removes the dependence on human experience, and improves rule accuracy. In addition, the rules can learn and adapt by themselves, the thresholds set in the rules can be predicted accurately, and the cost of human input and of trial and error can be reduced.
Drawings
In order to more clearly illustrate the solution of the present application, a brief description will be given below of the drawings required for the description of the embodiments of the present application, it being apparent that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without the exercise of inventive effort for a person of ordinary skill in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method of intelligent rule generation in accordance with the present application;
FIG. 3 is a flow chart of one embodiment of step 201 of FIG. 2;
FIG. 4 is a flow chart of one embodiment of step 203 of FIG. 2;
FIG. 5 is a schematic diagram of the architecture of one embodiment of an intelligent rule generating apparatus according to the present application;
FIG. 6 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the description of the drawings above are intended to cover a non-exclusive inclusion. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to make the person skilled in the art better understand the solution of the present application, the technical solution of the embodiment of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (MPEG-1 Audio Layer III) players, MP4 (MPEG-4 Part 14) players, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the method for generating an intelligent rule provided by the embodiment of the present application is generally executed by a terminal device, and accordingly, the intelligent rule generating device is generally disposed in the terminal device.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow chart of one embodiment of a method of intelligent rule generation in accordance with the present application is shown. The intelligent rule generation method comprises the following steps:
step 201, performing field slicing processing on the collected service data to extract slice fields, wherein the slice fields conform to field definitions of a field query library of a flow link corresponding to the service data.
In this embodiment, the electronic device (for example, the terminal device shown in fig. 1) on which the intelligent rule generating method runs may exchange data with the server through a wired or wireless connection. It should be noted that the wireless connection may include, but is not limited to, 3G/4G connections, Wi-Fi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra wideband) connections, and other wireless connection means now known or later developed.
In practical application, taking insurance claims as an example, the business process involves a case reporting link, a survey link, a loss assessment link, a settlement link, a payment link and so on. Each link involves interaction with the underlying data, and the service data needs to be sliced in order to store the data fields of each link at that moment.
In this embodiment, referring to fig. 3, step 201 specifically includes the following steps:
Step 2011, acquiring service data.
In practical application, the specific service data includes the service data of the newly created service input by the service personnel, and may also include the service data returned by the intelligent engine after the execution is completed.
Step 2012, creating an asynchronous slice task of the business data.
In practical application, to avoid a large number of database operations within a short time, an asynchronous task can be established. The asynchronous task can use a caching mechanism such as a message queue: the data is first put into the message queue and is then processed and written into the database gradually. With this task processing mode, the user request can be answered without waiting for all operations to complete, which gives a good user experience.
In this embodiment, the processing of asynchronous slice tasks may be implemented based on the oracle's producer consumer model. The asynchronous slicing task refers to establishing a slicing task for each piece of service data, and then sending the slicing task to a message queue, wherein each slicing task in the message queue is an asynchronous slicing task.
Specifically, the step 2012 specifically includes:
creating a field slicing task of the service data;
transmitting the field slicing task to a Kafka message queue;
and writing the messages in the Kafka message queue to an HDFS file.
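The producer and consumer roles described above can be sketched as follows. This is a minimal illustration only: an in-memory `queue.Queue` stands in for the Kafka message queue, a Python list stands in for the HDFS file, and the names `run_slice_tasks`, `handle` and the `link` key are hypothetical, not taken from the application.

```python
import queue
import threading

def run_slice_tasks(records, handle):
    """Enqueue one slicing task per record and process the tasks on a worker thread."""
    tasks = queue.Queue()   # in-memory stand-in for the Kafka message queue (assumption)
    results = []            # stand-in for the HDFS file or database sink (assumption)

    def worker():
        # Consumer side: take tasks off the queue one by one until the sentinel arrives.
        while True:
            task = tasks.get()
            if task is None:
                tasks.task_done()
                break
            results.append(handle(task))
            tasks.task_done()

    consumer = threading.Thread(target=worker, daemon=True)
    consumer.start()
    # Producer side: putting a task on the queue returns without waiting for processing.
    for record in records:
        tasks.put({"link": record["link"], "data": record})
    tasks.put(None)         # sentinel marking the end of the tasks
    tasks.join()            # wait here only so that the example can return the results
    consumer.join()
    return results
```

With a single consumer the tasks are handled in the order in which they were enqueued; a real deployment would replace the queue and the sink with the actual Kafka and HDFS clients.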
Step 2013, calling the asynchronous slicing task, and determining the field query library to be queried according to the business link in which the business data related to the asynchronous slicing task is located.
In this embodiment, a complete business process includes different processing links; the business data related to each link is different, and the corresponding field query libraries are also different. For the case reporting link, the business data includes the time, place, person and so on; for the survey link, the business data includes the survey staff, the survey object, the survey time, the survey result and so on. The fields stored in the field query libraries of the two business links therefore differ as well.
In practice, the query library is typically located in the underlying database.
Step 2014, extracting slice fields which logically conform to the field definitions in the field query library from the service data by using a Java reflection mechanism.
In this embodiment, the extraction of data may be implemented by using the Java reflection mechanism. With reflection, all attributes and methods of any class can be discovered at run time, and any method or attribute of any object can be invoked or accessed at run time.
In this embodiment, the step 2014 includes:
reading the field slicing task from the HDFS file;
and extracting slice fields which logically conform to field definitions in the field query library from the service data by using a Java reflection mechanism.
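The extraction step can be illustrated with a short sketch. The application names the Java reflection mechanism; the sketch below deliberately swaps in Python's run-time introspection (`dataclasses.fields` and `getattr`) as an analogue, and the class `ReportRecord`, the dictionary `FIELD_QUERY_LIBRARY` and the function `extract_slice_fields` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, fields

@dataclass
class ReportRecord:
    """Hypothetical business object for the case reporting link."""
    time: str
    place: str
    person: str
    internal_id: int

# Hypothetical field query library: business link -> field names defined for that link.
FIELD_QUERY_LIBRARY = {"report": {"time", "place", "person"}}

def extract_slice_fields(record, link):
    """Return the attributes of `record` whose names are defined for `link`."""
    wanted = FIELD_QUERY_LIBRARY[link]
    # fields() lists the attributes at run time; getattr() reads each value by name.
    return {f.name: getattr(record, f.name) for f in fields(record) if f.name in wanted}
```

Attributes that are not defined for the link, such as `internal_id` here, are simply left out of the slice.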
Step 202, constructing candidate factors according to the values of the extracted slice fields, and writing the candidate factors into a factor feature library.
In this embodiment, referring to fig. 4, step 202 includes:
Step 2021, storing the value of the extracted slice field as an initial factor into a specified data platform.
In this embodiment, the value of the extracted field may be stored in a MongoDB distributed database.
Step 2022, preprocessing the initial factors in the data platform according to a preset preprocessing rule.
In practical application, when the factor feature library is established, the initial factors of the search data platform can be preprocessed.
In this embodiment, the preprocessing rule includes:
filling missing values: missing values in the initial factors are filled in to avoid errors; common methods include filling with a default value, the mean, the mode, or KNN imputation;
replacing extreme values: the extreme values include the maximum and minimum values, which can be replaced by the mean or by the preceding and following values;
data binning: the initial factors are grouped according to a certain rule; in this embodiment, either complex or simple binning can be performed according to the type of the initial factor and its business meaning.
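The three preprocessing rules can be sketched as below. This is a simplified illustration under stated assumptions: missing values are represented by `None` and filled with the mean, extremes are replaced by the mean of the remaining values, and binning uses fixed edges; the function names are hypothetical.

```python
from statistics import mean

def fill_missing(values):
    """Replace None with the mean of the observed values (mean filling)."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]

def replace_extremes(values):
    """Replace the maximum and the minimum with the mean of the remaining values."""
    lo, hi = min(values), max(values)
    rest = [v for v in values if v not in (lo, hi)]
    m = mean(rest) if rest else mean(values)
    return [m if v in (lo, hi) else v for v in values]

def bin_value(value, edges):
    """Simple binning: index of the first edge that `value` is below."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)
```

For example, with edges `[10, 20, 30]` a value of 15 falls into bin 1 and a value of 40 into bin 3. Mode filling, KNN imputation and business-specific binning would replace these helpers where required.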
Step 2023, calculating the preprocessed initial factors according to a preset factor construction algorithm to obtain candidate factors.
In practical application, the factor construction algorithm can be designed primarily from expert experience, for example calculating a time difference or an amount ratio, or judging whether the certificate number is consistent with the mobile phone number.
Step 2024, writing the candidate factors into a factor feature library according to a preset character format.
In practical application, the constructed factors can be written into the factor feature library according to a preset character format (including factor names, factor symbols and factor values).
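Factor construction and the write format can be sketched as follows. All field names (`report_time`, `accident_time`, `claim_amount`, and so on), the factor names, and the reading of "factor symbol" as a short code are assumptions made for illustration; the application does not specify them.

```python
from datetime import datetime

def build_candidate_factors(case):
    """Derive candidate factors from preprocessed initial factors (hypothetical fields)."""
    reported = datetime.fromisoformat(case["report_time"])
    occurred = datetime.fromisoformat(case["accident_time"])
    return {
        # time difference between the accident and the report, in hours
        "report_delay_hours": (reported - occurred).total_seconds() / 3600,
        # amount ratio between the claimed amount and the insured amount
        "claim_to_insured_ratio": case["claim_amount"] / case["insured_amount"],
        # consistency check between the phone on the certificate and the reporter's phone
        "phone_matches_id_holder": case["id_holder_phone"] == case["reporter_phone"],
    }

def to_feature_rows(factors, symbols):
    """Format the factors as (factor name, factor symbol, factor value) rows."""
    return [(name, symbols[name], value) for name, value in factors.items()]
```

Each row produced by `to_feature_rows` corresponds to one entry to be written into the factor feature library.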
Step 203, evaluating candidate factors in the factor feature library according to a preset evaluation function, and selecting available factors from the candidate factors according to an evaluation result.
In this embodiment, the candidate factors may be evaluated by using an evaluation function, and then, candidate factors whose evaluation scores reach a preset value may be selected as the available factors.
The formula of the preset evaluation function is not reproduced in this text; its symbols are defined as follows:
Q is the evaluation score, p is the number of covered positive examples of the evaluated factor, n is the number of covered negative examples of the evaluated factor, P is the number of covered positive examples of the whole sample, N is the number of covered negative examples of the whole sample, and W is the coverage weight, whose value is a preset value with 0 < W < 1. The larger the value of W, the larger the coverage and the lower the accuracy.
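Because the evaluation formula itself is not available in this text, the following sketch uses a stand-in score that is NOT the formula of the application. It only reproduces the stated behaviour that a larger coverage weight favours coverage over accuracy. Here p and n are the positive and negative examples covered by the factor, P and N are the positive and negative examples of the whole sample, and the weighted sum, the function names and the threshold are all assumptions.

```python
def evaluation_score(p, n, P, N, w):
    """Stand-in score (an assumption): w * coverage + (1 - w) * negative-exclusion rate."""
    if not 0 < w < 1:
        raise ValueError("w must satisfy 0 < w < 1")
    coverage = p / P if P else 0.0            # share of all positives that the factor covers
    exclusion = 1 - n / N if N else 0.0       # share of all negatives that the factor excludes
    return w * coverage + (1 - w) * exclusion

def select_available(stats, P, N, w, threshold):
    """Keep the candidate factors whose score reaches the preset threshold."""
    return [name for name, (p, n) in stats.items()
            if evaluation_score(p, n, P, N, w) >= threshold]
```

In this stand-in, raising `w` increases the influence of the coverage term, which rewards factors that cover many positive examples even when they also cover more negative examples.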
In this embodiment, the following steps may be added after step 203 and before step 204:
E1, judging whether an available factor with the evaluation score reaching the preset value has been selected; if yes, executing step 204, otherwise executing step E2;
E2, reducing the number of covered positive examples of the samples or of the evaluated factors;
In practical applications, the proportion by which the number of covered positive examples is reduced may be set; for example, each reduction is 10% of the number of covered positive examples used in the previous calculation.
E3, judging whether the reduced number of covered positive examples is lower than a preset threshold; if so, determining that rule generation has failed and exiting the rule generation flow, otherwise executing step 203.
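The loop formed by these steps can be sketched as below. The sketch assumes one possible reading of the text: the "number of covered positive examples" is treated as a requirement passed to the selection step, lowered by 10% of its previous value on each failed round. The names `select_with_relaxation`, `select`, `required_positives` and `floor` are hypothetical.

```python
def select_with_relaxation(select, required_positives, floor, shrink=0.10):
    """Repeat the selection, lowering the required positive coverage after each failure."""
    while True:
        available = select(required_positives)     # step 203: select available factors
        if available:                              # E1: at least one factor was selected
            return available
        required_positives *= (1 - shrink)         # E2: reduce by 10% of the previous value
        if required_positives < floor:             # E3: below the preset threshold
            return None                            # rule generation fails, exit the flow
```

A caller would proceed to the combination step when a list is returned and treat `None` as a failed rule generation.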
Step 204, combining the selected available factors according to a preset combination algorithm to generate a new rule.
Specifically, the method comprises the following steps:
forming the selected available factors into an available factor list;
combining factors in the available factor list by using a preset greedy search and distributed algorithm to obtain a plurality of rules;
pruning the rules by utilizing a pruning function, and extracting the rule with the maximum pruning function value as a new generation rule.
In practical application, the greedy search and the distributed algorithm can be directly utilized to process the factors to generate the rules, or the directional search can be firstly adopted to search the appointed factors in the appointed range, and then the greedy search and the distributed algorithm are utilized to process the factors to generate the rules.
Greedy search means that, when solving a problem, the choice that currently looks best is always made; that is, the locally best (most advantageous) choice is taken at each step in the hope of reaching a result that is best or close to best overall. It should be noted that greedy search has no fixed solution framework: the key to the algorithm is the choice of greedy strategy, and in practical application different strategies need to be selected for different problems.
A distributed algorithm divides a large computing task into several parts, hands the parts to other computers or servers for processing, and then merges the partial results into the solution of the original problem.
The greedy search and the distributed algorithm are combined, so that the problems of low calculation efficiency and difficult maintenance caused by overlarge calculation amount can be well solved.
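A greedy combination of factors can be sketched as follows. The sketch makes several assumptions: a factor is a predicate over a sample, a rule is a conjunction of factors, a rule is scored by (p - n)/(p + n) over the samples it covers, and a local `ThreadPoolExecutor` stands in for distributing the scoring work across machines. It grows a single rule; the names `rule_score` and `greedy_combine` and the `label` key are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def rule_score(rule, samples, factors):
    """Score a conjunction of factors as (p - n) / (p + n) over the samples it covers."""
    covered = [s for s in samples if all(factors[f](s) for f in rule)]
    p = sum(1 for s in covered if s["label"])
    n = len(covered) - p
    return (p - n) / (p + n) if covered else -1.0

def greedy_combine(factors, samples):
    """Grow one rule greedily: add the best-scoring factor until the score stops improving."""
    rule, best = [], rule_score([], samples, factors)
    remaining = sorted(factors)
    while remaining:
        candidates = [rule + [f] for f in remaining]
        # Score the candidate rules in parallel (local stand-in for distributed processing).
        with ThreadPoolExecutor() as pool:
            scores = list(pool.map(lambda r: rule_score(r, samples, factors), candidates))
        top = max(range(len(candidates)), key=scores.__getitem__)
        if scores[top] <= best:        # no candidate improves the current rule: stop
            break
        rule, best = candidates[top], scores[top]
        remaining.remove(rule[-1])     # the factor just added is no longer available
    return rule, best
```

Running the search from different starting factors, or keeping the intermediate rules of each round, would yield several rules to pass to the pruning step.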
The directional search may be a directed search over rules that have already been pushed and executed, from which related factors are derived. For example, a directional search for drunk driving rules returns drunk driving rule 1 (whose factors include the time of the accident and whether the vehicle was on a highway), ..., and drunk driving rule n (whose factors include whether a tree was hit and whether the vehicle was driven by the insured). The drunk driving factors extracted from these rules are: the time of the accident, whether on a highway, ..., whether a tree was hit, and whether driven by the insured.
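The directional search can be sketched as collecting the factors of all executed rules that match a topic. The representation of rules as a mapping from rule name to factor list, the keyword match on the rule name, and the function name `directed_factors` are assumptions made for illustration.

```python
def directed_factors(rules, keyword):
    """Collect, without duplicates and in first-seen order, the factors of matching rules."""
    found = []
    for name, rule_factors in rules.items():
        if keyword in name:                 # hypothetical matching criterion
            for factor in rule_factors:
                if factor not in found:
                    found.append(factor)
    return found
```

The resulting factor list could then be passed to the greedy search and the distributed algorithm in place of the full available factor list.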
In practical application, the generated rules can be counted, the generation of the rules is stopped after the preset threshold value of the rules is reached, and the rules are preferentially selected according to the accuracy or coverage, namely the optimal rules are selected.
In this embodiment, the pruning function F = (p - n)/(p + n) may be used to calculate the pruning function value of the original rule and of the rule with each factor deleted, and the rule with the largest pruning function value is extracted as the newly generated rule.
Pruning is an algorithm optimization strategy by which it can be determined which rule combinations should be discarded and which should be retained.
For example, suppose a rule G containing factors Y1, Y2, Y3 and Y4 is obtained by combining the available factors. The pruning function of rule G is calculated first, and then the pruning function of each rule obtained by deleting factors from G, including: G1 (containing factors Y2, Y3, Y4), G2 (containing Y3, Y4), G3 (containing Y4), G4 (containing Y1, Y3, Y4), G5 (containing Y1, Y4), and so on. If G1 has the largest pruning function value, G1 is retained and the other rules are discarded.
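The pruning comparison can be sketched as below. The sketch compares the original rule only with the variants obtained by deleting a single factor; the example in the text also lists variants with several factors removed, which repeated application or subset enumeration would cover. The `coverage` callback, which returns the covered positive and negative counts (p, n) of a rule, is a hypothetical interface.

```python
def pruning_value(p, n):
    """F = (p - n) / (p + n); a rule that covers nothing gets the lowest possible value."""
    return (p - n) / (p + n) if (p + n) else float("-inf")

def prune(rule, coverage):
    """Return the variant (original or one factor deleted) with the largest F."""
    variants = [tuple(rule)] + [tuple(f for f in rule if f != d) for d in rule]
    return max(variants, key=lambda r: pruning_value(*coverage(r)))
```

When two variants tie, `max` keeps the one listed first, so the original rule is preferred over its shortened versions.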
In this embodiment, after step 204, the following steps may also be performed:
and updating the generated rule once every preset interval time.
In practical applications, the factors in the data platform keep increasing over time. On the one hand, more factors make it possible to compute rules with higher precision and coverage; on the other hand, previous rules may no longer be applicable. Therefore, the current factors of the data platform (including the previous factors and the newly added factors) also need to be used: available factors are selected from them, and the selected available factors are combined according to the preset combination algorithm to generate a new rule. The generated optimal rule replaces the previous rule, and relevant information is output, such as the names and factors of the replacing rule and of the replaced rule.
In this embodiment, after step 204, the following steps may also be performed:
pushing the generated rule to an intelligent engine for execution;
and receiving the service data returned by the intelligent engine.
In this embodiment, after step 204, the following steps may be further performed:
and verifying the generated rule and outputting the rule with higher precision on the verification set.
The intelligent rule generating method of this embodiment automates the generation of business rules, which reduces workload, improves working efficiency, removes the dependence on human experience, and improves rule accuracy. In addition, the rules can learn and adapt by themselves, the thresholds set in the rules can be predicted accurately, and the cost of human input and of trial and error can be reduced.
Those skilled in the art will appreciate that all or part of the methods of the above embodiments may be implemented by a computer program stored in a computer-readable storage medium; when executed, the program may include the steps of the method embodiments described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, or a Read-Only Memory (ROM), or it may be a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
With further reference to fig. 5, as an implementation of the method shown in fig. 2, the present application provides an embodiment of an intelligent rule generating apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus is specifically applicable to various electronic devices.
As shown in fig. 5, the intelligent rule generating apparatus 500 according to the present embodiment includes: a slice processing module 501, a feature library module 502, a factor selection module 503, and a rule generation module 504. Wherein:
the slice processing module 501 is configured to perform field slicing processing on the collected service data to extract a slice field, where the slice field conforms to a field definition of a field query library of a flow link corresponding to the service data;
the feature library module 502 is configured to construct candidate factors according to the values of the extracted slice fields, and write the candidate factors into a factor feature library;
a factor selection module 503, configured to evaluate candidate factors in the factor feature library according to a preset evaluation function, and select an available factor from the candidate factors according to an evaluation result;
the rule generating module 504 is configured to combine the selected available factors according to a preset combination algorithm to generate a new rule.
In this embodiment, the slice processing module 501 includes:
the service acquisition sub-module is used for acquiring service data;
the task creation sub-module is used for creating asynchronous slicing tasks of the service data;
the query library determining submodule is used for calling the asynchronous slicing task and determining a field query library corresponding to query according to a service link where service data related to the asynchronous slicing task are located;
and the slicing sub-module is used for extracting slicing fields which are logically consistent with field definitions in the field query library from the service data by using a preset number-taking algorithm.
In practical application, the slicing sub-module may use the Java reflection mechanism to extract data. At run time, this mechanism can discover all attributes and methods of any class, and for any object it can call any of that object's methods and access any of its attributes. The specific method is as follows:
reading the field slicing task from the HDFS file;
and extracting slice fields which logically conform to field definitions in the field query library from the service data by using a Java reflection mechanism.
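The idea of extracting fields by name at run time can be sketched as follows. The sketch is written in Python and uses `getattr` and `hasattr` as a stand-in for the Java reflection mechanism named above; it does not reproduce the Java implementation. The class `ClaimRecord`, its fields, and the content of `FIELD_QUERY_LIBRARY` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ClaimRecord:
    """Hypothetical business-data object for one flow link."""
    policy_no: str
    claim_amount: float
    report_channel: str
    internal_note: str

# Hypothetical field query library: flow link -> field names defined for that link.
FIELD_QUERY_LIBRARY = {
    "claim_report": ["policy_no", "claim_amount", "report_channel"],
}

def extract_slice_fields(record, link):
    """Look up each defined field by name at run time and keep the ones that exist."""
    return {
        name: getattr(record, name)
        for name in FIELD_QUERY_LIBRARY[link]
        if hasattr(record, name)
    }

record = ClaimRecord("P001", 1200.0, "app", "not part of the slice")
slice_fields = extract_slice_fields(record, "claim_report")
```

Only the fields listed for the flow link are returned; `internal_note` is ignored because the hypothetical field query library does not define it.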
In this embodiment, the feature library module 502 includes:
a storage sub-module for storing the value of the slice field as an initial factor into a designated data platform;
the preprocessing sub-module is used for preprocessing the initial factors in the data platform according to a preset preprocessing rule;
the factor construction sub-module is used for calculating the preprocessed initial factors according to a preset factor construction algorithm to obtain candidate factors;
and the factor writing sub-module is used for writing the candidate factors into a factor feature library according to a preset character format.
In this embodiment, the factor selection module 503 includes:
the evaluation submodule is used for evaluating the initial factors by utilizing a preset evaluation function, and the formula is as follows:
and the selecting sub-module is used for selecting candidate factors with the evaluation scores reaching a preset value as available factors.
Wherein Q is the evaluation score, p is the number of covered positive examples of the evaluated factor, n is the number of covered negative examples of the evaluated factor, P is the number of covered positive examples of the whole sample, N is the number of covered negative examples of the whole sample, and W is the coverage weight, a preset value satisfying 0 < W < 1.
In this embodiment, the factor selecting module 503 may further include a first judging sub-module, an adjusting sub-module, and a second judging sub-module:
the first judging submodule is used for judging whether an available factor with the evaluation score reaching the preset value is selected or not;
the adjusting sub-module is used for reducing the number of covered positive examples of the samples or the evaluated factors when the judging result of the first judging sub-module is negative;
the second judging submodule is used for judging whether the number of the covered positive examples after the reduction is lower than a preset threshold value or not;
the selecting sub-module is further configured to trigger the step of selecting available factors from the candidate factors according to the evaluation result when the judging result of the second judging sub-module is negative;
in this case, the rule generating module is further configured to trigger the step of combining the selected available factors according to a preset combination algorithm to generate a new rule when the judging result of the first judging sub-module is yes, or to determine that the current rule generation has failed and exit the current rule generation flow when the judging result of the second judging sub-module is yes.
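The control flow of the judging and adjusting sub-modules can be sketched in Python as follows. This is one possible reading, not the implementation of the embodiment: the evaluation function is passed in as an opaque callable because its formula is not reproduced here, and the stand-in scorer, the parameter names, and the numbers are all hypothetical.

```python
def select_available(candidates, evaluate, preset_score, positives, min_positives, step=1):
    """Select factors; if none qualifies, reduce the positive count and retry."""
    while True:
        available = [f for f in candidates if evaluate(f, positives) >= preset_score]
        if available:
            return available      # first judgment is yes: go on to combine the factors
        positives -= step         # adjusting sub-module: reduce covered positive examples
        if positives < min_positives:
            return None           # second judgment is yes: rule generation fails
        # second judgment is no: loop back and select again

# Hypothetical stand-in scorer (NOT the evaluation function of the embodiment).
COVERED = {"Y1": 3, "Y2": 5}
def toy_evaluate(factor, positives):
    return COVERED[factor] / positives

found = select_available(["Y1", "Y2"], toy_evaluate, 0.45, positives=12, min_positives=8)
failed = select_available(["Y1", "Y2"], toy_evaluate, 0.45, positives=12, min_positives=12)
```

In the first call no factor qualifies at 12, the count is reduced to 11, and Y2 is then selected; in the second call the reduced count falls below the threshold, so the flow exits with a failure.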
In this embodiment, the rule generation module 504 includes:
the list generation sub-module is used for forming the selected available factors into an available factor list;
the factor combination sub-module is used for combining the available factors in the available factor list by using a preset greedy search and distributed algorithm to obtain a plurality of rules;
and the pruning sub-module is used for pruning the rules by utilizing a pruning function and extracting the rule with the maximum pruning function value as a new generation rule.
In practical application, the pruning submodule pruning the rules by using a pruning function includes:
calculating the pruning function of the original rule and the rule with each factor deleted by using the pruning function F= (p-n)/(p+n);
where p is the number of covered positive examples of the evaluated factor, and n is the number of covered negative examples of the evaluated factor.
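How a greedy search might combine available factors into several rules can be sketched in Python as follows. The embodiment only names "greedy search and distributed algorithm", so everything here is an assumption: each available factor seeds one rule, the factor that most increases F = (p - n)/(p + n) is added at each step, growth stops when F no longer increases, and the distributed part is omitted. The sample data and function names are hypothetical.

```python
# Hypothetical labelled samples: (factor values, label), label 1 = positive example.
SAMPLES = [
    ({"A": 1, "B": 1, "C": 0}, 1),
    ({"A": 1, "B": 1, "C": 1}, 1),
    ({"A": 1, "B": 0, "C": 1}, 0),
    ({"A": 0, "B": 1, "C": 1}, 0),
    ({"A": 1, "B": 0, "C": 0}, 0),
]

def f_value(rule, samples):
    """F = (p - n) / (p + n) for a rule read as a conjunction of factors."""
    covered = [y for x, y in samples if all(x[f] for f in rule)]
    p = sum(covered)
    n = len(covered) - p
    return (p - n) / (p + n) if covered else float("-inf")

def greedy_rule(seed, factors, samples):
    """Grow one rule from a seed factor, adding the best factor while F improves."""
    rule = [seed]
    while True:
        rest = [f for f in factors if f not in rule]
        if not rest:
            break
        best = max(rest, key=lambda f: f_value(rule + [f], samples))
        if f_value(rule + [best], samples) <= f_value(rule, samples):
            break
        rule.append(best)
    return tuple(sorted(rule))

def greedy_rules(factors, samples):
    """One greedy run per seed factor yields a plurality of candidate rules."""
    return {greedy_rule(seed, factors, samples) for seed in factors}

rules = greedy_rules(["A", "B", "C"], SAMPLES)
```

The resulting candidate rules would then be passed to the pruning step, which keeps the rule with the maximum pruning function value.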
In practical applications, the rule generating module 504 may further include a factor preprocessing sub-module, configured to pre-process the factors in the data platform in advance when the factor feature library is established.
In this embodiment, the preprocessing of the factors includes:
filling missing values: missing values in the factors are filled in to avoid errors. Common methods include filling with a default value, the mean, the mode, or a KNN-based estimate;
replacing extreme values: extreme values include maximum and minimum values, which can be replaced by the mean or by the preceding and following values;
data binning: the factors are divided into bins according to a certain rule; in this embodiment, either complex or simple binning can be applied according to the factor type and business meaning.
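The three preprocessing steps can be sketched in Python as follows. The sketch picks one option for each step (mean filling, mean replacement of out-of-range values, and threshold-based binning); the bounds `low` and `high`, the bin edges, and the sample values are hypothetical.

```python
from statistics import mean

def fill_missing(values):
    """Fill missing entries (None) with the mean of the known values."""
    avg = mean(v for v in values if v is not None)
    return [avg if v is None else v for v in values]

def replace_extremes(values, low, high):
    """Replace values outside [low, high] with the mean of the in-range values."""
    avg = mean(v for v in values if low <= v <= high)
    return [v if low <= v <= high else avg for v in values]

def bin_values(values, edges):
    """Simple binning: the bin index is the number of edges the value reaches."""
    return [sum(v >= e for e in edges) for v in values]

raw = [10, None, 30, 1000, 20]              # hypothetical factor values
filled = fill_missing(raw)                  # None replaced by the mean of the rest
cleaned = replace_extremes(filled, 0, 500)  # 1000 treated as an extreme value
binned = bin_values(cleaned, [25, 100])     # three bins: <25, 25 to <100, >=100
```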
In practical application, the intelligent rule generating apparatus 500 may further include:
and the updating module is used for triggering the intelligent rule generating device 500 to update the generated rule once every preset interval time.
In practical application, the intelligent rule generating apparatus 500 may further include:
the pushing module is used for pushing the generated rule to the intelligent engine for execution;
and the receiving module is used for receiving the service data returned by the intelligent engine.
In practical application, the intelligent rule generating apparatus 500 may further include:
and the verification module is used for verifying the generated rule and outputting the rule with higher precision on the verification set.
The intelligent rule generating apparatus of this embodiment generates business rules automatically, which reduces workload, improves working efficiency, removes the dependence on human experience, and improves rule accuracy. It also enables the rules to self-learn and self-adapt, allows the thresholds set in the rules to be predicted accurately, and reduces manual effort and trial-and-error cost.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 6, fig. 6 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 6 comprises a memory 61, a processor 62 and a network interface 63 that are communicatively connected to each other via a system bus. It is noted that only the computer device 6 with components 61-63 is shown in the figure, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will appreciate that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, microprocessors, Application Specific Integrated Circuits (ASICs), Field-Programmable Gate Arrays (FPGAs), Digital Signal Processors (DSPs), embedded devices, etc.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 61 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory), Random Access Memory (RAM), Static Random Access Memory (SRAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Programmable Read-Only Memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or internal memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device 6. Of course, the memory 61 may also comprise both an internal storage unit of the computer device 6 and an external storage device. In this embodiment, the memory 61 is generally used to store the operating system and the various types of application software installed on the computer device 6, such as the program code of the intelligent rule generating method. Further, the memory 61 may be used to temporarily store various types of data that have been output or are to be output.
The processor 62 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 62 is typically used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to execute the program code stored in the memory 61 or process data, such as the program code for executing the intelligent rule generating method.
The network interface 63 may comprise a wireless network interface or a wired network interface, which network interface 63 is typically used for establishing a communication connection between the computer device 6 and other electronic devices.
The present application also provides another embodiment, namely, a computer-readable storage medium storing an intelligent rule generating method program executable by at least one processor to cause the at least one processor to perform the steps of the intelligent rule generating method as described above.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
It is apparent that the above-described embodiments are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application but do not limit the scope of the patent claims. The present application may be embodied in many different forms; these embodiments are provided so that the present disclosure will be understood thoroughly and completely. Although the application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features. All equivalent structures made using the content of the specification and the drawings of the application, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of protection of the application.

Claims (7)

1. An intelligent rule generating method is characterized by comprising the following steps:
performing field slicing processing on the acquired service data to extract slice fields, wherein the slice fields conform to field definitions of a field query library of a flow link corresponding to the service data;
constructing candidate factors according to the values of the extracted slice fields, and writing the candidate factors into a factor feature library;
evaluating candidate factors in the factor feature library according to a preset evaluation function, and selecting available factors from the candidate factors according to an evaluation result;
combining the selected available factors according to a preset combination algorithm to generate a new generation rule;
the step of evaluating candidate factors in the factor feature library according to a preset evaluation function and selecting available factors from the candidate factors according to an evaluation result comprises the following steps:
and evaluating the initial factor by using a preset evaluation function, wherein the formula is as follows:
selecting candidate factors with the evaluation score reaching a preset value as available factors;
wherein Q is an evaluation score, p is the number of covered positive examples of the evaluated factor, n is the number of covered negative examples of the evaluated factor, P is the number of covered positive examples of the whole sample, N is the number of covered negative examples of the whole sample, and W is a coverage weight, a preset value satisfying 0 < W < 1;
the step of combining the selected available factors according to a preset combination algorithm to generate a new generation rule comprises the following steps:
forming the selected available factors into an available factor list;
combining the available factors in the available factor list by using a preset greedy search and a distributed algorithm to obtain a plurality of rules;
pruning the rules by utilizing a pruning function, and extracting the rule with the maximum pruning function value as a new generation rule;
the pruning the plurality of rules with a pruning function includes:
calculating the pruning function of the original rule and the rule with each factor deleted by using the pruning function F= (p-n)/(p+n);
where p is the number of covered positive examples of the evaluated factor, and n is the number of covered negative examples of the evaluated factor.
2. The intelligent rule generating method according to claim 1, wherein the steps of constructing candidate factors from the values of the extracted slice fields and writing the candidate factors into a factor feature library include:
storing the value of the slice field as an initial factor into a designated data platform;
preprocessing the initial factors in the data platform according to a preset preprocessing rule;
calculating the preprocessed initial factors according to a preset factor construction algorithm to obtain candidate factors;
and writing the candidate factors into a factor feature library according to a preset character format.
3. The intelligent rule generating method according to claim 1, wherein after the step of evaluating candidate factors in the factor feature library according to a preset evaluation function and selecting available factors from the candidate factors according to an evaluation result, the method further comprises:
judging whether an available factor with the evaluation score reaching the preset value is selected or not;
if yes, triggering the step of combining the selected available factors according to a preset combination algorithm to generate a new generation rule, otherwise, reducing the number of covered positive examples of the samples or the evaluated factors;
and judging whether the number of the covered positive examples after the reduction is lower than a preset threshold value, if so, determining that the generation of the rule fails, exiting the flow of the generation of the rule, otherwise triggering the step of selecting the available factors from the candidate factors according to the evaluation result.
4. The intelligent rule generating method according to claim 1, wherein the step of performing field slicing processing on the collected service data to extract slice fields comprises:
acquiring service data;
creating an asynchronous slicing task of the service data;
the asynchronous slicing task is called, and a field query library corresponding to query is determined according to the business links where the business data related to the asynchronous slicing task are located;
and extracting slice fields which logically conform to field definitions in the field query library from the service data by using a Java reflection mechanism.
5. An intelligent rule generating apparatus, comprising:
the slice processing module is used for carrying out field slicing processing on the acquired service data to extract slice fields, wherein the slice fields conform to field definitions of a field query library of a flow link corresponding to the service data;
the feature library module is used for constructing candidate factors according to the values of the extracted slice fields and writing the candidate factors into a factor feature library;
the factor selection module is used for evaluating candidate factors in the factor feature library according to a preset evaluation function and selecting available factors from the candidate factors according to an evaluation result;
the rule generation module is used for combining the selected available factors according to a preset combination algorithm to generate a new rule;
the factor selection module includes:
the evaluation submodule is used for evaluating the initial factors by utilizing a preset evaluation function, and the formula is as follows:
the selecting submodule is used for selecting candidate factors with the evaluation scores reaching a preset value as available factors;
wherein Q is an evaluation score, p is the number of covered positive examples of the evaluated factor, n is the number of covered negative examples of the evaluated factor, P is the number of covered positive examples of the whole sample, N is the number of covered negative examples of the whole sample, and W is a coverage weight, a preset value satisfying 0 < W < 1;
the rule generation module includes:
the list generation sub-module is used for forming the selected available factors into an available factor list;
the factor combination sub-module is used for combining the available factors in the available factor list by using a preset greedy search and distributed algorithm to obtain a plurality of rules;
the pruning submodule is used for pruning the rules by utilizing a pruning function and extracting the rule with the maximum pruning function value as a new rule;
the pruning sub-module pruning the plurality of rules using a pruning function includes:
calculating the pruning function of the original rule and the rule with each factor deleted by using the pruning function F= (p-n)/(p+n);
where p is the number of covered positive examples of the evaluated factor, and n is the number of covered negative examples of the evaluated factor.
6. A computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the intelligent rule generation method of any one of claims 1 to 4 when the computer program is executed.
7. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the intelligent rule generating method according to any one of claims 1 to 4.
CN201910433971.8A 2019-05-23 2019-05-23 Intelligent rule generation method, intelligent rule generation device, computer equipment and storage medium Active CN110348669B (en)

Publications (2)

Publication Number Publication Date
CN110348669A CN110348669A (en) 2019-10-18
CN110348669B true CN110348669B (en) 2023-08-22

Family

ID=68174260

Country Status (1)

Country Link
CN (1) CN110348669B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112100250B (en) * 2020-11-23 2021-03-16 支付宝(杭州)信息技术有限公司 Data processing method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003076937A (en) * 2001-09-06 2003-03-14 Shinichi Morishita Method and system for extracting association rule and association rule extraction program
CN108182515A (en) * 2017-12-13 2018-06-19 中国平安财产保险股份有限公司 Intelligent rules engine rule output method, equipment and computer readable storage medium
CN109409648A (en) * 2018-09-10 2019-03-01 平安科技(深圳)有限公司 Claims Resolution air control method, apparatus, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant