CN114548830B - Selection operator determining method, strategy combination optimizing method and device - Google Patents


Info

Publication number
CN114548830B
CN114548830B (application CN202210405947.5A)
Authority
CN
China
Prior art keywords
combination
strategy
condition
determining
policy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210405947.5A
Other languages
Chinese (zh)
Other versions
CN114548830A
Inventor
顾咏丰
宁跃
吴华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd filed Critical Alipay Hangzhou Information Technology Co Ltd
Priority to CN202210405947.5A
Publication of CN114548830A
Application granted
Publication of CN114548830B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Engineering & Computer Science (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Technology Law (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of this specification provide a method for determining a selection operator, and a method and apparatus for optimizing a policy combination. In the selection operator determining method, a target condition and a constraint condition for selecting a policy combination are determined, and a to-be-determined selection operator is constructed based on them; the selection operator comprises a plurality of base operators and corresponding to-be-determined coefficients. Multiple groups of coefficient values are determined, each yielding a candidate selection operator. For any candidate selection operator, the policy combination with the highest recognition effect score is selected from a first policy aggregate using a plurality of risk user samples and that candidate operator, and the target value of the recognition result corresponding to that combination is determined and used as the target value of the candidate operator. When multiple groups of candidate selection operators and corresponding target values have been obtained, the candidate selection operator whose target value meets the preset optimal condition is determined to be the selection operator.

Description

Selection operator determining method, strategy combination optimizing method and device
Technical Field
One or more embodiments of the present disclosure relate to the field of data processing technologies, and in particular to a selection operator determining method, a policy combination optimizing method, and corresponding apparatuses.
Background
With the development of society and the progress of science and technology, more and more service platforms have emerged, providing various services to meet users' needs in life and work. When providing services to users, a service platform often makes service-related decisions using pre-established policies. For example, a credit platform uses preset policies to determine whether a user is a risky user and, accordingly, whether to provide credit loan services to that user. In practice, in order to optimize the decision effect, a policy aggregate is often established first, and the most suitable policy combination is selected from it for use.
It is therefore desirable to have an improved solution that improves the effectiveness of selecting the most appropriate combination of policies from the overall set of policies.
Disclosure of Invention
One or more embodiments of the present specification describe a selection operator determining method, a policy combination optimizing method, and an apparatus to determine a more suitable selection operator, thereby improving the effect of selecting the most suitable policy combination from a total set of policies. The specific technical scheme is as follows.
In a first aspect, an embodiment provides a method for determining a selection operator in a policy combination, where the selection operator is used to determine, when a policy combination is selected from a first policy aggregate, recognition effect scores of the policy combination for recognition results of multiple risk user samples, and the policy combination is used to identify a risk user; the method comprises the following steps:
determining a target condition and a constraint condition when selecting the policy combination; the target condition comprises that a target value for the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value for the recognition result meets a preset limiting condition;
constructing a selection operator to be determined based on the target condition and the constraint condition, wherein the selection operator comprises a plurality of base operators and corresponding coefficients to be determined;
determining multiple groups of coefficient values of the coefficients to obtain corresponding candidate selection operators; for any candidate selection operator, selecting a policy combination with the highest recognition effect score from the first policy aggregate by using a plurality of risk user samples and the candidate selection operator, and determining a target value of the recognition result corresponding to the policy combination as the target value corresponding to the candidate selection operator;
and when multiple groups of candidate selection operators and corresponding target values are obtained, determining the candidate selection operator whose target value meets the preset optimal condition as the selection operator.
In one embodiment, the step of constructing a selection operator to be determined based on the target condition and constraint condition comprises:
constructing a plurality of base operators based on the target value in the target condition and the constraint value in the constraint condition;
and combining the plurality of base operators, based on the to-be-determined coefficient assigned to each base operator, to obtain the to-be-determined selection operator.
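As a concrete illustration of the two construction steps above, the following Python sketch builds two base operators from the target and constraint values and combines them with to-be-determined coefficients. All names, the recognition-result fields, and the linear combination form are assumptions for illustration; the patent does not fix a particular form.

```python
# Hypothetical sketch: base operators derived from the target value (risky
# users identified) and the constraint value (non-risky users identified),
# combined linearly with to-be-determined coefficients.

def base_hit_count(result):
    # Base operator built from the target value.
    return result["risky_identified"]

def base_disturb_count(result):
    # Base operator built from the constraint value.
    return result["non_risky_identified"]

def make_selection_operator(coefs):
    """Combine the base operators using the to-be-determined coefficients."""
    w1, w2 = coefs
    def operator(result):
        # Higher score: more risky users caught under a smaller constraint value.
        return w1 * base_hit_count(result) - w2 * base_disturb_count(result)
    return operator

op = make_selection_operator((1.0, 0.5))
score = op({"risky_identified": 40, "non_risky_identified": 10})  # 35.0
```

Each distinct coefficient tuple passed to `make_selection_operator` yields one candidate selection operator, matching the claim's "multiple groups of coefficient values".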
In one embodiment, the step of determining a plurality of sets of values of the coefficients includes:
determining a group of coefficient values of the coefficients to obtain corresponding operators to be selected;
after determining the target value corresponding to the candidate selection operator, the method further includes:
and inputting the target value and the corresponding group of coefficient values into a Bayesian model, determining an updated group of coefficient values through the Bayesian model, and returning to the step of selecting the policy combination with the highest recognition effect score from the first policy aggregate by using the plurality of risk user samples and the candidate selection operator.
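The evaluate-feedback-propose loop described above can be sketched as follows. This is a hedged stand-in: a real implementation would plug a Bayesian surrogate model (for example, a Gaussian process, as in common Bayesian optimization libraries) into the proposal step; here a simple perturb-the-best (hill-climbing) proposal is substituted so the sketch stays self-contained. All names are illustrative.

```python
import random

def refine_coefficients(evaluate, init, n_iter=30, seed=0):
    """Iteratively refine coefficient values: evaluate a candidate group,
    feed the resulting target value back, and propose an updated group.
    The proposal step below is a hill-climbing stand-in for the Bayesian
    model described in the embodiment."""
    rng = random.Random(seed)
    best_coefs, best_val = tuple(init), evaluate(init)
    coefs = tuple(init)
    for _ in range(n_iter):
        val = evaluate(coefs)
        if val > best_val:
            best_coefs, best_val = coefs, val
        # Proposal step: perturb the best coefficients found so far.
        coefs = tuple(c + rng.gauss(0.0, 0.2) for c in best_coefs)
    return best_coefs, best_val

# Toy target function peaking at coefficients (1, 2); higher is better.
best, val = refine_coefficients(
    lambda c: -((c[0] - 1) ** 2 + (c[1] - 2) ** 2), (0.0, 0.0))
```

In the patent's setting, `evaluate` would run the full inner step: use the candidate operator to pick the best-scoring policy combination and return that combination's target value.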
In one embodiment, the step of determining a plurality of sets of values of the coefficients includes:
and determining multiple groups of coefficient values of the coefficients by utilizing a random search algorithm or a grid search algorithm.
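The two search strategies named above can be sketched directly; `grid_coefficient_sets` and `random_coefficient_sets` are hypothetical helper names, not from the patent.

```python
import itertools
import random

def grid_coefficient_sets(values_per_coef):
    """Grid search: every combination of per-coefficient candidate values."""
    return list(itertools.product(*values_per_coef))

def random_coefficient_sets(n_sets, n_coefs, low=0.0, high=1.0, seed=0):
    """Random search: draw n_sets coefficient tuples uniformly from [low, high]."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(low, high) for _ in range(n_coefs))
            for _ in range(n_sets)]

grid = grid_coefficient_sets([[0.5, 1.0], [0.1, 0.2, 0.3]])  # 2 x 3 = 6 groups
rand = random_coefficient_sets(n_sets=10, n_coefs=2)
```

Either list of coefficient groups can then be mapped to candidate selection operators, one per group.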
In one embodiment, the step of selecting a policy combination with the highest recognition effect score from the first policy aggregate by using a plurality of risk user samples and the candidate selection operator includes:
determining a plurality of groups of policy combinations from the first policy aggregate;
for any group of policy combinations, determining the recognition results of the policy combination for the plurality of risk user samples, and determining the recognition effect score of the recognition results by using the candidate selection operator;
and when the multiple groups of strategy combinations and the corresponding recognition effect scores are obtained, determining the strategy combination corresponding to the highest recognition effect score as the selected strategy combination.
In one embodiment, the step of determining a plurality of sets of policy combinations from the first aggregate of policies comprises:
determining an initial plurality of sets of policy combinations from the first aggregate set of policies;
after determining the selected policy combination from the initial plurality of sets of policy combinations, further comprising:
and adding, to the selected policy combination, each of the selectable policies in the first policy aggregate other than those in the selected policy combination, to obtain a plurality of updated policy combinations, and returning to the step of determining, for any group of policy combinations, the recognition results of the policy combination for the plurality of risk user samples.
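The growth procedure in this embodiment resembles greedy forward selection, which can be sketched as follows. The score function and policy names are illustrative assumptions; in the patent's setting, `score` would apply the candidate selection operator to the combination's recognition results.

```python
def greedy_select(policies, score, max_size=None):
    """Greedy forward selection of a policy combination.
    policies: the policy aggregate (any hashable items).
    score(combo): recognition effect score of a combination (higher is better).
    Grows the selected combination one policy at a time while the score improves."""
    selected = []
    best = score(selected)
    while max_size is None or len(selected) < max_size:
        remaining = [p for p in policies if p not in selected]
        if not remaining:
            break
        # Add each remaining policy to the current combination and score it.
        candidates = [(score(selected + [p]), p) for p in remaining]
        cand_score, cand_policy = max(candidates)
        if cand_score <= best:
            break  # no single addition improves the score
        selected.append(cand_policy)
        best = cand_score
    return selected, best

# Toy score: each policy contributes an individual gain.
gains = {"A": 5, "B": 3, "C": -1}
combo, s = greedy_select(list(gains), lambda c: sum(gains[p] for p in c))
# combo == ["A", "B"], s == 8
```

The loop stops when no remaining policy raises the score, mirroring the "return and re-evaluate updated combinations" step above.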
In one embodiment, any one of the risk user samples includes user characteristics of the corresponding user, and any policy in any policy combination includes: a judgment condition set based on the user characteristics, and a risk judgment result applied when the judgment condition is met.
In one embodiment, the target value includes the number of identified risky users, and the preset preferred condition includes that the number of risky users takes a maximum value; the constraint value includes the number of identified non-risky users, and the preset limiting condition includes that the number of non-risky users takes a minimum value.
In one embodiment, the target value includes the abnormal transaction amount of the identified risky users, and the preset preferred condition includes that the abnormal transaction amount takes a maximum value; the constraint value includes the normal transaction amount of the identified risky users, and the preset limiting condition includes that the normal transaction amount takes a minimum value.
In a second aspect, an embodiment provides a method for optimizing a policy combination, for selecting a policy combination from a first policy aggregate by using a selection operator, where the selection operator is used to determine recognition effect scores of recognition results of the policy combination for a plurality of risk user samples, and the policy combination is used to identify a risk user, and the method includes:
determining a target condition and a constraint condition when selecting the strategy combination; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
constructing a to-be-determined selection operator based on the target condition and the constraint condition, the selection operator including a plurality of base operators and corresponding to-be-determined coefficients;
determining multiple groups of coefficient values of the coefficients to obtain corresponding candidate selection operators; for any candidate selection operator, selecting a policy combination with the highest recognition effect score from the first policy aggregate by using a plurality of risk user samples and the candidate selection operator, and determining a target value of the recognition result corresponding to the policy combination;
and when the target values corresponding to the multiple groups of strategy combinations are obtained, determining the strategy combination corresponding to the target value meeting the preset optimal condition as the optimized strategy combination.
In a third aspect, an embodiment provides a method for optimizing a policy combination, including:
acquiring a second strategy aggregate to be optimized;
obtaining the selection operator determined in the first aspect;
and selecting the strategy combination with the highest recognition effect score from the second strategy total set as the optimized strategy combination by utilizing a plurality of risk user samples and the determined selection operator.
In a fourth aspect, an embodiment provides a method for determining a selection operator in a policy combination, where the selection operator is used to determine recognition effect scores of recognition results of policy combinations for a plurality of task labeling samples when the policy combinations are selected from a first policy aggregate, and the policy combinations are used to execute specified recognition tasks; the method comprises the following steps:
determining a target condition and a constraint condition when selecting the policy combination; the target condition comprises that a target value for the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value for the recognition result meets a preset limiting condition;
constructing a selection operator to be determined based on the target condition and the constraint condition, wherein the selection operator comprises a plurality of base operators and corresponding coefficients to be determined;
determining multiple groups of coefficient values of the coefficients to obtain corresponding candidate selection operators; for any candidate selection operator, selecting a policy combination with the highest recognition effect score from the first policy aggregate by using a plurality of task labeling samples and the candidate selection operator, and determining a target value of the recognition result corresponding to the policy combination as the target value corresponding to the candidate selection operator;
and when multiple groups of candidate selection operators and corresponding target values are obtained, determining the candidate selection operator whose target value meets the preset optimal condition as the selection operator.
In a fifth aspect, an embodiment provides a selection operator determining device in a policy combination, where the selection operator is used to determine, when a policy combination is selected from a first policy aggregate, an identification effect score of the policy combination for identification results of a plurality of risk user samples, and the policy combination is used to identify a risk user; the device comprises:
The first determining module is configured to determine target conditions and constraint conditions when the strategy combination is selected; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
a first construction module configured to construct a selection operator to be determined based on the target condition and constraint condition, the selection operator including a plurality of base operators and corresponding coefficients to be determined;
a second determining module configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding candidate selection operators, select, for any candidate selection operator, the policy combination with the highest recognition effect score from the first policy aggregate by using multiple risk user samples and the candidate selection operator, and determine a target value of the recognition result corresponding to the policy combination as the target value corresponding to the candidate selection operator;
and the third determining module is configured to determine the candidate selection operator corresponding to the target value meeting the preset optimal condition as the selection operator when multiple groups of candidate selection operators and corresponding target values are obtained.
In a sixth aspect, an embodiment provides an apparatus for optimizing a policy combination, configured to select a policy combination from a first policy aggregate by using a selection operator, where the selection operator is used to determine recognition effect scores of recognition results of the policy combination for multiple risk user samples, and the policy combination is used to identify a risk user, and the apparatus includes:
the first determining module is configured to determine a target condition and a constraint condition when the strategy combination is selected; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
a first construction module configured to construct a selection operator to be determined based on the target condition and the constraint condition, the selection operator including a plurality of base operators and corresponding coefficients to be determined;
a second determining module, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, select, for any one of the to-be-selected operators, a policy combination with the highest recognition effect score from the first policy aggregate by using multiple risk user samples and the to-be-selected operator, and determine a target value of a recognition result corresponding to the policy combination;
and a first optimization module configured to determine, when the target values corresponding to the multiple groups of policy combinations are obtained, the policy combination corresponding to the target value meeting the preset optimal condition as the optimized policy combination.
In a seventh aspect, an embodiment provides an apparatus for optimizing a policy combination, including:
the first obtaining module is configured to obtain a second strategy total set to be optimized;
a second obtaining module configured to obtain the selection operator determined in the first aspect;
and the second optimization module is configured to select the strategy combination with the highest recognition effect score from the second strategy total set as the optimized strategy combination by utilizing a plurality of risk user samples and the determined selection operator.
In an eighth aspect, embodiments provide a selection operator determining device in a policy combination, where the selection operator is configured to determine, when a policy combination is selected from a first policy aggregate, recognition effect scores of recognition results of the policy combination for a plurality of task annotation samples, and the policy combination is configured to execute a specified recognition task; the device comprises:
the first determining module is configured to determine a target condition and a constraint condition when the strategy combination is selected; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
A first construction module configured to construct a selection operator to be determined based on the target condition and constraint condition, the selection operator including a plurality of base operators and corresponding coefficients to be determined;
a fourth determining module, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, select, for any one of the to-be-selected operators, a policy combination with the highest recognition effect score from the first policy aggregate by using multiple task labeling samples and the to-be-selected operator, and determine a target value of a recognition result corresponding to the policy combination as a target value corresponding to the to-be-selected operator;
and the third determining module is configured to determine the to-be-selected operators corresponding to the target values meeting the preset optimal conditions as the selection operators when multiple groups of to-be-selected operators and corresponding target values are obtained.
In a ninth aspect, embodiments provide a computer-readable storage medium having a computer program stored thereon, which, when executed in a computer, causes the computer to perform the method of any one of the first to fourth aspects.
In a tenth aspect, an embodiment provides a computing device, including a memory and a processor, where the memory stores executable code, and the processor executes the executable code to implement the method of any one of the first to fourth aspects.
In the method and apparatus provided by the embodiments of this specification, a to-be-determined selection operator is constructed based on the target condition and the constraint condition, and different values are assigned to its coefficients to obtain different candidate selection operators. Each candidate selection operator is used to select the policy combination with the highest recognition effect score from the policy aggregate, and a candidate operator is chosen as the determined selection operator according to the target value of the recognition result corresponding to that combination. In this way, the selection operator need not be constructed manually, which is time-consuming and labor-intensive; instead, a more suitable operator is selected adaptively from the constructed candidates. As the quality of the selection operator improves, the effect of selecting the most suitable policy combination from the policy aggregate improves correspondingly.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic diagram of an implementation scenario of an embodiment disclosed in the present specification;
fig. 2 is a schematic flowchart of a method for determining a selection operator in a policy combination according to an embodiment;
FIG. 3 is a flowchart illustrating a policy combination optimization method according to an embodiment;
FIG. 4 is a flowchart illustrating a method for optimizing a policy combination according to an embodiment;
fig. 5 is a schematic flowchart of a method for determining a selection operator in a policy combination according to an embodiment;
FIG. 6 is a schematic block diagram of an apparatus for determining a selection operator in a policy combination according to an embodiment;
FIG. 7 is a schematic block diagram of an apparatus for optimizing a combination of policies according to an embodiment;
FIG. 8 is a schematic block diagram of an apparatus for optimizing a combination of policies according to an embodiment;
fig. 9 is a schematic block diagram of a selection operator determining apparatus in a policy combination according to an embodiment.
Detailed Description
The scheme provided by the specification is described below with reference to the accompanying drawings.
In many business scenarios, a policy aggregate contains thousands of alternative policies, and the most suitable policy subset (i.e., a policy combination) must be selected from it to execute business decisions in the corresponding scenario. The policy aggregate is also called a policy pool or policy base, and policies are also called rules. Generally, a policy combination can be selected from the policy aggregate, a plurality of risk user samples are identified using that combination to obtain recognition results, and a selection operator determines a recognition effect score from those results. In this way, the recognition effect scores of multiple groups of policy combinations can be obtained, and the combination with the highest score is determined to be the most suitable one. Fig. 1 is a schematic view of an implementation scenario of an embodiment disclosed in this specification: the policy aggregate exemplarily includes 12 policies, such as (i), (ii), and (iii); policy combinations, for example the combination of (i), (ii), (iii), and (iv), can be selected from the aggregate and used to identify a plurality of user samples, each yielding corresponding recognition results, from which a selection operator determines a recognition effect score, so that the most suitable combination can be found according to the highest score. When selecting policy combinations from the aggregate in this process, a certain algorithm, such as a greedy algorithm, may be adopted.
The selection operator is used to select the most suitable policy combination from among the policy combinations: the recognition effect score of a recognition result is computed by the selection operator and serves as the recognition effect of the corresponding policy combination, and the combination's effectiveness at identifying risky users is then evaluated by the value of that score. That is, applying the selection operator to the recognition result yields an operator value, which is the recognition effect score. The selection operator may be expressed as a functional mapping or calculation formula, e.g., y = f(x), where x represents a parameter of the recognition result, f may be understood as the selection operator, and y is the computed recognition effect score.
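A minimal illustration of the y = f(x) notation, where the specific form of f is purely an assumption:

```python
# x: a parameter taken from the recognition result (here, assumed to be the
# count of correctly identified risky users); f: the selection operator;
# y: the resulting recognition effect score. The form of f is illustrative.
def f(x):
    return 2.0 * x  # assumed operator: score proportional to identified risky users

y = f(30)  # recognition effect score: 60.0
```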
A proper selection operator must be designed in order to select a proper subset of rules. In general, the selection operator may be defined by an expert in the relevant field based on domain knowledge and business knowledge; however, such manual definition is time-consuming and labor-intensive.
The embodiments of this specification provide a method for determining a selection operator, which can adaptively determine a preferred selection operator and thereby improve the effect of selecting the most suitable policy combination from a policy aggregate. The method includes the following steps. S210: determining a target condition and a constraint condition when selecting a policy combination, where the target condition comprises that a target value for the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value for the recognition result meets a preset limiting condition. S220: constructing a to-be-determined selection operator based on the target condition and the constraint condition, the selection operator comprising a plurality of base operators and corresponding to-be-determined coefficients. S230: determining multiple groups of coefficient values to obtain corresponding candidate selection operators; for any candidate selection operator, selecting the policy combination with the highest recognition effect score from the first policy aggregate by using a plurality of risk user samples and the candidate operator, and determining the target value of the recognition result corresponding to that combination as the target value of the candidate operator. S240: when multiple groups of candidate selection operators and corresponding target values are obtained, determining the candidate selection operator whose target value meets the preset optimal condition as the selection operator.
In this embodiment, by constructing candidate selection operators and using their corresponding target values, a selection operator with a better effect is adaptively selected from among multiple candidate selection operators, without constructing the selection operator manually in a time-consuming and labor-intensive way. A better selection operator can thus be obtained adaptively, which improves the effect of selecting the most appropriate strategy combination from the strategy aggregate. The present embodiment will be described in detail with reference to fig. 2.
Fig. 2 is a schematic flowchart of a method for determining a selection operator in a policy combination according to an embodiment. Wherein the selection operator is used for determining the recognition effect scores of the recognition results of the strategy combination for the multiple risk user samples when the strategy combination is selected from the first strategy aggregate Z1. That is, a policy combination is used to identify a risky user (i.e., an identification result) from a plurality of risky user samples, and a selection operator is used to determine an identification effect score of the risky user. The first policy aggregate Z1 is any one of those created for identifying risky users.
The multiple risk user samples may be understood as multiple user samples that include risky users, for example, N user samples containing both risky users and normal users (i.e., non-risk users), where a risky user refers to a user carrying risk. These N user samples may collectively be referred to as risk user samples, but this does not mean that every such sample corresponds to a risky user. Any one risk user sample includes the user characteristics of the corresponding user.
Any policy combination contains at least one policy, and any policy includes: a judgment condition set based on the user characteristics, and a risk judgment result output when the judgment condition is met. A strategy can be expressed as IF <judgment condition> THEN <judgment result>. For example, one policy is: IF <the user's bad transaction records number more than 1> AND <the user's overdue unpaid payments number more than 2> THEN <the user is a risky user>.
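As a minimal illustrative sketch (not part of the specification; all class, field, and function names here are assumptions), an IF-THEN policy and the recognition result of a policy combination over user samples might be modeled as follows:

```python
# Sketch: IF <judgment condition> THEN <risky> policies and the recognition
# result of a policy combination. All names are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, List, Set

@dataclass
class UserSample:
    user_id: int
    bad_transactions: int
    overdue_payments: int
    is_risky: bool  # ground-truth label carried by the sample

# A policy is a judgment condition over user characteristics.
Policy = Callable[[UserSample], bool]

def example_policy(u: UserSample) -> bool:
    # IF <bad transaction records > 1> AND <overdue unpaid payments > 2>
    # THEN <user is a risky user>
    return u.bad_transactions > 1 and u.overdue_payments > 2

def identify(combination: List[Policy], samples: List[UserSample]) -> Set[int]:
    """Recognition result of a policy combination: the users flagged by
    any policy in the combination (aggregated across policies)."""
    flagged: Set[int] = set()
    for policy in combination:
        flagged |= {u.user_id for u in samples if policy(u)}
    return flagged
```

Aggregating the users flagged by each member policy mirrors how this embodiment later combines per-policy recognition results into the combination's recognition result.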
The method embodiments may be performed by any computing device that may be implemented by any apparatus, device, platform, cluster of devices, etc. having computing and processing capabilities.
In step S210, target conditions and constraint conditions when selecting a policy combination are determined. The target condition comprises that the target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that the constraint value aiming at the recognition result meets a preset limiting condition.
In one scenario, the goal is to identify as many risky users as possible, and the constraint is to disturb normal users as little as possible. The target value obj may include the number of identified risky users, the preset preference condition includes that the number of risky users takes a maximum value, and the target condition may be denoted as max(obj). The constraint value cons includes the number of identified non-risk users, the preset limiting condition includes that the number of non-risk users takes a minimum value, and the constraint condition may be denoted as min(cons). In one embodiment, the goal and the constraint described above may be interchanged.
In another scenario, the goal is that the black transaction amount of the identified risky users is as large as possible, and the constraint is that their white transaction amount is as small as possible. Black transactions refer to abnormal transactions and white transactions refer to normal transactions. The target value obj includes the abnormal transaction amount of the identified risky users, the preset preference condition includes that the abnormal transaction amount takes a maximum value, and the target condition may be denoted as max(obj). The constraint value cons includes the normal transaction amount of the identified risky users, the preset limiting condition includes that the normal transaction amount takes a minimum value, and the constraint condition may be denoted as min(cons). The goal and the constraint in this scenario may also be interchanged.
Target conditions and constraint conditions generally restrict each other and pull in opposite directions. The target condition may include requirements on one or more target values, and the constraint condition may likewise include requirements on one or more constraint values. In the two scenarios above, the target condition includes a requirement on only one target value, and the constraint condition gives an example with only one constraint value.
In this step, the target condition and the constraint condition may be determined based on an input operation by a developer; alternatively, they may be stored in the computing device in advance and retrieved when the selection operator needs to be determined.
In step S220, a selection operator to be determined is constructed based on the target condition and the constraint condition. The selection operator includes several base operators and corresponding coefficients to be determined. The selection operator may comprise one or more base operators, each base operator having a corresponding coefficient. A coefficient can be regarded as the proportion of its base operator in the selection operator, i.e., it plays the role of a weight.
In specific implementation, several base operators and a combination form among the base operators can be constructed based on the target value obj and the preset preferred condition, the constraint value cons and the preset limiting condition.
For example, several base operators may be constructed based on the target value obj and the constraint value cons; and combining the plurality of base operators based on the coefficient to be determined distributed to each base operator to obtain the selected operator to be determined. The combination may be a linear combination or a nonlinear combination.
For example, the selection operator to be determined can be represented by the following formula:

indicator = Σ_{i=1..K} θ_i * sub_indicator_i

wherein indicator is the selection operator to be determined, sub_indicator_i is the i-th base operator, θ_i is the coefficient of the i-th base operator, and K is the number of base operators.
Using the target value obj and the constraint value cons, for example, the following base operators can be obtained: obj, 1/cons, obj/cons, cons, 1/obj, cons/obj, obj*cons, 1/(obj*cons), etc.
The idea of constructing the above base operators is described below. The requirements of the goal and the constraint are generally opposite: the larger one is, the better, while the smaller the other is, the better. In the following, the larger the target value obj, the better, and the smaller the constraint value cons, the better. Based on this property, when constructing base operators from the target value and the constraint value, the base operators obj and 1/cons can be obtained. Given these two base operators, many combinations of obj and 1/cons can express the desired operator; for example, multiplying the two yields a new base operator obj/cons.
In scenarios where obj should instead be as small as possible and cons as large as possible, the base operators 1/obj and cons/obj can be obtained accordingly. For completeness, the base operators may also contain mixture terms, i.e., obj*cons and 1/(obj*cons). Thus, the base operator sub_indicator_i can take the following forms:
obj, 1/cons, obj/cons, cons, 1/obj, cons/obj, obj*cons, 1/(obj*cons).
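Under the assumption of one target value obj and one constraint value cons (function names here are illustrative), the eight base operators and their linear combination with coefficients θ_i can be sketched as:

```python
# Sketch: build the eight base operators from obj and cons, then combine
# them linearly with coefficients theta (the candidate selection operator).
# Assumes obj and cons are nonzero; all names are illustrative.
from typing import Callable, List

def make_base_operators() -> List[Callable[[float, float], float]]:
    return [
        lambda obj, cons: obj,
        lambda obj, cons: 1.0 / cons,
        lambda obj, cons: obj / cons,
        lambda obj, cons: cons,
        lambda obj, cons: 1.0 / obj,
        lambda obj, cons: cons / obj,
        lambda obj, cons: obj * cons,
        lambda obj, cons: 1.0 / (obj * cons),
    ]

def make_indicator(theta: List[float]) -> Callable[[float, float], float]:
    """indicator(obj, cons) = sum_i theta_i * sub_indicator_i(obj, cons)."""
    bases = make_base_operators()
    assert len(theta) == len(bases)
    return lambda obj, cons: sum(t * b(obj, cons) for t, b in zip(theta, bases))
```

A group of coefficient values θ then fully determines one candidate selection operator, which is why the search in step S230 can operate purely over coefficient groups.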
In practical applications, obj can also be replaced by δobj, where δobj represents the gain in the value of the selection operator obtained when a policy is added to the policy combination.
The above description takes one target value and one constraint value as an example; the base operators for multiple target values or multiple constraint values can readily be obtained from the same construction idea.
In step S230, multiple groups of coefficient values are determined to obtain corresponding candidate selection operators indicator_j. For any one candidate selection operator indicator_j, the strategy combination with the highest recognition effect score is selected from the first strategy aggregate Z1 by using the multiple risk user samples and indicator_j, and the target value of the recognition result corresponding to that strategy combination is determined as the target value corresponding to indicator_j.
The coefficients of the base operators may be one or more, and a group of coefficient values contains the values of the coefficients of the several base operators; for example, a group of coefficients θ^j may be expressed as θ^j = {θ_1^j, θ_2^j, …, θ_K^j}. Each coefficient may take a value within a preset value range.
When determining multiple groups of coefficient values, a random search algorithm or a grid search algorithm may be used, yielding multiple θ^j.
In order to improve the search efficiency over the coefficients, a Bayesian algorithm can be adopted, which determines the next group of coefficient values from the current group of coefficient values and the corresponding target value. Step S230 may then be executed iteratively according to the following loop:
step 1a, determining a group of coefficient values theta of the coefficient j To obtain the corresponding operator indicator to be selected j . Initially, the coefficients may be randomly valued.
Step 2a, for any one candidate selection operator indicator_j, selecting the strategy combination with the highest recognition effect score from the first strategy aggregate Z1 by using the multiple risk user samples and indicator_j, and determining the target value of the recognition result corresponding to that strategy combination as the target value corresponding to indicator_j.
Step 3a, after determining the target value corresponding to the candidate selection operator indicator_j, inputting the target value and the corresponding group of coefficient values into a Bayesian model, determining an updated group of coefficient values through the Bayesian model, and returning to step 2a, i.e., selecting the strategy combination with the highest recognition effect score from the first strategy aggregate Z1 by using the multiple risk user samples and the updated candidate selection operator.
The Bayesian model is a model trained with a Bayesian algorithm. The iterative process of steps 1a–3a terminates when a termination condition is met. The termination condition may be that the number of groups of coefficient values obtained exceeds a preset threshold, or that the target value satisfies a preset preference condition, for example, the target value is greater than, or less than, a certain threshold.
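A sketch of this coefficient-search loop, with plain random search standing in for the Bayesian model (a real implementation might use a Gaussian-process-based optimizer to propose the next coefficient group from past target values; all names here are illustrative assumptions):

```python
# Sketch of the step-1a/2a/3a loop. Random search stands in for the
# Bayesian model that would propose the next coefficient group.
import random
from typing import Callable, List, Tuple

def search_coefficients(
    evaluate: Callable[[List[float]], float],  # step 2a: theta -> target value
    k: int,                                    # number of base operators
    n_iters: int = 50,
) -> Tuple[List[float], float]:
    best_theta: List[float] = []
    best_target = float("-inf")
    for _ in range(n_iters):
        # Steps 1a/3a: propose a group of coefficient values
        # (a Bayesian model would condition this proposal on history).
        theta = [random.uniform(0.0, 1.0) for _ in range(k)]
        target = evaluate(theta)               # step 2a
        if target > best_target:
            best_theta, best_target = theta, target
    # Step S240: keep the candidate operator whose target value is best.
    return best_theta, best_target
```

The `evaluate` callback encapsulates the expensive inner step: selecting the best-scoring policy combination for one candidate operator and reading off its target value.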
In step S230, for any one candidate selection operator indicator_j, various embodiments may be adopted when selecting the policy combination with the highest recognition effect score from the first policy aggregate Z1 by using the multiple risk user samples and the candidate selection operator. For example, multiple groups of policy combinations are first determined from the first policy aggregate Z1, e.g., by random combination. For any group of policy combinations, the recognition result containing the risky users that the group identifies among the multiple risk user samples is determined, and the recognition effect score corresponding to that recognition result is determined by using indicator_j. In this way, multiple groups of policy combinations and corresponding recognition effect scores are obtained, and the policy combination with the highest recognition effect score can be determined as the selected policy combination.
In order to improve the efficiency of finding the strategy combination with the highest recognition effect score, the following loop of steps 1b–4b can be adopted to search for it iteratively:
step 1b, determining an initial plurality of groups of policy combinations from the first policy aggregate Z1. Initially, any single policy in the first policy aggregate Z1 may be treated as a group of policy combinations;
Step 2b, aiming at any one group of strategy combination, determining the identification results of the strategy combination aiming at a plurality of risk user samples, and determining the identification effect score of the identification results by using the to-be-selected operator;
step 3b, when obtaining the multiple groups of strategy combinations and corresponding recognition effect scores, determining the strategy combination corresponding to the highest recognition effect score as the selected strategy combination;
and 4b, adding each of the candidate policies in the first policy aggregate Z1 other than those already in the selected policy combination to the selected policy combination, obtaining multiple updated policy combinations, and returning to step 2b.
When step 2b is executed and a group of policy combinations includes multiple policies, each policy identifies the multiple risk user samples to obtain the risky users it flags; the risky users respectively identified by the multiple policies are then aggregated (their union is taken) to obtain the recognition result of the policy combination.
The loop iteration of steps 1b–4b terminates when a stop criterion is met. The stop criterion may be that the recognition effect score reaches a maximum, or that the recognition effect score starts to decrease. Since the target and the constraint usually vary inversely and restrict each other, the number of risky users identified from the multiple risk user samples does not increase indefinitely; beyond a certain point the constraint value takes effect, so the recognition effect score can no longer increase.
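The greedy loop of steps 1b–4b can be sketched as follows (names are illustrative; `score` stands for applying the candidate selection operator to a combination's recognition result):

```python
# Sketch of the greedy loop 1b-4b: start from single policies, repeatedly
# add the policy that most improves the recognition-effect score, and stop
# once the score no longer improves. Names are illustrative assumptions.
from typing import Callable, FrozenSet, List

def greedy_select(
    policies: List[int],                         # policy ids in aggregate Z1
    score: Callable[[FrozenSet[int]], float],    # candidate operator applied
                                                 # to the combination's result
) -> FrozenSet[int]:
    # Step 1b: each single policy forms an initial combination.
    best = max((frozenset([p]) for p in policies), key=score)
    while True:
        # Step 4b: extend the selected combination with each remaining policy.
        candidates = [best | {p} for p in policies if p not in best]
        if not candidates:
            return best
        challenger = max(candidates, key=score)  # steps 2b-3b
        if score(challenger) <= score(best):     # stop: score starts to drop
            return best
        best = challenger
```

The stop test implements the criterion above: because the constraint value eventually dominates, the score stops rising, and the loop returns the last improving combination.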
When there are multiple candidate selection operators indicator_j, the above loop iteration is executed once for each indicator_j; after the policy combination with the highest recognition effect score is found for a given indicator_j, the target value of the recognition result corresponding to that policy combination is determined. For example, when the target value is the number of risky users, the number of risky users in the recognition result corresponding to the policy combination may be determined; when the target value is the abnormal transaction amount, the abnormal transaction amount of the risky users in the recognition result corresponding to the policy combination may be determined. A candidate selection operator, its policy combination with the highest recognition effect score, and its target value are in one-to-one correspondence. In this way, the target values respectively corresponding to the multiple candidate selection operators indicator_j are obtained.
In step S240, when multiple candidate selection operators indicator_j and corresponding target values are obtained, the candidate selection operator corresponding to the target value that satisfies the preset preference condition is determined as the selection operator. There may be one or more target values satisfying the preset preference condition, and thus one or more selection operators may be determined.
The target value can be used to measure the relative quality of the most suitable policy combinations. When the preset preference condition is that the target value takes a maximum, the candidate selection operator(s) corresponding to the largest one or more target values can be determined as the selection operator(s). When the preset preference condition is that the target value takes a minimum, the candidate selection operator(s) corresponding to the smallest one or more target values can be determined as the selection operator(s).
Through the execution of steps S210–S240, the most suitable selection operator under the target condition and the constraint condition is determined. In determining the most suitable selection operator, the most suitable strategy combination is in fact also determined from the first strategy aggregate Z1. Therefore, through the execution of steps S210–S230, the most suitable strategy combination under the target condition and the constraint condition can be determined from the first strategy aggregate Z1 and used to identify risky users. On this basis, the embodiment of the policy-combination optimization method shown in fig. 3 can be obtained from the embodiment shown in fig. 2.
Fig. 3 is a flowchart illustrating an optimization method of a policy combination according to an embodiment. The method is used to select a combination of policies from a first aggregate set of policies Z1 using a selection operator. The selection operator is used for determining the identification effect scores of the strategy combination aiming at the identification results of the multiple risk user samples, and the strategy combination is used for identifying the risk users. The method can be executed by a computing device and comprises the following steps S310-S340.
Step S310, determining the target conditions and the constraint conditions when selecting the strategy combination. The target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
step S320, constructing a selection operator to be determined based on the target condition and the constraint condition, wherein the selection operator comprises a plurality of base operators and corresponding coefficients to be determined;
step S330, determining multiple groups of coefficient values of coefficients to obtain corresponding to-be-selected operators, selecting a strategy combination with the highest recognition effect score from a first strategy total set by utilizing multiple risk user samples and the to-be-selected operators aiming at any one to-be-selected operator, and determining a target value of a recognition result corresponding to the strategy combination;
step S340, when the target values corresponding to the multiple sets of policy combinations are obtained, determining the policy combination corresponding to the target value meeting the preset optimization condition as the optimized policy combination.
In this embodiment, the execution processes of steps S310 to S330 are completely the same as the execution processes of steps S210 to S230 in fig. 2, and specific description may refer to the embodiment shown in fig. 2, which is not repeated herein.
In step S340, there may be one or more target values satisfying the preset preference condition, and thus one or more optimized strategy combinations may be determined. After the optimized policy combination is determined, it may be used to identify risky users in business data. The business data may include user characteristics of the users, which may have the same attributes as the user characteristics in the multiple risk user samples described above.
When the preset preference condition is that the target value is the maximum value, the policy combination corresponding to the maximum one or more target values may be determined as the optimized policy combination. When the preset preference condition is that the minimum value of the target values is taken, the strategy combination corresponding to the minimum one or more target values can be determined as the optimized strategy combination.
The embodiment of the method shown in fig. 2 determines the most suitable selection operator using the first aggregate set of policies Z1, and the selection operator can also be applied to determine the optimized combination of policies from the second aggregate set of policies Z2. The target conditions and the constraints when selecting the combination of policies from the second policy aggregate Z2 are the same as the target conditions and the constraints in step S210, respectively. Accordingly, the present specification also provides the method embodiment shown in fig. 4.
Fig. 4 is a flowchart illustrating an optimization method of a policy combination according to an embodiment. The method is executed by a computing device and comprises the following steps S410-S430.
Step S410, acquiring a second strategy aggregate Z2 to be optimized. The second policy aggregate Z2 is a new policy aggregate constructed to identify risky users from the multiple risk user samples.
Step S420, obtaining the selection operator determined by the method provided in the embodiment shown in fig. 2.
And step S430, selecting the strategy combination with the highest recognition effect score from the second strategy total set Z2 by using a plurality of risk user samples and the determined selection operator as the optimized strategy combination. The user characteristics of the multiple risky user samples may be the same as in the embodiment shown in fig. 2.
In executing step S430, the method for determining the policy combination with the highest recognition effect score provided in step S230 may be performed, and will not be described in detail here.
The above embodiments mainly introduce policy combinations and selection operators as applied to risk-user identification scenarios. The applicant has found through research that the above method for determining a selection operator and method for optimizing a policy combination can also be applied to other specified recognition tasks, for example, maximum delivery coverage and device anomaly detection. The maximum delivery coverage problem aims to select a subset of riders covering as many delivery areas as possible, and the device anomaly detection problem aims to detect as many abnormal devices as possible. Accordingly, the present specification also provides an embodiment applied to scenarios of executing a specified recognition task.
Fig. 5 is a flowchart illustrating a method for determining a selection operator in a policy combination according to an embodiment. The selection operator is used for determining the identification effect scores of the identification results of the strategy combination aiming at the plurality of task labeling samples when the strategy combination is selected from the first strategy total set, and the strategy combination is used for executing the specified identification task. For example, when the identification task is designated as device anomaly detection, a certain policy may be IF < device temperature above 200 degrees celsius > AND < rotational speed below 10r/s > THEN < device anomaly >. The method is performed by a computing device, comprising:
step S510, determining a target condition and a constraint condition when selecting a policy combination. The target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition.
For example, in a device anomaly detection scenario, the task annotation sample includes a parametric characteristic of the device. The target value may include the number of detected abnormal devices, and the preset preferable condition includes that the number of the abnormal devices takes a maximum value; the constraint value may include the detected number of normal devices, and the preset limit condition includes a minimum value of the number of normal devices.
In the maximum delivery coverage scenario, each policy in the policy aggregate may be a rider; the task annotation samples include the delivery range of each rider, with different riders covering different delivery areas. A policy combination represents a combination of riders. The target condition is that the delivery coverage of the rider combination in the policy combination takes a maximum value (the larger the better), and the constraint condition is that the number of riders in the policy combination takes a minimum value (the smaller the better).
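As an illustrative sketch of this scenario (names and coefficient values are assumptions), a selection operator trading delivery coverage (obj, maximized) against rider count (cons, minimized) could score a rider combination as follows:

```python
# Sketch of the maximum-delivery-coverage instance: riders act as policies,
# each covering a set of areas; the score trades coverage (obj) against
# rider count (cons). Assumes a non-empty rider combination; all names and
# the theta values are illustrative.
from typing import Dict, FrozenSet, Set

def coverage_score(riders: FrozenSet[str],
                   areas: Dict[str, Set[str]],
                   theta_obj: float = 1.0,
                   theta_cons: float = 0.5) -> float:
    covered: Set[str] = set().union(*(areas[r] for r in riders))
    obj = len(covered)   # target value: delivery coverage, to be maximized
    cons = len(riders)   # constraint value: rider count, to be minimized
    # Linear combination of the base operators obj and 1/cons.
    return theta_obj * obj + theta_cons * (1.0 / cons)
```

With such a score, adding a rider whose areas are already covered lowers the score, so the greedy combination search of this specification would leave that rider out.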
Step S520, constructing a selection operator to be determined based on the target condition and the constraint condition, wherein the selection operator comprises a plurality of base operators and corresponding coefficients to be determined.
Specifically, a plurality of base operators may be constructed based on the target value and the constraint value, and the plurality of base operators are combined based on the coefficient to be determined, which is allocated to each base operator, to obtain the selected operator to be determined.
Step S530, determining multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, labeling the samples and the to-be-selected operators by using multiple tasks aiming at any one to-be-selected operator, selecting the strategy combination with the highest recognition effect score from the first strategy total set, and determining the target value of the recognition result corresponding to the strategy combination as the target value corresponding to the to-be-selected operator.
In the maximum delivery coverage scenario, when executing the step of selecting, for any one candidate selection operator, the strategy combination with the highest recognition effect score from the first strategy aggregate by using the multiple task annotation samples and the candidate selection operator, and determining the target value of the corresponding recognition result, determining a strategy combination means determining a rider combination, and the recognition effect score is determined from the delivery coverage of the rider combination by using the candidate selection operator. The target value of the strategy combination is the total delivery coverage of the rider combination.
And step S540, when multiple groups of operators to be selected and corresponding target values are obtained, determining the operators to be selected corresponding to the target values meeting the preset optimal conditions as the selection operators.
The execution of this embodiment may refer to the description of each step in the embodiment shown in fig. 2 and is not repeated here. The implementation of the embodiment shown in fig. 5 is in fact also a process of determining a more reasonable policy combination. By modifying step S540 so that, when multiple groups of policy combinations and corresponding target values are obtained, the policy combination corresponding to the target value satisfying the preset preference condition is determined as the optimized policy combination, the embodiment shown in fig. 5 is converted into an optimization method of a policy combination.
In this specification, the word "first" in the first set of policies and the like, and the word "second" in the text are used for convenience of distinction and description only, and do not have any limiting meaning.
The foregoing describes certain embodiments of the present specification, and other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily have to be in the particular order shown or in sequential order to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Fig. 6 is a schematic block diagram of a selection operator determining apparatus in a policy combination according to an embodiment. The selection operator is used for determining the identification effect scores of the identification results of the strategy combination aiming at the multiple risk user samples when the strategy combination is selected from the first strategy aggregate, and the strategy combination is used for identifying the risk users. This embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2. The apparatus 600 comprises:
A first determining module 610 configured to determine target conditions and constraint conditions when selecting a policy combination; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
a first construction module 620 configured to construct a selection operator to be determined based on the target condition and the constraint condition, where the selection operator includes a plurality of base operators and corresponding coefficients to be determined;
a second determining module 630, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding candidate selection operators, select, for any one candidate selection operator, the policy combination with the highest recognition effect score from the first policy aggregate by using multiple risk user samples and the candidate selection operator, and determine a target value of the recognition result corresponding to the policy combination as the target value corresponding to the candidate selection operator;
the third determining module 640 is configured to, when multiple sets of candidate operators and corresponding target values are obtained, determine the candidate operator corresponding to the target value meeting the preset optimal condition as the selected operator.
In an embodiment, the first building module 620 is specifically configured to:
constructing a plurality of base operators based on the target values in the target conditions and the constraint values in the constraint conditions;
and combining the plurality of base operators based on the coefficient to be determined distributed to each base operator to obtain the selected operator to be determined.
In one embodiment, the second determining module 630 includes:
the candidate sub-module 631 is configured to determine a plurality of sets of coefficient values of the coefficients to obtain corresponding candidate operators;
the determining sub-module 632 is configured to, for any one candidate selection operator, select, by using multiple risk user samples and the candidate selection operator, the policy combination with the highest recognition effect score from the first policy aggregate, and determine a target value of the recognition result corresponding to the policy combination as the target value corresponding to the candidate selection operator;
the candidate sub-module 631 is specifically configured to determine one group of coefficient values of the coefficients to obtain the corresponding candidate selection operator;
the second determining module 630 further includes:
an updating sub-module (not shown in the figure), configured to, after determining the target value corresponding to the operator to be selected, input the target value and the corresponding set of coefficient values into a bayesian model, determine an updated set of coefficient values through the bayesian model, and return to execute the determining sub-module 632.
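The iterative loop above can be sketched as follows. This is a simplified stand-in, not the patented implementation: a toy objective replaces the expensive policy-combination search, and a plain perturbation rule replaces the Bayesian model, whose real counterpart would typically be a Gaussian-process surrogate proposing the next group of coefficient values:

```python
import random

def evaluate(coefficients):
    # Stand-in for the expensive step: under these coefficients, pick the best
    # policy combination and return the target value of its recognition result.
    x, y = coefficients
    return -(x - 0.7) ** 2 - (y - 0.3) ** 2  # toy objective, optimum at (0.7, 0.3)

def optimize_coefficients(n_rounds=50, seed=0):
    rng = random.Random(seed)
    coeffs = [rng.random(), rng.random()]
    best_coeffs, best_target = coeffs, float("-inf")
    for _ in range(n_rounds):
        target = evaluate(coeffs)
        if target > best_target:
            best_coeffs, best_target = coeffs, target
        # A real implementation would feed (coeffs, target) into a Bayesian
        # model to propose the next group of values; here we simply perturb
        # the best coefficients found so far, clipped to [0, 1].
        coeffs = [min(1.0, max(0.0, c + rng.gauss(0.0, 0.1))) for c in best_coeffs]
    return best_coeffs, best_target
```

The key structural point matches the updating sub-module: each round produces one (coefficient group, target value) pair, which informs the next proposed group.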
In one embodiment, the second determining module 630, when determining the multiple groups of coefficient values, is configured to:
and determining multiple groups of coefficient values of the coefficients by utilizing a random search algorithm or a grid search algorithm.
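A minimal sketch of these two search schemes for proposing groups of coefficient values; the two-coefficient setup and the uniform [0, 1] sampling range are assumptions:

```python
import itertools
import random

def grid_search_values(grid, n_coefficients):
    """Enumerate every combination of candidate values, one per coefficient."""
    return list(itertools.product(grid, repeat=n_coefficients))

def random_search_values(n_groups, n_coefficients, seed=0):
    """Draw each group of coefficient values uniformly at random from [0, 1)."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(n_coefficients))
            for _ in range(n_groups)]

groups = grid_search_values([0.0, 0.5, 1.0], n_coefficients=2)  # 3**2 = 9 groups
```

Either scheme simply supplies the multiple groups of coefficient values that the second determining module evaluates.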
In one embodiment, the second determining module 630, when selecting the policy combination with the highest recognition effect score from the first policy aggregate by using a plurality of risk user samples and the candidate selection operator, includes:
a combination submodule (not shown in the figure) configured to determine a plurality of sets of policy combinations from the first aggregate of policies;
a scoring submodule (not shown in the figure) configured to, for any group of policy combinations, determine the recognition results of the group of policy combinations for multiple risk user samples, and determine the recognition effect scores of the recognition results by using the to-be-selected selection operator;
and a selecting sub-module (not shown in the figure) configured to determine, when the plurality of sets of policy combinations and corresponding recognition effect scores are obtained, the policy combination corresponding to the highest recognition effect score as the selected policy combination.
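The combining, scoring, and selecting sub-modules can be sketched together as follows; the toy policies (plain predicates over a sample), the samples, and the scoring rule are illustrative assumptions:

```python
def recognition_result(combination, samples):
    # A sample counts as flagged when any policy in the combination fires on it.
    flagged = [s for s in samples if any(policy(s) for policy in combination)]
    return {
        "risky": sum(1 for s in flagged if s["is_risky"]),
        "non_risky": sum(1 for s in flagged if not s["is_risky"]),
    }

def pick_best_combination(combinations, samples, score):
    # Score every group of policy combinations, keep the highest-scoring one.
    scored = [(score(recognition_result(c, samples)), c) for c in combinations]
    best_score, best_combination = max(scored, key=lambda pair: pair[0])
    return best_combination, best_score

samples = [
    {"amount": 900, "is_risky": True},
    {"amount": 800, "is_risky": True},
    {"amount": 100, "is_risky": False},
]
high = lambda s: s["amount"] > 500  # toy policy
low = lambda s: s["amount"] > 50    # toy policy
score = lambda r: r["risky"] - r["non_risky"]  # toy selection operator

best, best_score = pick_best_combination([[high], [low], [high, low]], samples, score)
# [high] flags both risky samples and no non-risky sample, so best_score == 2
```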
In one embodiment, the combining submodule is configured to determine an initial plurality of sets of policy combinations from the first policy aggregate;
the second determining module 630 further includes:
an adding sub-module (not shown in the figure), configured to, after a selected policy combination is determined from the initial plurality of sets of policy combinations, add each selectable policy in the first policy aggregate that is outside the selected policy combination to the selected policy combination, respectively, so as to obtain an updated plurality of sets of policy combinations, and return to execute the scoring sub-module.
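The updating loop just described amounts to a greedy forward selection over the policy aggregate; here is a sketch under toy assumptions (simple predicate policies, and a score of risky users caught minus non-risky users disturbed):

```python
def score_combination(combination, samples):
    # Toy stand-in for the scoring sub-module: a sample is flagged when any
    # policy in the combination fires on it.
    flagged = [s for s in samples if any(p(s) for p in combination)]
    return (sum(1 for s in flagged if s["is_risky"])
            - sum(1 for s in flagged if not s["is_risky"]))

def greedy_forward_selection(policy_pool, samples):
    """Repeatedly extend the selected combination with the single policy that
    raises the score most; stop when no remaining policy improves it."""
    selected, best_score = [], float("-inf")
    remaining = list(policy_pool)
    while remaining:
        candidates = [(score_combination(selected + [p], samples), p)
                      for p in remaining]
        top_score, top_policy = max(candidates, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no selectable policy improves the current combination
        best_score, selected = top_score, selected + [top_policy]
        remaining.remove(top_policy)
    return selected, best_score

samples = [
    {"amount": 900, "is_risky": True},
    {"amount": 800, "is_risky": True},
    {"amount": 100, "is_risky": False},
]
high = lambda s: s["amount"] > 500  # toy policy
low = lambda s: s["amount"] > 50    # toy policy

selected, final_score = greedy_forward_selection([high, low], samples)
# adding `low` to [high] would flag the non-risky sample, so selection stops at [high]
```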
In one embodiment, any one of the risk user samples includes user characteristics of the corresponding user, and a policy in any one of the policy combinations includes: a discrimination condition set based on the user characteristics, and a risk discrimination result given when the discrimination condition is satisfied.
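Such a policy can be sketched as a discrimination condition paired with the result it yields; the concrete condition and threshold below are hypothetical examples, not from this document:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

@dataclass
class Policy:
    # Discrimination condition set over the user's features.
    condition: Callable[[Dict[str, Any]], bool]
    # Risk discrimination result returned when the condition is satisfied.
    risk_result: str

    def apply(self, user_features: Dict[str, Any]) -> Optional[str]:
        return self.risk_result if self.condition(user_features) else None

# Hypothetical policy: a large transfer in the small hours is suspicious.
night_transfer = Policy(
    condition=lambda u: u["transfer_amount"] > 10_000 and u["hour"] < 6,
    risk_result="suspected_fraud",
)
```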
In one embodiment, the target value includes the number of risky users identified, and the preset optimal condition includes the number of risky users taking a maximum value; the constraint value includes the number of non-risky users identified, and the preset limiting condition includes the number of non-risky users taking a minimum value.
In one embodiment, the target value includes an abnormal transaction amount of the risky users identified, and the preset optimal condition includes the abnormal transaction amount taking a maximum value; the constraint value includes a normal transaction amount of the risky users identified, and the preset limiting condition includes the normal transaction amount taking a minimum value.
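The two target/constraint pairings in the embodiments above can both be computed from the samples a policy combination flags; the field names below are illustrative assumptions:

```python
def target_and_constraint_values(flagged_samples):
    """Compute both target/constraint pairs from the flagged samples."""
    risky = [s for s in flagged_samples if s["is_risky"]]
    non_risky = [s for s in flagged_samples if not s["is_risky"]]
    return {
        # Count-based pair: maximize the first, keep the second at a minimum.
        "risky_users_identified": len(risky),
        "non_risky_users_identified": len(non_risky),
        # Amount-based pair: maximize the abnormal amount of identified risky
        # users, minimize their normal amount.
        "abnormal_transaction_amount": sum(s["abnormal_amount"] for s in risky),
        "normal_transaction_amount": sum(s["normal_amount"] for s in risky),
    }
```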
Fig. 7 is a schematic block diagram of an optimization apparatus for a policy combination according to an embodiment. The apparatus is used for selecting a policy combination from the first policy aggregate by using a selection operator, where the selection operator is used for determining recognition effect scores of the recognition results of the policy combination for multiple risk user samples, and the policy combination is used for identifying risky users. This embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 3. The apparatus 700 comprises:
a first determining module 710 configured to determine a target condition and a constraint condition when selecting a policy combination; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
a first constructing module 720, configured to construct a selection operator to be determined based on the target condition and the constraint condition, where the selection operator includes a number of base operators and corresponding coefficients to be determined;
a second determining module 730, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding candidate selection operators; for any one candidate selection operator, select the policy combination with the highest recognition effect score from the first policy aggregate by using multiple risk user samples and the candidate selection operator, and determine a target value of the recognition result corresponding to the policy combination;
The first optimization module 740 is configured to, when target values corresponding to multiple sets of policy combinations are obtained, determine a policy combination corresponding to a target value that meets the preset optimal condition as an optimized policy combination.
Fig. 8 is a schematic block diagram of an optimization apparatus for policy combination according to an embodiment. This embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 4. The apparatus 800 comprises:
a first obtaining module 810 configured to obtain a second policy aggregate to be optimized;
a second obtaining module 820 configured to obtain the selection operator determined in the embodiment of the method shown in fig. 2;
and a second optimization module 830 configured to select, as an optimized policy combination, the policy combination with the highest recognition effect score from the second policy aggregate by using a plurality of risk user samples and the determined selection operator.
Fig. 9 is a schematic block diagram of a selection operator determining apparatus in a policy combination according to an embodiment. The selection operator is used for determining, when a policy combination is selected from the first policy aggregate, the recognition effect scores of the recognition results of the policy combination for a plurality of task labeling samples, and the policy combination is used for executing a specified recognition task. This embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 5. The apparatus 900 comprises:
A first determining module 910, configured to determine a target condition and a constraint condition when selecting a policy combination; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
a first construction module 920, configured to construct a selection operator to be determined based on the target condition and the constraint condition, where the selection operator includes a number of base operators and corresponding coefficients to be determined;
a fourth determining module 930, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding candidate selection operators; for any one candidate selection operator, select the policy combination with the highest recognition effect score from the first policy aggregate by using multiple task labeling samples and the candidate selection operator, and determine a target value of the recognition result corresponding to the policy combination as the target value corresponding to the candidate selection operator;
the third determining module 940 is configured to, when multiple groups of candidate selection operators and corresponding target values are obtained, determine the candidate selection operator corresponding to the target value meeting the preset optimal condition as the selection operator.
The apparatuses provided in the foregoing apparatus embodiments may be deployed in a computing device, and the computing device may be implemented by any apparatus, device, platform, or device cluster having computing and processing capabilities. The above apparatus embodiments correspond to the method embodiments; for a specific description, reference may be made to the description of the method embodiments, which is not repeated here. The apparatus embodiments are obtained based on the corresponding method embodiments and have the same technical effects as the corresponding method embodiments; for a specific description, reference may be made to the corresponding method embodiments.
Embodiments of the present specification also provide a computer-readable storage medium having a computer program stored thereon, which, when executed in a computer, causes the computer to perform the method of any one of fig. 1 to 5.
The present specification also provides a computing device, including a memory and a processor, where the memory stores executable code, and the processor executes the executable code to implement the method described in any one of fig. 1 to 5.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the storage medium and the computing device embodiments, since they are substantially similar to the method embodiments, they are described relatively simply, and reference may be made to some descriptions of the method embodiments for relevant points.
Those skilled in the art will recognize that, in one or more of the examples described above, the functionality described in the embodiments of the invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, the functions may be stored on, or transmitted as one or more instructions or code over, a computer-readable medium.
The above-mentioned embodiments further describe the objects, technical solutions and advantages of the embodiments of the present invention in detail. It should be understood that the above description is only exemplary of the embodiments of the present invention, and is not intended to limit the scope of the present invention, and any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the present invention should be included in the scope of the present invention.

Claims (18)

1. A selection operator determination method in strategy combination is used for determining identification effect scores of identification results of strategy combination aiming at a plurality of risk user samples when the strategy combination is selected from a first strategy total set, and the strategy combination is used for identifying risk users; the method comprises the following steps:
Determining a target condition and a constraint condition when selecting the strategy combination; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
constructing a selection operator to be determined based on the target condition and the constraint condition, wherein the selection operator comprises a plurality of base operators and corresponding coefficients to be determined, and the coefficients are the proportion of the corresponding base operators in the selection operator;
determining multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, selecting a strategy combination with the highest recognition effect score from the first strategy total set by utilizing multiple risk user samples and the to-be-selected operators aiming at any one to-be-selected operator, and determining a target value of a recognition result corresponding to the strategy combination as a target value corresponding to the to-be-selected operator;
and when a plurality of groups of operators to be selected and corresponding target values are obtained, determining the operators to be selected corresponding to the target values meeting the preset optimal conditions as the selection operators.
2. The method of claim 1, the step of constructing a selection operator to be determined based on the target conditions and constraint conditions, comprising:
Constructing a plurality of base operators based on the target values in the target conditions and the constraint values in the constraint conditions;
and combining the plurality of base operators based on the coefficient to be determined distributed to each base operator to obtain the selected operator to be determined.
3. The method of claim 1, the step of determining a plurality of sets of coefficient values for the coefficients comprising:
determining a group of coefficient values of the coefficients to obtain corresponding operators to be selected;
after determining the target value corresponding to the candidate selection operator, the method further includes:
and inputting the target value and the corresponding group of coefficient values into a Bayesian model, determining an updated group of coefficient values through the Bayesian model, and returning to execute the step of selecting the strategy combination with the highest recognition effect score from the first strategy total set by using the plurality of risk user samples and the to-be-selected operator.
4. The method of claim 1, the step of determining a plurality of sets of coefficient values for the coefficients comprising:
and determining multiple groups of coefficient values of the coefficients by utilizing a random search algorithm or a grid search algorithm.
5. The method of claim 1, wherein the step of selecting the policy combination with the highest recognition effect score from the first policy aggregate by using the plurality of risk user samples and the candidate selection operator comprises:
Determining a plurality of groups of policy combinations from the first aggregate of policies;
determining the identification results of the strategy combination aiming at a plurality of risk user samples aiming at any group of strategy combinations, and determining the identification effect scores of the identification results by using the to-be-selected selection operator;
and when the multiple groups of strategy combinations and the corresponding recognition effect scores are obtained, determining the strategy combination corresponding to the highest recognition effect score as the selected strategy combination.
6. The method of claim 5, the step of determining a plurality of sets of policy combinations from the first aggregate set of policies comprising:
determining an initial plurality of sets of policy combinations from the first set of policies;
after determining the selected policy combination from the initial plurality of sets of policy combinations, further comprising:
and adding a plurality of selectable strategies except the selected strategy combination in the first strategy total set to the selected strategy combination respectively to obtain a plurality of updated strategy combinations, and returning to execute the step of determining the identification results of the strategy combination for a plurality of risk user samples aiming at any group of strategy combinations.
7. The method of claim 1, wherein any one of the risk user samples comprises user characteristics of the corresponding user, and the policies in any one of the policy combinations comprise: a discrimination condition set based on the user characteristics, and a risk discrimination result when the discrimination condition is satisfied.
8. The method of claim 1, wherein the target value comprises the number of risky users identified, and the preset optimal condition comprises the number of risky users taking a maximum value; the constraint value comprises the number of non-risky users identified, and the preset limiting condition comprises the number of non-risky users taking a minimum value.
9. The method of claim 1, wherein the target value comprises an abnormal transaction amount of the risky users identified, and the preset optimal condition comprises the abnormal transaction amount taking a maximum value; the constraint value comprises a normal transaction amount of the risky users identified, and the preset limiting condition comprises the normal transaction amount taking a minimum value.
10. A method for optimizing a policy combination for selecting a policy combination from a first aggregate of policies using a selection operator for determining recognition effectiveness scores of the policy combination for recognition results of a plurality of at-risk user samples, the policy combination for recognizing at-risk users, the method comprising:
determining a target condition and a constraint condition when selecting the strategy combination; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
Constructing a selection operator to be determined based on the target condition and the constraint condition, wherein the selection operator comprises a plurality of basic operators and corresponding coefficients to be determined, and the coefficients are the proportion of the corresponding basic operators in the selection operator;
determining multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, selecting a strategy combination with the highest recognition effect score from the first strategy total set by utilizing multiple risk user samples and the to-be-selected operators aiming at any one to-be-selected operator, and determining a target value of a recognition result corresponding to the strategy combination;
and when the target values corresponding to the multiple groups of strategy combinations are obtained, determining the strategy combination corresponding to the target value meeting the preset optimal condition as the optimized strategy combination.
11. A method for optimizing a policy combination includes:
acquiring a second strategy aggregate to be optimized;
obtaining the selection operator determined by the method of claim 1;
and selecting the strategy combination with the highest recognition effect score from the second strategy total set as the optimized strategy combination by utilizing a plurality of risk user samples and the determined selection operator.
12. A selection operator determination method in strategy combination is used for determining identification effect scores of identification results of strategy combination aiming at a plurality of task labeling samples when the strategy combination is selected from a first strategy total set, and the strategy combination is used for executing a specified identification task; the method comprises the following steps:
Determining a target condition and a constraint condition when selecting the strategy combination; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
constructing a selection operator to be determined based on the target condition and the constraint condition, wherein the selection operator comprises a plurality of base operators and corresponding coefficients to be determined, and the coefficients are the proportion of the corresponding base operators in the selection operator;
determining multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, selecting a strategy combination with the highest recognition effect score from the first strategy total set by utilizing a plurality of task marking samples and the to-be-selected operators aiming at any one to-be-selected operator, and determining a target value of a recognition result corresponding to the strategy combination as a target value corresponding to the to-be-selected operator;
and when a plurality of groups of operators to be selected and corresponding target values are obtained, determining the operators to be selected corresponding to the target values meeting the preset optimal conditions as the selection operators.
13. A selection operator determination device in a strategy combination, wherein the selection operator is used for determining the identification effect scores of the strategy combination aiming at the identification results of a plurality of risk user samples when the strategy combination is selected from a first strategy total set, and the strategy combination is used for identifying risk users; the device comprises:
The first determining module is configured to determine a target condition and a constraint condition when the strategy combination is selected; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
a first construction module configured to construct a selection operator to be determined based on the target condition and the constraint condition, where the selection operator includes a number of base operators and corresponding coefficients to be determined, and the coefficients are the ratios of the corresponding base operators in the selection operator;
a second determining module, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, select, for any one of the to-be-selected operators, a policy combination with the highest recognition score from the first policy aggregate by using multiple risk user samples and the to-be-selected operator, and determine a target value of a recognition result corresponding to the policy combination as a target value corresponding to the to-be-selected operator;
and the third determining module is configured to determine the candidate selection operator corresponding to the target value meeting the preset optimal condition as the selection operator when multiple groups of candidate selection operators and corresponding target values are obtained.
14. An apparatus for optimizing a policy combination for selecting a policy combination from a first aggregate of policies using a selection operator for determining recognition effectiveness scores of the policy combination for recognition results of a plurality of at-risk user samples, the policy combination for recognizing at-risk users, the apparatus comprising:
the first determining module is configured to determine target conditions and constraint conditions when the strategy combination is selected; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
a first construction module configured to construct a selection operator to be determined based on the target condition and the constraint condition, where the selection operator includes a number of base operators and corresponding coefficients to be determined, and the coefficients are the ratios of the corresponding base operators in the selection operator;
a second determining module, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, select, for any one of the to-be-selected operators, a policy combination with the highest recognition effect score from the first policy aggregate by using multiple risk user samples and the to-be-selected operator, and determine a target value of a recognition result corresponding to the policy combination;
And the first optimization module is configured to determine the strategy combination corresponding to the target value meeting the preset optimal condition as the optimized strategy combination when the target values corresponding to the multiple groups of strategy combinations are obtained.
15. An apparatus for optimizing a combination of policies, comprising:
the first obtaining module is configured to obtain a second strategy aggregate to be optimized;
a second obtaining module configured to obtain the selection operator determined by the method of claim 1;
and the second optimization module is configured to select the strategy combination with the highest recognition effect score from the second strategy total set as the optimized strategy combination by utilizing a plurality of risk user samples and the determined selection operator.
16. A selection operator determination device in a policy combination, the selection operator being used for determining an identification effect score of an identification result of the policy combination for a plurality of task labeling samples when the policy combination is selected from a first policy aggregate, the policy combination being used for executing a specified identification task; the device comprises:
the first determining module is configured to determine a target condition and a constraint condition when the strategy combination is selected; the target condition comprises that a target value aiming at the recognition result meets a preset optimal condition, and the constraint condition comprises that a constraint value aiming at the recognition result meets a preset limiting condition;
A first construction module configured to construct a selection operator to be determined based on the target condition and constraint condition, where the selection operator includes several base operators and corresponding coefficients to be determined, and the coefficients are the ratios of the corresponding base operators in the selection operator;
a fourth determining module, configured to determine multiple groups of coefficient values of the coefficients to obtain corresponding to-be-selected operators, select, for any one of the to-be-selected operators, a policy combination with the highest recognition effect score from the first policy aggregate by using multiple task labeling samples and the to-be-selected operator, and determine a target value of a recognition result corresponding to the policy combination as a target value corresponding to the to-be-selected operator;
and the third determining module is configured to determine the to-be-selected operators corresponding to the target values meeting the preset optimal conditions as the selection operators when multiple groups of to-be-selected operators and corresponding target values are obtained.
17. A computer-readable storage medium, on which a computer program is stored which, when executed in a computer, causes the computer to carry out the method of any one of claims 1-12.
18. A computing device comprising a memory having executable code stored therein and a processor that, when executing the executable code, implements the method of any of claims 1-12.
CN202210405947.5A 2022-04-18 2022-04-18 Selection operator determining method, strategy combination optimizing method and device Active CN114548830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210405947.5A CN114548830B (en) 2022-04-18 2022-04-18 Selection operator determining method, strategy combination optimizing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210405947.5A CN114548830B (en) 2022-04-18 2022-04-18 Selection operator determining method, strategy combination optimizing method and device

Publications (2)

Publication Number Publication Date
CN114548830A CN114548830A (en) 2022-05-27
CN114548830B (en) 2022-07-29

Family

ID=81667484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210405947.5A Active CN114548830B (en) 2022-04-18 2022-04-18 Selection operator determining method, strategy combination optimizing method and device

Country Status (1)

Country Link
CN (1) CN114548830B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103440540A (en) * 2013-09-17 2013-12-11 武汉大学 Parallelization method of artificial immune optimization model for spatial layout of land utilization
CN110929960A (en) * 2019-12-12 2020-03-27 支付宝(杭州)信息技术有限公司 Policy selection optimization method and device
CN113419853A (en) * 2021-06-22 2021-09-21 中国工商银行股份有限公司 Task execution strategy determining method and device, electronic equipment and storage medium
CN114066196A (en) * 2021-11-08 2022-02-18 国网湖北省电力有限公司经济技术研究院 Power grid investment strategy optimization system

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US9492923B2 (en) * 2014-12-16 2016-11-15 Amazon Technologies, Inc. Generating robotic grasping instructions for inventory items
US10311467B2 (en) * 2015-03-24 2019-06-04 Adobe Inc. Selecting digital advertising recommendation policies in light of risk and expected return
CN111310993A (en) * 2020-02-11 2020-06-19 苏宁金融科技(南京)有限公司 Project configuration method, device and system based on multi-objective evolutionary algorithm
CN114186633B (en) * 2021-12-10 2023-04-07 北京百度网讯科技有限公司 Distributed training method, device, equipment and storage medium of model


Non-Patent Citations (3)

Title
"Research on Tourism Route Recommendation and Key Algorithms Based on Multi-Constraint Objectives"; Lu Guofeng; China Master's Theses Full-text Database (Information Science and Technology); 2016-03-15; I138-8020 *
Peng Xiao; Mingyue Liu et al. "Approach for Multi-Attribute Decision Making Based on Gini Aggregation Operator and Its Application to Carbon Supplier Selection". IEEE Access. 2019. *
"Constrained Multi-Objective Optimization Algorithm with an Improved Selection Strategy"; Yang Jingming et al.; High Technology Letters; 2019-12-15 (No. 12); pp. 1193-1200 *

Also Published As

Publication number Publication date
CN114548830A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
US20230067128A1 (en) Prioritizing security controls using a cyber digital twin simulator
US8175992B2 (en) Methods and systems for compound feature creation, processing, and identification in conjunction with a data analysis and feature recognition system wherein hit weights are summed
Li et al. A systematic approach to heterogeneous multiattribute group decision making
US8965896B2 (en) Document clustering system, document clustering method, and recording medium
US7966331B2 (en) Method and system for assessing and optimizing crude selection
US20020082778A1 (en) Multi-term frequency analysis
Wu et al. Making paper reviewing robust to bid manipulation attacks
Sbihi A cooperative local search-based algorithm for the multiple-scenario max–min knapsack problem
CN114548830B (en) Selection operator determining method, strategy combination optimizing method and device
CN112801231B (en) Decision model training method and device for business object classification
Neshatian et al. Genetic programming for feature subset ranking in binary classification problems
CN114492214B (en) Method and device for determining selection operator and optimizing strategy combination by using machine learning
US20210174228A1 (en) Methods for processing a plurality of candidate annotations of a given instance of an image, and for learning parameters of a computational model
CN115587884A (en) User loan default prediction method based on improved extreme learning machine
Echeberria-Barrio et al. Deep learning defenses against adversarial examples for dynamic risk assessment
US11307867B2 (en) Optimizing the startup speed of a modular system using machine learning
CN113205185A (en) Network model optimization method and device, computer equipment and storage medium
Tomášek et al. Using one-sided partially observable stochastic games for solving zero-sum security games with sequential attacks
Khoshgoftaar et al. A Comparative Study of Different Strategies for Predicting Software Quality.
CN116362886A (en) Strategy subset selection method and device under multiple evaluation indexes
CN114978616B (en) Construction method and device of risk assessment system, and risk assessment method and device
US20210326664A1 (en) System and Method for Improving Classification in Adversarial Machine Learning
Kumar et al. Smart Contract Security: A Review with a Focus on Decentralized Finance
CN114418772A (en) Optimization method and device of strategy combination
CN114493885A (en) Optimization method and device of strategy combination

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant