CN112785144A - Model construction method, device and storage medium based on federated learning

Info

Publication number: CN112785144A
Application number: CN202110067286.5A
Authority: CN
Inventors: 王洵湉; 李月
Original assignee: WeBank Co Ltd
Current assignee: WeBank Co Ltd
Priority date: 2021-01-18
Filing date: 2021-01-18
Publication date: 2021-05-11
Anticipated expiration: 2041-01-18
Also published as: CN112785144B

Abstract

The present application discloses a model construction method, device and storage medium based on federated learning. The method includes: when an evaluation model construction instruction is detected, determining a business scenario and a business goal based on the evaluation model construction instruction; determining target data based on the business scenario and the business goal, and performing feature mining and feature selection on the target data to obtain target feature data; and, based on the target feature data, executing a preset federated process to carry out federated modeling with other second participants and obtain a target model. In this application, feature mining and feature selection are carried out according to the specific business scenario and business goal, so that target feature data consistent with the actual business is accurately obtained; federated modeling is then carried out with the other second participants to obtain the target model, which improves evaluation accuracy for enterprises with actual business usage scenarios.

Description

Model construction method, device and storage medium based on federated learning

Technical Field

The application relates to the field of artificial intelligence, and in particular to a model construction method, device and storage medium based on federated learning.

Background

With the continuous development of financial technology, particularly internet technology, more and more technologies (such as distributed computing, blockchain and artificial intelligence) are applied in the financial field. The financial industry in turn places higher requirements on these technologies; for example, it places higher requirements on the construction of evaluation models for enterprise evaluation.

With the state strongly supporting small and medium-sized enterprises, evaluation models for enterprise evaluation have emerged. Existing service providers of such models have the following characteristics. Taking Tianyancha ("sky-eye search") as an example, the public data it uses to build its model comes from sources such as the national enterprise credit information publicity system, the national court judgment documents website, the national enforcement information disclosure website, the trademark office of the national intellectual property administration, and the copyright bureau. Although the data used to build the model is public, the evaluation factors and algorithms used in building the model are not disclosed, so for an organization with an actual business usage scenario, the model suffers from scenario mismatch and inaccurate evaluation.

Disclosure of Invention

The main purpose of the application is to provide a model construction method, device and storage medium based on federated learning, in order to solve the technical problem that a single unified evaluation model is currently applied to every enterprise, which in practice tends to cause scenario mismatch and inaccurate evaluation.

In order to achieve the above object, the present application provides a model building method based on federal learning, which is applied to a first participant, and the model building method based on federal learning includes:

when an evaluation model building instruction is detected, determining a service scene and a service target based on the evaluation model building instruction;

determining target data based on the service scene and the service target, and performing feature mining and feature selection on the target data to obtain target feature data;

and based on the target characteristic data, performing a preset federal flow to realize federal modeling with other second participants and obtain a target model.

Optionally, the step of determining target data based on the service scenario and the service target, and performing feature mining and feature selection on the target data to obtain target feature data includes:

selecting a characteristic factor from a preset characteristic factor library based on the service scene and the service target;

selecting target data matched with the attribute of the characteristic factor from preset self data and public data;

carrying out data preprocessing on the target data to obtain preprocessed data;

and carrying out self-owned feature mining and self-owned feature selection on the preprocessed data, and carrying out public feature mining and public feature selection on the preprocessed data to obtain target feature data.

Optionally, after the step of executing a preset federated process based on the target feature data to carry out federated modeling with other second participants and obtain a target model, the method includes:

evaluating the model stability and model accuracy of the target model using preset evaluation data, and determining whether the target model passes model evaluation;

and if the target model passes model evaluation, inputting the enterprise data to be processed into the target model to obtain a grading result of the enterprise.

Optionally, the step of inputting the enterprise data to be processed into the target model to obtain a scoring result of the enterprise if the target model passes model evaluation includes:

if the target model passes the model evaluation, inputting the enterprise data to be processed of the target enterprise into the target model to obtain the model score of the target enterprise;

and acquiring the third party score of the target enterprise, and combining the model score and the third party score to obtain the scoring result of the enterprise.
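The text does not say how the model score and the third-party score are combined. A minimal sketch, assuming a simple weighted average (the function name `combine_scores` and the default weight are hypothetical, not taken from the application):

```python
def combine_scores(model_score: float, third_party_score: float,
                   model_weight: float = 0.7) -> float:
    """Blend the target model's score with a third-party score.

    The weighted average and the default weight are assumptions made for
    illustration; the source only says the two scores are combined.
    """
    if not 0.0 <= model_weight <= 1.0:
        raise ValueError("model_weight must be in [0, 1]")
    return model_weight * model_score + (1.0 - model_weight) * third_party_score


# Equal weighting of the two scores.
final_score = combine_scores(80.0, 60.0, model_weight=0.5)
```

Other combination rules (taking the minimum, rule-based overrides, and so on) would fit the same step equally well.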

Optionally, the step of evaluating the stability and accuracy of the model of the target model by using preset evaluation data, and determining whether the target model passes the model evaluation includes:

performing model disturbance on the target model through disturbance data in preset evaluation data, and determining a first model error between a first prediction result obtained after disturbance and a first preset result;

if the first model error is smaller than a first preset value, determining that the target model passes stability evaluation;

performing model accuracy verification on the target model through accuracy evaluation data in preset evaluation data, and determining a second model error between a second prediction result obtained after verification and a second preset result;

and if the second model error is smaller than a second preset value, determining that the target model passes the accuracy evaluation.
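The two evaluation steps above can be sketched as follows. This is an illustrative sketch only: it assumes the model is a callable that returns a number, and it uses mean absolute error as the "model error", which the text does not specify. All names (`mean_abs_error`, `evaluate_model`) are hypothetical.

```python
from typing import Callable, Dict, Sequence

Model = Callable[[Sequence[float]], float]


def mean_abs_error(model: Model, samples, expected) -> float:
    """Average absolute gap between predictions and preset results."""
    gaps = [abs(model(x) - y) for x, y in zip(samples, expected)]
    return sum(gaps) / len(gaps)


def evaluate_model(model: Model,
                   perturbed, first_preset, first_threshold: float,
                   accuracy_data, second_preset, second_threshold: float
                   ) -> Dict[str, bool]:
    """Stability check on perturbed data, then accuracy check."""
    first_error = mean_abs_error(model, perturbed, first_preset)
    second_error = mean_abs_error(model, accuracy_data, second_preset)
    stability = first_error < first_threshold
    accuracy = second_error < second_threshold
    return {"stability": stability, "accuracy": accuracy,
            "passed": stability and accuracy}


# Toy model y = 2x + 1, standing in for the trained target model.
toy_model = lambda x: 2 * x[0] + 1
report = evaluate_model(
    toy_model,
    perturbed=[[1.01], [2.02]], first_preset=[3.0, 5.0], first_threshold=0.05,
    accuracy_data=[[0.0], [3.0]], second_preset=[1.0, 7.0], second_threshold=0.01,
)
```

Treating the model as passing only when both checks succeed is one reading of the text; the two checks could also be reported separately.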

Optionally, the step of performing a preset federal procedure based on the target feature data to realize federal modeling with other second participants and obtain a target model includes:

acquiring a preset model to be trained, and acquiring a preset training completion condition of the preset model to be trained;

and optimizing and updating model calculation intermediate variables of the preset model to be trained by combining other second participants based on the target characteristic data until the preset model to be trained meets preset training completion conditions to obtain the target model.

Optionally, the step of obtaining the target model by performing optimization updating on the model calculation intermediate variable of the preset model to be trained in combination with other second participants based on the target feature data until the preset model to be trained satisfies a preset training completion condition includes:

inputting the target characteristic data into a preset model to be trained, performing prediction processing on the target characteristic data based on the preset model to be trained to obtain a third prediction result, determining a third model error between the third prediction result and a third preset result of the target characteristic data, updating the preset model to be trained based on the third model error, and iteratively updating a model calculation intermediate variable of the preset model to be trained;

judging whether the preset model to be trained of iterative training reaches a preset replacement updating condition, if so, performing replacement updating on the model calculation intermediate variable updated by training through executing the preset longitudinal federal learning process to obtain the preset model to be trained which is updated by replacement;

and continuously carrying out iterative training and replacement updating on the preset model to be trained which is subjected to replacement updating until the preset model to be trained meets a preset training completion condition, and obtaining the target model.

The application also provides a target subject evaluation method based on federal learning, which is applied to a first participant, and the target subject evaluation method based on federal learning comprises the following steps:

determining a service scene and a service target from the received evaluation model construction instruction;

carrying out federal modeling with other second participants to obtain a target model based on the target characteristic data;

and grading the enterprise data to be processed by adopting the target model to obtain a grading result of the enterprise.

Optionally, before the step of performing scoring processing on the enterprise data to be processed by using the target subject evaluation model to obtain a scoring result of the enterprise, the method includes:

and if the target model passes the model evaluation, performing grading processing on the enterprise data to be processed by adopting the target model to obtain a grading result of the enterprise.

carrying out data preprocessing on the target data to obtain preprocessed data;

The application also provides a model building device based on federal learning, which is applied to a first participant, and the model building device based on federal learning comprises:

the detection module is used for determining a service scene and a service target based on the evaluation model construction instruction when the evaluation model construction instruction is detected;

the first determining module is used for determining target data based on the service scene and the service target, and performing feature mining and feature selection on the target data to obtain target feature data;

and the federal module is used for realizing the federal modeling with other second participants and obtaining a target model by executing a preset federal flow based on the target characteristic data.

The application also provides a target subject evaluation device based on federal learning, which is applied to a first participant, and the target subject evaluation device based on federal learning comprises:

the receiving module is used for determining a service scene and a service target from the received evaluation model building instruction;

the second determining module is used for determining target data based on the service scene and the service target, and performing feature mining and feature selection on the target data to obtain target feature data;

the first obtaining module is used for carrying out federal modeling with other second participants to obtain a target model based on the target characteristic data;

and the second acquisition module is used for carrying out grading processing on the enterprise data to be processed by adopting the target model to obtain a grading result of the enterprise.

The application also provides a model building device based on federal learning, the model building device based on federal learning is an entity device, and the model building device based on federal learning comprises: a memory, a processor, and a program of the federal learning based model building method stored in the memory and executable on the processor, the program of the federal learning based model building method being executable by the processor to implement the steps of the federal learning based model building method as described above.

The present application also provides a readable storage medium having stored thereon a program for implementing the federal learning based model building method, which when executed by a processor, implements the steps of the federal learning based model building method as described above.

The present application also provides a computer program product, comprising a computer program, which when executed by a processor, performs the steps of the above-described method of model construction based on federated learning.

Compared with the problems that in the prior art, a unified assessment model is adopted for each enterprise, and scene inconsistency and inaccurate assessment are prone to exist in practical application, the method, the equipment and the storage medium for establishing the model based on the federal learning determine a service scene and a service target based on the assessment model establishment instruction when the assessment model establishment instruction is detected; determining target data based on the service scene and the service target, and performing feature mining and feature selection on the target data to obtain target feature data; and based on the target characteristic data, performing a preset federal flow to realize federal modeling with other second participants and obtain a target model. In the application, when an evaluation model construction instruction is detected, a service scene and a service target are determined, then feature mining and feature selection are performed on the specific service scene and the specific service target, target feature data which are consistent with actual services are accurately obtained, then based on the target feature data, a preset federal process is executed, so that federal modeling with other second participants is accurately realized on the basis of massive data, and a target model is obtained, namely the evaluation model is accurately obtained, and therefore the evaluation accuracy of enterprises with actual service use scenes is improved.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, those skilled in the art can obtain other drawings from these drawings without inventive effort.

FIG. 1 is a schematic flow chart of a first embodiment of a model construction method based on federated learning according to the present application;

FIG. 2 is a detailed flowchart of step S10 in the first embodiment of the model construction method based on federated learning according to the present application;

FIG. 3 is a schematic diagram of an apparatus configuration of a hardware operating environment according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a scenario of the model construction based on federated learning according to the present application.

The objectives, features, and advantages of the present application will be further described with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

While a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in an order different than that shown. The execution subject of each embodiment of the model construction method based on federal learning can be equipment such as a smart phone, a personal computer and a server, and for convenience of description, the following embodiments take a first participant as the execution subject for explanation.

The embodiment of the application provides a model construction method based on federal learning, and in a first embodiment of the model construction method based on federal learning, referring to fig. 1, the model construction method based on federal learning is applied to a first participant, and the model construction method based on federal learning comprises the following steps:

step S10, when an evaluation model building instruction is detected, determining a service scene and a service target based on the evaluation model building instruction;

step S20, determining target data based on the service scene and the service target, and performing feature mining and feature selection on the target data to obtain target feature data;

and step S30, based on the target characteristic data, performing a preset federal flow to realize federal modeling with other second participants and obtain a target model.

The method comprises the following specific steps:

In this embodiment, the federated-learning-based model construction method is applied to a federated-learning-based model construction apparatus in the first participant; the apparatus belongs to a federated-learning-based model construction system, which in turn belongs to a federated-learning-based model construction device. When an evaluation model construction instruction is detected, a service scenario and a service target are determined based on the instruction. The evaluation model construction instruction can be triggered in the following ways:

Mode one: a user triggers the evaluation model construction instruction through a model construction interface, by clicking, touching, voice, or the like;

Mode two: a trigger program or trigger module for the evaluation model construction instruction is preset in the first participant, and it automatically triggers the instruction when a preset trigger condition is met.

It should be noted that the federated-learning-based model construction system is provided with a configuration module, through which model construction based on federated learning can be performed flexibly. Flexible model construction covers at least the following cases:

Case one: the external data and/or internal data to be acquired are flexibly determined through the configuration module;

Specifically, for example, the external data to be acquired may be third-party data from sources such as the national enterprise credit information publicity system, the national court judgment documents website, the national enforcement information disclosure website, the trademark office of the national intellectual property administration, and the copyright bureau, or public market data such as overall industry market data, industry news, peer competitors, and upstream and downstream enterprises. The internal data to be acquired may be data such as in-product activity, the number of the company's products in use, and abnormal behaviors (including risk behaviors).

Case two: flexibly determining the characteristic indexes to be acquired through a configuration module, and carrying out type division on the characteristic indexes through the configuration module, namely determining a grading modeling item;

specifically, for example, for the financial industry, the scoring modeling items may be financial associations, credit associations, enterprise business items, and the like, and for the patent industry, the scoring modeling items may be talent structure items, business scale items, and the like.

Case three: the configuration module flexibly determines other participants needing to participate in the federation, namely in the embodiment, the participants of the federation can be obtained through configuration.

Specifically, other parties that need to participate in the vertical federation may be determined, for example, based on whether they have the same user.
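As a rough illustration of choosing participants by shared users, the sketch below keeps only candidate parties whose user IDs overlap with the first participant's. It uses a plain set intersection for clarity; a real vertical federation would normally align IDs with a privacy-preserving protocol, which is outside this sketch. The names `select_participants` and `min_overlap` are hypothetical.

```python
def shared_users(own_ids, other_ids):
    """User IDs held by both parties (plain intersection, no privacy layer)."""
    return set(own_ids) & set(other_ids)


def select_participants(own_ids, candidates, min_overlap=1):
    """Keep candidate parties that share at least min_overlap users with us."""
    selected = []
    for name, ids in candidates.items():
        if len(shared_users(own_ids, ids)) >= min_overlap:
            selected.append(name)
    return selected


own = ["u1", "u2", "u3"]
candidates = {
    "bank_b": ["u2", "u3", "u9"],   # shares u2 and u3
    "insurer_c": ["u7", "u8"],      # shares nobody
}
chosen = select_participants(own, candidates, min_overlap=2)
```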

In this embodiment, the configuration module can be set up for flexible model construction based on federated learning in the following ways:

Mode one: configuration items in the configuration module are built through user selections on an interface;

Mode two: configuration items in the configuration module are built by importing them.
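The import mode could look like the following sketch, which assumes the configuration items arrive as JSON. The schema (`external_sources`, `internal_sources`, `scoring_items`, `participants`) is invented here to mirror the three cases described above and is not defined in the application.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelBuildConfig:
    """Hypothetical container for the configurable items of cases one to three."""
    external_sources: List[str] = field(default_factory=list)
    internal_sources: List[str] = field(default_factory=list)
    scoring_items: Dict[str, List[str]] = field(default_factory=dict)
    participants: List[str] = field(default_factory=list)


def import_config(text: str) -> ModelBuildConfig:
    """Build the configuration from imported JSON; missing keys stay empty."""
    raw = json.loads(text)
    return ModelBuildConfig(
        external_sources=raw.get("external_sources", []),
        internal_sources=raw.get("internal_sources", []),
        scoring_items=raw.get("scoring_items", {}),
        participants=raw.get("participants", []),
    )


EXAMPLE = """
{
  "external_sources": ["enterprise_credit_system", "industry_news"],
  "internal_sources": ["product_activity"],
  "scoring_items": {"finance": ["financial_association", "credit_association"]},
  "participants": ["bank_b"]
}
"""
config = import_config(EXAMPLE)
```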

When an evaluation model construction instruction is detected, a service scenario and a service target are determined based on the instruction. The service scenario may be a financing scenario, a small loan scenario, a credit loan scenario, a patent service scenario, and the like; the service target may be whether to grant a loan, how difficult financing is, whether there is patent litigation, and the like. The service scenario and the service target are carried in the evaluation model construction instruction.

Target data is then determined based on the service scenario and the service target. Specifically, a data source party is determined based on the service scenario and the service target, and target data is selected from that data source party based on the service scenario and the service target. After the target data is determined, feature mining and feature selection are performed on the target data, specifically based on the evaluation model construction instruction, to obtain the target feature data.

The step of determining target data based on the service scene and the service target, and performing feature mining and feature selection on the target data to obtain target feature data includes:

step S21, selecting characteristic factors from a preset characteristic factor library based on the service scene and the service target;

In this embodiment, after the service scenario and the service target are obtained, feature factors are selected from a preset feature factor library (including all banking factors) based on the service scenario and the service target, or all feature factors are selected directly. Specifically, the feature factors may be a third party's basic credit score factors, industry trend feature factors, and product performance feature factors. A feature factor may itself be a modeling item or a superordinate concept of modeling items.

For example, if the service scenario is a credit investigation and loan scenario and the service target is to determine whether to issue the loan, the feature factors corresponding to that scenario and target, such as a credit score factor, an expected loan factor, a deposit income factor and an academic experience factor, are selected from the preset feature factor library.
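A minimal sketch of step S21, assuming the feature factor library is a lookup keyed by (scenario, target). The library contents, the key names, and the fallback to all factors when no entry matches are illustrative assumptions; the text only says factors are selected by scenario and target, or all factors are selected.

```python
# Hypothetical factor library keyed by (service scenario, service target).
FACTOR_LIBRARY = {
    ("credit_loan", "issue_loan"): [
        "credit_score", "expected_loan", "deposit_income", "academic_experience",
    ],
    ("patent_service", "patent_litigation"): [
        "talent_structure", "business_scale",
    ],
}


def select_factors(scenario, target, library=FACTOR_LIBRARY):
    """Return the factors for a scenario/target pair, or all factors if unknown."""
    key = (scenario, target)
    if key in library:
        return list(library[key])
    # Fallback (assumption): select every factor, de-duplicated in order.
    all_factors = []
    for factors in library.values():
        for factor in factors:
            if factor not in all_factors:
                all_factors.append(factor)
    return all_factors


loan_factors = select_factors("credit_loan", "issue_loan")
```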

Step S22, selecting target data matched with the attribute of the characteristic factor from preset own data and public data;

Target data matching the attributes of the feature factors is selected from the preset own data and public data. Specifically, first target sub-data matching the attributes of the basic credit score factor is selected from preset public data (preset third-party data) based on the basic credit score factor; second target sub-data matching the attributes of the industry trend feature factor is selected from preset public data based on the industry trend feature factor; and third target sub-data matching the attributes of the product performance feature factor is selected from preset own data based on the product performance feature factor. Attribute matching may mean either of the following: first, the target (sub-)data carries a name or keyword corresponding to the feature factor; second, the target data contains data content (a field value) corresponding to the feature factor. The first, second and third target sub-data are combined to obtain the target data.
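The two attribute-matching rules can be sketched as below. The dataset layout (a `name`, a list of `keywords`, and a `fields` mapping) is assumed for illustration, and for brevity the sketch searches public and own data together rather than routing each factor to one pool as the text does.

```python
def matches_factor(dataset, factor):
    """True if the dataset is tagged with the factor or holds a value for it."""
    by_keyword = factor in dataset.get("keywords", [])
    by_content = dataset.get("fields", {}).get(factor) is not None
    return by_keyword or by_content


def select_target_data(factors, public_data, own_data):
    """Names of the datasets that match at least one selected feature factor."""
    target = []
    for dataset in public_data + own_data:
        if any(matches_factor(dataset, f) for f in factors):
            target.append(dataset["name"])
    return target


public_data = [
    {"name": "third_party_credit", "keywords": ["basic_credit_score"], "fields": {}},
    {"name": "industry_news", "keywords": [], "fields": {"industry_trend": 0.3}},
    {"name": "weather", "keywords": [], "fields": {"rain": 1}},
]
own_data = [
    {"name": "product_usage", "keywords": ["product_performance"],
     "fields": {"daily_active": 50000}},
]
factors = ["basic_credit_score", "industry_trend", "product_performance"]
target_data = select_target_data(factors, public_data, own_data)
```

Filtering by factor before any preprocessing is what keeps the unrelated `weather` dataset out of the later steps.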

In this embodiment, based on the service scene and the service target, the feature factor is selected from the preset feature factor library, and then the target data is selected from the preset own data and the public data to perform subsequent preprocessing, instead of performing subsequent preprocessing directly based on the preset own data and the public data, so that the data processing amount can be reduced, and the data processing efficiency can be improved.

Step S23, carrying out data preprocessing on the target data to obtain preprocessed data;

Data preprocessing is performed on the target data to obtain preprocessed data. The preprocessing includes abnormal value correction, missing value removal, and the like. In this embodiment, the purpose of the preprocessing is to keep abnormal and missing values from affecting the model training result.
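A small sketch of the two preprocessing operations named above. It assumes records are dictionaries, treats `None` as a missing value, and corrects abnormal values by clipping them to per-field bounds; clipping is one possible correction chosen for illustration, since the text does not say how abnormal values are corrected.

```python
def preprocess(rows, bounds):
    """Drop rows with missing values, then clip each bounded field.

    rows   -- list of dict records
    bounds -- {field: (low, high)} limits used to correct abnormal values
    """
    cleaned = []
    for row in rows:
        # Missing value removal: skip records lacking any bounded field.
        if any(row.get(name) is None for name in bounds):
            continue
        fixed = dict(row)
        # Abnormal value correction: clip into [low, high] (assumed rule).
        for name, (low, high) in bounds.items():
            fixed[name] = min(max(row[name], low), high)
        cleaned.append(fixed)
    return cleaned


rows = [
    {"company": "a", "revenue": 5.0},
    {"company": "b", "revenue": None},   # missing value, removed
    {"company": "c", "revenue": 1e9},    # abnormal value, clipped
]
preprocessed = preprocess(rows, {"revenue": (0.0, 100.0)})
```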

And step S24, performing self-owned feature mining and self-owned feature selection on the preprocessed data, and performing public feature mining and public feature selection on the preprocessed data to obtain target feature data.

Own feature mining and own feature selection are performed on the preprocessed data, and public feature mining and public feature selection are performed on the preprocessed data. Specifically, forward own feature mining and forward own feature selection, and forward public feature mining and forward public feature selection, are performed on the preprocessed data to obtain the target feature data. A forward public feature is a feature in the public feature data that influences or is relevant to the business target, and a forward own feature is a feature in the own feature data that influences or is relevant to the business target. For example, if the conversion rate of an advertisement on a certain platform is to be obtained, the forward features are features of the data collected from users after they watched the advertisement, not features of all user data. In this embodiment, it should be noted that feature mining refers to obtaining model construction item data, and feature selection refers to selecting the required feature data from the model construction item data; feature mining and feature selection may involve processing such as natural language processing, unstructured data processing, customer behavior sequence processing, and industry cyclicality processing.

In this embodiment, the target feature data is obtained by performing own feature mining and own feature selection, and public feature mining and public feature selection, on the preprocessed data. Specifically, the target feature data refers to the model construction items and their specific values; for example, the model construction item is product activity, and its specific value is 50,000 daily active users. Because feature mining and feature selection are applied to both own and public data, data that influences or is relevant to the business target can be selected to train the model, which improves the accuracy of the model.

After the target feature data is obtained, the other second participants taking part in the federation are determined, and a preset federated process is executed based on the target feature data to carry out federated modeling with the other second participants and obtain a target model.

In this embodiment, after obtaining the target model, the result of enterprise evaluation may be obtained based on the target model, and the evaluation result or output item of enterprise evaluation is a score, a label (e.g., risk user, potential user, high-value user, sleeping user, growing user, etc.), or a ranking, etc.

The step of carrying out a preset federal flow based on the target characteristic data to realize federal modeling with other second participants and obtain a target model comprises the following steps:

step S31, acquiring a preset model to be trained, and acquiring a preset training completion condition of the preset model to be trained;

in this embodiment, before performing federal training, a preset model to be trained needs to be obtained first, and a preset training completion condition of the preset model to be trained is obtained, where the preset model to be trained is a basic model for federal learning.

And step S32, based on the target characteristic data, combining other second participants to optimize and update the model calculation intermediate variable of the preset model to be trained until the preset model to be trained meets a preset training completion condition, and obtaining the target model.

Based on the target feature data, the model calculation intermediate variables of the preset model to be trained are optimized and updated jointly with the other second participants until the preset model to be trained meets the preset training completion condition, yielding the target model. That is, a preset vertical (longitudinal) federated learning process is executed based on the target feature data to obtain aggregated model parameters that satisfy the preset iterative training termination condition, and thereby the target model.

The step of obtaining the target model by optimizing and updating the model calculation intermediate variable of the preset model to be trained by combining with other second participants based on the target feature data until the preset model to be trained meets a preset training completion condition includes:

step M1, inputting the target characteristic data into a preset model to be trained, so as to perform prediction processing on the target characteristic data based on the preset model to be trained to obtain a third prediction result, determining a third model error between the third prediction result and a third preset result of the target characteristic data, and updating the preset model to be trained based on the third model error to iteratively update a model calculation intermediate variable of the preset model to be trained;

in this embodiment, the target feature data is input into a preset model to be trained, so as to perform prediction processing on the target feature data based on the preset model to be trained, obtain a third prediction result, determine a third model error between the third prediction result and a third preset result of the target feature data, and update the preset model to be trained based on the third model error, so as to train and update a model calculation intermediate variable of the preset model to be trained.

In particular, the model calculation intermediate variables may be model parameters, or gradients, etc.

Step M2, judging whether the preset model to be trained of iterative training meets a preset replacement updating condition, if the preset model to be trained meets the preset replacement updating condition, replacing and updating the model calculation intermediate variable updated by training through executing the preset longitudinal federal learning process, and obtaining the preset model to be trained which is replaced and updated;

It is judged whether the iteratively trained preset model to be trained reaches the preset replacement updating condition. The preset replacement updating condition may be that the preset model to be trained reaches a first number of training iterations, for example 1 or 500 iterations. If the condition is not reached, the process returns to the step of inputting the target feature data into the preset model to be trained, so as to perform prediction processing on the target feature data based on the preset model to be trained, obtain a third prediction result, and determine a third model error between the third prediction result and the third preset result of the target feature data, and this is repeated until the iteratively updated preset model to be trained reaches the preset replacement updating condition. If the preset model to be trained reaches the preset replacement updating condition, the model calculation intermediate variables updated by training are replaced and updated by executing the preset longitudinal federated learning process, yielding the replaced and updated preset model to be trained. The replacement and update may proceed as follows: as shown in fig. 4, each participant sends its model calculation intermediate variables to the intermediate party, the intermediate party aggregates the model calculation intermediate variables of all participants to obtain aggregated variables and sends the aggregated variables back to each participant, and each participant thereby obtains the replaced and updated preset model to be trained and continues to train it.

And step M3, continuously performing iterative training and replacement updating on the preset model to be trained, which is subjected to replacement updating, until the preset model to be trained meets a preset training completion condition, and obtaining the target model.

The first participant continues to perform iterative training and replacement updating on the replaced and updated preset model to be trained until the preset model to be trained meets the preset training completion condition (for example, the training reaches a second preset number of iterations, or a preset loss function converges), and obtains the target model.
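Steps M1 to M3 can be illustrated with a minimal sketch. It is not the implementation of this application: it assumes a linear model, plain averaging as the aggregation performed by the intermediate party, and participants that share one feature layout so that parameters can be averaged directly. The feature partitioning and secure exchange of a real longitudinal federated learning process are omitted, and the names `local_step`, `aggregate`, `train`, `replace_every`, `max_rounds` and `tol` are hypothetical.

```python
def local_step(weights, rows, targets, lr=0.1):
    """Step M1: predict, measure the model error, update the parameters."""
    n = len(rows)
    errors = [sum(w * x for w, x in zip(weights, row)) - t
              for row, t in zip(rows, targets)]
    grads = [sum(e * row[j] for e, row in zip(errors, rows)) / n
             for j in range(len(weights))]
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    loss = sum(e * e for e in errors) / n  # mean squared error before the update
    return new_weights, loss


def aggregate(all_weights):
    """Intermediate party: element-wise mean (an assumed aggregation rule)."""
    return [sum(col) / len(all_weights) for col in zip(*all_weights)]


def train(parties, dim, replace_every=5, max_rounds=500, tol=1e-8):
    """Alternate local training (M1) with replacement updates (M2) until done (M3)."""
    weights = [[0.0] * dim for _ in parties]
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        losses = []
        for i, (rows, targets) in enumerate(parties):
            weights[i], loss = local_step(weights[i], rows, targets)
            losses.append(loss)
        # Step M2: replacement updating condition = every `replace_every` rounds.
        if rounds % replace_every == 0:
            shared = aggregate(weights)
            weights = [list(shared) for _ in parties]
        # Step M3: completion condition = loss convergence (or max_rounds reached).
        if max(losses) < tol:
            break
    return aggregate(weights), rounds


# Toy data: both participants observe y = 0.5 + 2 * x (first column is a bias term).
party_a = ([[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]], [-1.5, 0.5, 2.5])
party_b = ([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]], [-3.5, -1.5, 2.5, 4.5])
model, rounds_used = train([party_a, party_b], dim=2)
```

In this sketch the replacement updating condition and the training completion condition are kept as separate checks, mirroring the distinction between the first iteration count of step M2 and the completion condition of step M3.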

After the step of performing a preset federal procedure to realize federal modeling with other second participants and obtain a target model based on the target feature data, the method comprises the following steps:

step S40, carrying out model stability and accuracy evaluation on the target model through preset evaluation data, and determining whether the target model passes model evaluation;

In this embodiment, after the target model is obtained, the model stability and accuracy of the target model are evaluated through preset evaluation data to determine whether the target model passes model evaluation. Specifically, the model stability of the target model is evaluated through the disturbance data in the preset evaluation data to determine whether the target model passes the stability evaluation, and the model accuracy of the target model is evaluated through the accuracy evaluation data in the preset evaluation data to determine whether the target model passes the accuracy evaluation. If the target model passes both the accuracy evaluation and the stability evaluation, it is determined that the target model passes model evaluation.

And step S50, if the target model passes the model evaluation, inputting the enterprise data to be processed into the target model to obtain the scoring result of the enterprise.

If the target model passes model evaluation, the to-be-processed enterprise data is input into the target model to obtain a scoring result (or an evaluation result) of the enterprise. It should be noted that, in this embodiment, the to-be-processed enterprise data includes third-party data, such as credit data; that is, the target model is a model constructed from internal data and external data (including third-party data). Therefore, by inputting the to-be-processed enterprise data (including third-party data) into the target model, the scoring result of the enterprise can be accurately obtained. In other words, in this embodiment, the third-party data may be part of the model construction items.

Further, based on the first embodiment of the present application, in another embodiment of the present application, the step of evaluating the stability and accuracy of the target model by presetting evaluation data, and determining whether the target model passes the model evaluation includes:

step N1, performing model disturbance on the target model through disturbance data in preset evaluation data, and determining a first model error between a first prediction result obtained after disturbance and a first preset result;

step N2, if the error of the first model is smaller than a first preset value, determining that the target model passes stability evaluation;

In this embodiment, specifically, model disturbance is performed on the target model through the disturbance data in the preset evaluation data, and disturbance verification data is processed by the disturbed model obtained after the disturbance. A first prediction result, obtained when the disturbed model predicts the disturbance verification data, is determined; a first preset result of the disturbance verification data is obtained; and a first model error between the first prediction result and the first preset result is determined. If the first model error is less than or equal to a first preset value, it is determined that the target model passes the stability evaluation, and if the first model error is greater than the first preset value, it is determined that the target model fails the stability evaluation.

Step N3, performing model accuracy verification on the target model through accuracy evaluation data in preset evaluation data, and determining a second model error between a second prediction result obtained after verification and a second preset result;

and step N4, if the error of the second model is smaller than a second preset value, determining that the target model passes the accuracy evaluation.

In this embodiment, specifically, model accuracy verification is performed on the target model through accuracy evaluation data in preset evaluation data, a second model error between a second prediction result obtained after verification and a second preset result is determined, if the second model error is less than or equal to a second preset value, it is determined that the target model passes accuracy evaluation, and if the second model error is greater than the second preset value, it is determined that the target model does not pass accuracy evaluation.
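The two threshold checks of steps N1 to N4 can be sketched as follows. This is an illustration under stated assumptions rather than the method of this application: the text does not fix an error metric, so mean absolute error is assumed; the disturbance is modelled simply as a set of disturbed inputs with known preset results; and the names `mean_abs_error` and `evaluate_model` are hypothetical. The comparisons use "less than or equal to", following the wording of this embodiment.

```python
def mean_abs_error(predictions, preset_results):
    """Assumed error metric: mean absolute difference."""
    return sum(abs(p - e) for p, e in zip(predictions, preset_results)) / len(preset_results)


def evaluate_model(predict, disturbance_set, accuracy_set,
                   first_preset_value, second_preset_value):
    """Return (passes_stability, passes_accuracy, passes_model_evaluation)."""
    dist_inputs, first_preset_results = disturbance_set
    acc_inputs, second_preset_results = accuracy_set

    # Steps N1/N2: first model error on the disturbance data vs. first preset value.
    first_error = mean_abs_error([predict(x) for x in dist_inputs], first_preset_results)
    stable = first_error <= first_preset_value

    # Steps N3/N4: second model error on the accuracy data vs. second preset value.
    second_error = mean_abs_error([predict(x) for x in acc_inputs], second_preset_results)
    accurate = second_error <= second_preset_value

    # The model passes evaluation only if both checks pass.
    return stable, accurate, stable and accurate


# Toy model y = 2x: small error on disturbed inputs, larger error on the accuracy set.
result = evaluate_model(lambda x: 2 * x,
                        disturbance_set=([1.0, 2.1], [2.0, 4.0]),
                        accuracy_set=([3.0], [6.5]),
                        first_preset_value=0.25,
                        second_preset_value=0.25)
```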

In this embodiment, after stability and accuracy of the target model are evaluated, data prediction is performed to ensure that the model is effective and accurate for a long time.

In the embodiment, model disturbance is performed on the target model through disturbance data in preset evaluation data, and a first model error between a first prediction result obtained after disturbance and a first preset result is determined; if the error of the first model is smaller than a first preset value, determining that the target model passes stability evaluation; performing model accuracy verification on the target model through accuracy evaluation data in preset evaluation data, and determining a second model error between a second prediction result obtained after verification and a second preset result; and if the error of the second model is smaller than a second preset value, determining that the target model passes the accuracy evaluation. In this embodiment, before enterprise evaluation, accuracy and stability evaluation is also performed on the target model, so as to ensure accuracy of enterprise evaluation.

Further, based on the first embodiment and the second embodiment in the present application, in another embodiment of the present application, the step of inputting the to-be-processed enterprise data into the target model to obtain a scoring result of the enterprise when the target model passes through model evaluation includes:

step P1, if the target model passes the model evaluation, inputting the enterprise data to be processed of the target enterprise into the target model to obtain the model score of the target enterprise;

and P2, acquiring the third party score of the target enterprise, and combining the model score and the third party score to obtain the scoring result of the enterprise.

In this embodiment, it should be noted that the model construction items may not include third-party data; for example, they may not include the credit score or credit assessment score of the user. In that case, the scoring result of the enterprise consists of two parts: the model score and the third-party score. Therefore, if the target model passes model evaluation, the to-be-processed enterprise data of the target enterprise is input into the target model to obtain the model score of the target enterprise. After the model score is obtained, the third-party score of the target enterprise is acquired, and the model score and the third-party score are combined to obtain the scoring result of the enterprise. The combination may specifically be performed as follows: the scoring proportions of the model score and the third-party score are determined respectively, and the scoring result of the enterprise is calculated based on those proportions.
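As a minimal sketch of the proportion-based combination, the following assumes that the two scoring proportions are non-negative and sum to 1, which the text does not state. The function name `combine_scores` and the parameter `model_proportion` are hypothetical.

```python
def combine_scores(model_score, third_party_score, model_proportion):
    """Weighted combination of the model score and the third-party score.

    Assumption: the third-party proportion is 1 - model_proportion.
    """
    if not 0.0 <= model_proportion <= 1.0:
        raise ValueError("model_proportion must lie in [0, 1]")
    third_party_proportion = 1.0 - model_proportion
    return model_proportion * model_score + third_party_proportion * third_party_score


# Example: 75% weight on the model score, 25% on the third-party score.
final_score = combine_scores(80.0, 60.0, model_proportion=0.75)
```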

In this embodiment, if the target model passes model evaluation, to-be-processed enterprise data of a target enterprise is input into the target model, so as to obtain a model score of the target enterprise; and acquiring the third party score of the target enterprise, and combining the model score and the third party score to obtain the scoring result of the enterprise. In the embodiment, the scoring result of the enterprise can be accurately obtained.

Further, based on the first, second, and third embodiments, a fourth embodiment of the model construction method based on federated learning according to the present application is provided, and in this embodiment, the model construction method based on federated learning includes:

step A10, determining a service scene and a service target from the received evaluation model building instruction;

in this embodiment, the service scenario and the service target are determined in advance from the received evaluation model building instruction, and the specific analysis process may refer to the determination process of the service scenario and the service target in the first embodiment.

Step A20, determining target data based on the service scene and the service target, and performing feature mining and feature selection on the target data to obtain target feature data;

after the service scene and the service target are obtained, target data are determined based on the service scene and the service target, and feature mining and feature selection are carried out on the target data to obtain target feature data. The determination may be performed according to the method for determining the target feature data in the first embodiment described above to obtain the target feature data.

Further, the step a20 includes:

step A201, selecting a characteristic factor from a preset characteristic factor library based on the service scene and the service target;

firstly, selecting characteristic factors from a preset characteristic factor library according to a service scene and a service target, namely, firstly, selecting modeling items and the like from the preset characteristic factor library according to the service scene and the service target.

The process of selecting the feature factor from the preset feature factor library according to the service scenario and the service objective may refer to step S21 in the first embodiment.

Step A202, selecting target data matched with the attribute of the characteristic factor from preset self data and public data;

selecting target data matching the attribute of the feature factor from preset own data and public data may refer to step S22 in the above-described first embodiment.

Step A203, performing data preprocessing on the target data to obtain preprocessed data;

the process of obtaining the preprocessed data may refer to step S23 in the first embodiment described above.

Step A204, performing self-owned feature mining and self-owned feature selection on the preprocessed data, and performing public feature mining and public feature selection on the preprocessed data to obtain target feature data.

The above step S24 in the first embodiment may be referred to for performing self-feature mining and self-feature selection on the preprocessed data, and performing public feature mining and public feature selection on the preprocessed data to obtain target feature data.

Step A30, carrying out federal modeling with other second participants to obtain a target model based on the target characteristic data;

based on the target feature data, the specific process of obtaining the target model by federally modeling with other second participants can refer to the first embodiment.

And A40, adopting the target model to perform grading processing on the enterprise data to be processed to obtain a grading result of the enterprise.

The specific process of obtaining the scoring result of the enterprise by scoring the enterprise data to be processed by using the target model may refer to the first embodiment and the third embodiment.

Before the step of scoring the enterprise data to be processed by adopting the target subject evaluation model to obtain the scoring result of the enterprise, the method comprises the following steps:

step A01, carrying out model stability and accuracy evaluation on the target model through preset evaluation data, and determining whether the target model passes model evaluation;

and A02, if the target model passes the model evaluation, performing grading processing on the enterprise data to be processed by adopting the target model to obtain a grading result of the enterprise.

Before the enterprise data to be processed is evaluated by adopting the target model, the stability and the accuracy of the model are evaluated on the target model through preset evaluation data, and the specific process can refer to the second embodiment.
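The order of steps A10 to A40, with the evaluation gate of steps A01 and A02, can be summarized in a short sketch. Every callable passed in (`parse_instruction`, `build_features`, `federated_fit`, `passes_evaluation`) is a hypothetical placeholder for the corresponding step and not an interface defined by this application.

```python
def score_enterprise(instruction, enterprise_data, *, parse_instruction,
                     build_features, federated_fit, passes_evaluation):
    """Run the fourth-embodiment flow; return None if the model fails evaluation."""
    scenario, goal = parse_instruction(instruction)   # step A10
    features = build_features(scenario, goal)         # step A20
    model = federated_fit(features)                   # step A30
    if not passes_evaluation(model):                  # step A01
        return None
    return model(enterprise_data)                     # steps A02 / A40
```

Returning `None` when evaluation fails is a design choice of this sketch; the text only specifies what happens when the target model passes model evaluation.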

In the prior art, a unified evaluation model is adopted for every enterprise, so the model easily fails to match the actual scene and the evaluation is inaccurate in practical application. By contrast, in the present application, the service scene and the service target are determined from the received evaluation model building instruction; target data is determined based on the service scene and the service target, and feature mining and feature selection are performed on the target data to obtain target feature data; federated modeling is carried out with other second participants based on the target feature data to obtain a target model; and the enterprise data to be processed is scored by adopting the target model to obtain a scoring result of the enterprise. In this embodiment, the service scene and the service target are determined first; feature mining and feature selection are then performed for that specific service scene and service target, so that target feature data conforming to the actual service is accurately obtained. Then, based on the target feature data, a preset federated process is executed so that federated modeling with other second participants is realized on the basis of mass data and the target model, that is, the evaluation model, is accurately obtained. Finally, the enterprise data to be processed is scored to obtain the scoring result of the enterprise, which improves the evaluation accuracy for enterprises in actual service usage scenes.

Referring to fig. 3, fig. 3 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present application.

As shown in fig. 3, the model building apparatus based on federal learning may include: a processor 1001, such as a CPU, a memory 1005, and a communication bus 1002. The communication bus 1002 is used for realizing connection communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a memory device separate from the processor 1001 described above.

Optionally, the federated learning based model building device may further include a user interface, a network interface, a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and the like. The user interface may comprise a display screen (Display) and an input sub-module such as a keyboard (Keyboard), and optionally may also comprise a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface).

Those skilled in the art will appreciate that the federated learning-based model building apparatus architecture illustrated in FIG. 3 does not constitute a limitation on federated learning-based model building apparatuses, and may include more or fewer components than those illustrated, or some components in combination, or a different arrangement of components.

As shown in fig. 3, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, and a model building method program based on federal learning. The operating system is a program for managing and controlling hardware and software resources of the model building device based on the federal learning, and supports the operation of the model building method program based on the federal learning and other software and/or programs. The network communication module is used for realizing communication among components in the memory 1005 and communication with other hardware and software in the model building method system based on the federal learning.

In the model building apparatus based on federal learning shown in fig. 3, the processor 1001 is configured to execute a program of a model building method based on federal learning stored in the memory 1005, and implement the steps of any one of the above-mentioned model building methods based on federal learning.

The specific implementation manner of the model building device based on the federal learning in the application is basically the same as that of each embodiment of the model building method based on the federal learning, and is not described herein again.

The embodiment of the present application further provides a model building apparatus based on federal learning, which is applied to a first participant, and the model building apparatus based on federal learning includes:

Optionally, the first determining module includes:

the first determining unit is used for selecting the characteristic factors from a preset characteristic factor library based on the service scene and the service target;

the first selection unit is used for selecting target data matched with the attribute of the characteristic factor from preset self data and public data;

the first acquisition unit is used for carrying out data preprocessing on the target data to obtain preprocessed data;

and the second selection unit is used for performing self-owned feature mining and self-owned feature selection on the preprocessed data, and performing public feature mining and public feature selection on the preprocessed data to obtain target feature data.

Optionally, the apparatus further comprises:

the evaluation module is used for evaluating the stability and the accuracy of the target model through preset evaluation data and determining whether the target model passes model evaluation;

and the input module is used for inputting the enterprise data to be processed into the target model to obtain a grading result of the enterprise if the target model passes the model evaluation.

Optionally, the input module comprises:

the input unit is used for inputting the to-be-processed enterprise data of the target enterprise into the target model to obtain the model score of the target enterprise if the target model passes the model evaluation;

and the second acquisition unit is used for acquiring the third party score of the target enterprise and combining the model score and the third party score to obtain the scoring result of the enterprise.

Optionally, the evaluation module comprises:

the second determining unit is used for carrying out model disturbance on the target model through disturbance data in the preset evaluation data and determining a first model error between a first prediction result obtained after disturbance and a first preset result;

a third determining unit, configured to determine that the target model passes stability evaluation if the first model error is smaller than a first preset value;

the fourth determining unit is used for carrying out model accuracy verification on the target model through accuracy evaluation data in preset evaluation data and determining a second model error between a second prediction result obtained after verification and a second preset result;

and the fifth determining unit is used for determining that the target model passes the accuracy evaluation if the error of the second model is smaller than a second preset value.

Optionally, the federation module includes:

the third acquisition unit is used for acquiring a preset model to be trained and acquiring a preset training completion condition of the preset model to be trained;

and the execution unit is used for optimizing and updating the model calculation intermediate variable of the preset model to be trained by combining other second participants based on the target characteristic data until the preset model to be trained meets a preset training completion condition, so as to obtain the target model.

Optionally, the execution unit includes:

the first training subunit is used for inputting the target characteristic data into a preset model to be trained, predicting the target characteristic data based on the preset model to be trained to obtain a third prediction result, determining a third model error between the third prediction result and a third preset result of the target characteristic data, updating the preset model to be trained based on the third model error, and iteratively updating a model calculation intermediate variable of the preset model to be trained;

the judging subunit is configured to judge whether the preset model to be trained of the iterative training meets a preset replacement update condition, and if the preset model to be trained meets the preset replacement update condition, perform replacement update on the model calculation intermediate variable updated by training by executing the preset longitudinal federal learning procedure to obtain the preset model to be trained, which is updated by replacement;

and the second training subunit is used for continuously carrying out iterative training and replacement updating on the preset model to be trained which is subjected to replacement updating until the preset model to be trained meets a preset training completion condition, so as to obtain the target model.

The specific implementation of the model building device based on federal learning in the present application is basically the same as that of each embodiment of the model building method based on federal learning, and is not described herein again.

The embodiment of the application provides a storage medium, and the storage medium stores one or more programs, which can be further executed by one or more processors for implementing the steps of any one of the above-mentioned model building methods based on federal learning.

The specific implementation of the storage medium of the present application is substantially the same as that of each embodiment of the model construction method based on federated learning, and is not described herein again.

The specific implementation of the computer program product of the present application is substantially the same as each embodiment of the above-described model building method based on federal learning, and is not described herein again.

The specific implementation of the target subject evaluation device based on federal learning in the present application is basically the same as the above-mentioned target subject evaluation method based on federal learning, and is not described herein again.

The embodiment of the application provides a storage medium, and the storage medium stores one or more programs, and the one or more programs can be further executed by one or more processors to implement the steps of any one of the target subject evaluation methods based on federal learning.

The specific implementation of the storage medium of the present application is substantially the same as each embodiment of the target subject evaluation method based on federal learning, and is not described herein again.

The present application also provides a computer program product, comprising a computer program, which when executed by a processor implements the steps of the above-described target subject evaluation method based on federal learning.

The specific implementation of the computer program product of the present application is substantially the same as each embodiment of the target subject evaluation method based on federal learning, and is not described herein again.

The above description is only a preferred embodiment of the present application, and not intended to limit the scope of the present application, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the specification and the drawings, or which are directly or indirectly applied to other related technical fields, are included in the scope of the present application.

Claims

1. A model construction method based on federated learning, characterized in that the method is applied to a first participant, and the model construction method based on federated learning comprises:

When an evaluation model construction instruction is detected, a business scenario and a business objective are determined based on the evaluation model construction instruction;

Determine target data based on the business scenario and business goals, and perform feature mining and feature selection on the target data to obtain target feature data;

Based on the target feature data, a preset federation process is executed to achieve federated modeling with other second participants and obtain a target model.

2. The model construction method based on federated learning according to claim 1, wherein the step of determining target data based on the business scenario and business goals, and performing feature mining and feature selection on the target data to obtain target feature data, comprises:

Based on the business scenario and business objective, selecting a feature factor from a preset feature factor library;

Select target data matching the attributes of the characteristic factor from preset own data and public data;

performing data preprocessing on the target data to obtain preprocessing data;

Perform self-feature mining and self-feature selection on the preprocessed data, and perform public feature mining and public feature selection on the preprocessed data to obtain target feature data.

3. The model construction method based on federated learning according to claim 1, wherein, after the step of executing a preset federation process based on the target feature data to achieve federated modeling with other second participants and obtain the target model, the method comprises:

Carry out model stability and accuracy evaluation on the target model by preset evaluation data, and determine whether the target model passes the model evaluation;

If the target model passes the model evaluation, the enterprise data to be processed is input into the target model to obtain the enterprise's scoring result.

4. The model construction method based on federated learning according to claim 3, wherein the step of inputting the enterprise data to be processed into the target model to obtain the enterprise's scoring result if the target model passes the model evaluation comprises:

If the target model passes the model evaluation, input the enterprise data of the target enterprise to be processed into the target model, and obtain the model score of the target enterprise;

Obtain the third-party score of the target enterprise, and combine the model score and the third-party score to obtain the enterprise's score result.

5. The model construction method based on federated learning as claimed in claim 3, wherein the step of evaluating the model stability and accuracy of the target model through preset evaluation data and determining whether the target model passes the model evaluation comprises:

Perform model perturbation on the target model by using the perturbation data in the preset evaluation data, and determine the first model error between the first prediction result obtained after the perturbation and the first preset result;

If the error of the first model is less than a first preset value, determining that the target model passes the evaluation of stability;

Perform model accuracy verification on the target model by using the accuracy evaluation data in the preset evaluation data, and determine the second model error between the second prediction result obtained after the verification and the second preset result;

If the error of the second model is less than a second preset value, it is determined that the target model passes the evaluation of accuracy.

6. The model construction method based on federated learning according to claim 1, wherein the step of executing a preset federation process based on the target feature data to achieve federated modeling with other second participants and obtain the target model comprises:

obtaining a preset model to be trained, and obtaining a preset training completion condition of the preset model to be trained;

Based on the target feature data, optimizing and updating the model calculation intermediate variables of the preset model to be trained jointly with other second participants, until the preset model to be trained meets the preset training completion condition, to obtain the target model.

7. The method for constructing a model based on federated learning according to claim 6, wherein, based on the target feature data, the model calculation intermediate variable of the preset model to be trained is performed in conjunction with other second participants. Optimizing and updating until the preset to-be-trained model meets preset training completion conditions, the steps of obtaining the target model include:

Inputting the target feature data into the preset model to be trained, so as to perform prediction processing on the target feature data based on the preset model to be trained and obtain a third prediction result; determining a third model error between the third prediction result and a third preset result of the target feature data; and updating the preset model to be trained based on the third model error, so as to iteratively update the model calculation intermediate variables of the preset model to be trained;

Judging whether the iteratively trained preset model to be trained meets a preset replacement and update condition, and if the preset model to be trained meets the preset replacement and update condition, replacing and updating the model calculation intermediate variables updated by training by executing a preset longitudinal federated learning process, to obtain the replaced and updated preset model to be trained;

Continuing to perform iterative training and replacement updating on the replaced and updated preset model to be trained, until the preset model to be trained meets the preset training completion condition, to obtain the target model.
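The loop described in claims 6 and 7 (local iterative training, periodic replacement through a federated exchange, and a completion condition) can be sketched as follows. This is only an illustration under stated assumptions: the model is a simple linear model whose weights stand in for the "model calculation intermediate variables", the replacement condition is taken to be "every `replace_every` iterations", the completion condition is an error tolerance or an iteration cap, and `federated_exchange` is a hypothetical placeholder callable. A real longitudinal federated learning process would exchange protected intermediate values with the second participants, which this sketch does not implement.

```python
from typing import Callable, List, Sequence


def train_with_replacement(
    features: List[Sequence[float]],
    third_preset_results: Sequence[float],
    federated_exchange: Callable[[List[float]], List[float]],
    learning_rate: float = 0.1,
    replace_every: int = 5,         # assumed replacement and update condition
    max_iterations: int = 200,      # assumed part of the training completion condition
    error_tolerance: float = 1e-4,  # assumed part of the training completion condition
) -> List[float]:
    """Return the weights of a linear model trained with periodic replacement."""
    weights = [0.0] * len(features[0])
    for iteration in range(1, max_iterations + 1):
        # Third prediction result and third model error (mean squared error, an assumption).
        predictions = [sum(w * x for w, x in zip(weights, row)) for row in features]
        residuals = [p - y for p, y in zip(predictions, third_preset_results)]
        third_model_error = sum(r * r for r in residuals) / len(residuals)
        if third_model_error < error_tolerance:
            break  # training completion condition met

        # Local update of the intermediate variables by one gradient-descent step.
        for j in range(len(weights)):
            gradient = 2.0 * sum(r * row[j] for r, row in zip(residuals, features)) / len(features)
            weights[j] -= learning_rate * gradient

        # Replacement update: hand the variables to the (placeholder) federated process.
        if iteration % replace_every == 0:
            weights = federated_exchange(weights)
    return weights


# Usage: fit y = 2x with an identity exchange that only counts how often it is called.
exchange_calls = []

def identity_exchange(current: List[float]) -> List[float]:
    exchange_calls.append(list(current))
    return current

trained = train_with_replacement(
    [[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0], identity_exchange, replace_every=1
)
```

Passing the exchange as a callable keeps the local training loop independent of whatever protocol the participants actually use for the replacement step.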

8. A target subject evaluation method based on federated learning, characterized in that, applied to a first participant, the federated learning-based target subject evaluation method comprises:

Determine business scenarios and business objectives from the received evaluation model building instructions;

Perform federated modeling with other second parties based on the target feature data to obtain a target model;

The target model is used to perform scoring processing on the data of the enterprise to be processed, and the scoring result of the enterprise is obtained.

9. The target subject evaluation method based on federated learning according to claim 8, wherein, before the step of using the target model to perform scoring processing on the data of the enterprise to be processed and obtaining the scoring result of the enterprise, the method comprises:

If the target model passes the model evaluation, the step of using the target model to score the enterprise data to be processed to obtain a score result of the enterprise is performed.
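The gating behaviour in claims 8 and 9, where scoring runs only once the target model has passed model evaluation, can be sketched in a few lines. The function name `score_enterprise`, the callable model, and the choice to return `None` when the model has not passed are assumptions made for illustration; the claims do not state what happens in the failing case.

```python
from typing import Callable, Optional, Sequence


def score_enterprise(
    target_model: Callable[[Sequence[float]], float],
    enterprise_data: Sequence[float],
    passed_model_evaluation: bool,
) -> Optional[float]:
    """Score the enterprise data only if the target model passed model evaluation."""
    if not passed_model_evaluation:
        # Assumed behaviour: no score is produced for a model that failed evaluation.
        return None
    return target_model(enterprise_data)


# Usage with a toy scoring model that sums the enterprise features.
toy_scoring_model = lambda data: sum(data)
score = score_enterprise(toy_scoring_model, [1.0, 2.0], passed_model_evaluation=True)
```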

10. The method for evaluating a target subject based on federated learning according to claim 8, wherein the step of determining the target data based on the business scenario and business target, and performing feature mining and feature selection on the target data to obtain target feature data, comprises:

performing data preprocessing on the target data to obtain preprocessing data;

11. A model building device based on federated learning, characterized in that the model building device based on federated learning comprises: a memory, a processor, and a program stored on the memory for implementing the federated learning-based model building method,

The memory is used to store a program for implementing the federated learning-based model building method;

The processor is configured to execute a program for implementing the method for constructing a model based on federated learning, so as to implement the steps of the method for constructing a model based on federated learning according to any one of claims 1 to 10.

12. A readable storage medium, wherein a program for implementing a federated learning-based model building method is stored on the readable storage medium, and the program is executed by a processor to implement the steps of the federated learning-based model building method according to any one of claims 1 to 10.

13. A computer program product, comprising a computer program, characterized in that, when the computer program is executed by a processor, the method of any one of claims 1 to 10 is implemented.