WO2021103909A1 - Method, apparatus, and electronic device for risk prediction and for training a risk prediction model - Google Patents

Method, apparatus, and electronic device for risk prediction and for training a risk prediction model

Info

Publication number
WO2021103909A1
WO2021103909A1 PCT/CN2020/124718 CN2020124718W WO2021103909A1 WO 2021103909 A1 WO2021103909 A1 WO 2021103909A1 CN 2020124718 W CN2020124718 W CN 2020124718W WO 2021103909 A1 WO2021103909 A1 WO 2021103909A1
Authority
WO
WIPO (PCT)
Prior art keywords
private data
user
target
users
prediction model
Prior art date
Application number
PCT/CN2020/124718
Other languages
English (en)
French (fr)
Inventor
陆梦倩
Original Assignee
支付宝(杭州)信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 支付宝(杭州)信息技术有限公司
Publication of WO2021103909A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities

Definitions

  • This document relates to the field of computer software technology, in particular to a method, device and electronic equipment for risk prediction and risk prediction model training.
  • the purpose of the embodiments of this specification is to provide a method, device and electronic equipment for risk prediction and risk prediction model training, so as to improve the recognition accuracy of the risk prediction model.
  • a risk prediction method, which includes: obtaining the first private data corresponding to the user identification of the target user from the user database of the enabling organization, and obtaining the second private data corresponding to the user identification of the target user from the user database of the target organization.
  • the first private data is input into the first risk level prediction model to predict the first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization; the first private data and the second private data are input into the second risk level prediction model to predict the second risk level of the target user.
  • the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group of users and the corresponding label, and the label corresponding to the private data of the target group of users is the fitting error corresponding to the target group of users in the first risk prediction model.
  • the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • a method for training a risk prediction model, which includes: obtaining private data of users of an enabling organization and private data of target group users, wherein the target group users are the common users of the enabling organization and the target organization;
  • the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users;
  • the first risk prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization;
  • the second risk prediction model is obtained through vertical federated learning training based on the private data of the target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error corresponding to the target group users in the first risk prediction model; wherein the first risk prediction model and the second risk prediction model are used to jointly identify the user's risk level.
  • a risk prediction device, which includes: an acquiring unit that acquires the first private data corresponding to the user identification of the target user from the user database of the enabling organization, and acquires the second private data corresponding to the user identification of the target user from the user database of the target organization;
  • a first prediction unit that inputs the first private data into the first risk level prediction model to predict the first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization; a second prediction unit that inputs the first private data and the second private data into the second risk level prediction model to predict the second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error corresponding to the target group users in the first risk prediction model; and a third prediction unit that predicts the risk level of the target user based on the first risk level and the second risk level; wherein the target group users are the common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • a training device for a risk prediction model, including: a data acquisition unit that acquires private data of users of an enabling organization and private data of target group users, wherein the target group users are the common users of the enabling organization and a target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users;
  • a first training unit that obtains the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization;
  • a second training unit that obtains the second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error corresponding to the target group users in the first risk prediction model;
  • wherein the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of the user.
  • an electronic device comprising: a processor; and a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the following operations: acquire the first private data corresponding to the user ID of the target user from the user database of the enabling organization, and acquire the second private data corresponding to the user ID of the target user from the user database of the target organization; input the first private data into the first risk level prediction model to predict the first risk level of the target user.
  • the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization;
  • the first private data and the second private data are input into a second risk level prediction model to predict the second risk level of the target user.
  • the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error corresponding to the target group users in the first risk prediction model; the risk level of the target user is predicted based on the first risk level and the second risk level; wherein the target group users are the common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • a computer-readable storage medium that stores one or more programs which, when executed by an electronic device including multiple application programs, cause the electronic device to perform the following operations: acquiring the first private data corresponding to the user identification of the target user from the user database of the enabling organization, and acquiring the second private data corresponding to the user identification of the target user from the user database of the target organization; inputting the first private data into a first risk level prediction model to predict the first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization; and inputting the first private data and the second private data into a second risk level prediction model to predict the second risk level of the target user.
  • the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group of users and the corresponding label, and the label corresponding to the private data of the target group of users is the fitting error corresponding to the target group of users in the first risk prediction model.
  • an electronic device including: a processor; and a memory arranged to store computer-executable instructions, which when executed, cause the processor to perform the following operations:
  • acquiring private data corresponding to the user identification of the target user; inputting the private data corresponding to the user identification of the target user into the first risk level prediction model to predict the first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization; and inputting the private data corresponding to the user identification of the target user into the second risk level prediction model to predict the second risk level of the target user.
  • the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error corresponding to the target group users in the first risk prediction model; the risk level of the target user is predicted based on the first risk level and the second risk level; wherein the target group users are the common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • a computer-readable storage medium that stores one or more programs which, when executed by an electronic device including multiple application programs, cause the electronic device to perform the following operations: acquiring private data corresponding to the user identification of the target user; inputting the private data corresponding to the user identification of the target user into the first risk level prediction model to predict the first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization; and inputting the private data corresponding to the user identification of the target user into the second risk level prediction model to predict the second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group of users and the corresponding labels.
  • the label corresponding to the private data of the target group of users is the fitting error of the target group of users in the first risk prediction model; the risk level of the target user is predicted based on the first risk level and the second risk level; wherein the target group users are the common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • one or more embodiments provided in this specification can predict the risk level of the target user with the first risk prediction model, which is obtained through isomorphic migration training based on the private data of the users of the enabling organization; they can then make a secondary prediction of the target user's risk level with the second risk prediction model, which is obtained through longitudinal federated learning training based on the private data and corresponding labels of the target group users shared by the enabling organization and the target organization, and combine the two prediction results to determine the target user's risk level.
  • the first risk prediction model and the second risk prediction model are trained to jointly predict the risk level of the target user, which improves the accuracy of risk level prediction.
  • Fig. 1 is a schematic diagram of an implementation process of a risk prediction method provided by an embodiment of this specification.
  • Fig. 2 is a schematic diagram of applying the risk prediction method provided by an embodiment of this specification in an actual scenario.
  • Fig. 3 is a schematic flowchart of a method for training a risk prediction model provided by an embodiment of this specification.
  • Fig. 4a is a schematic diagram of a model training process in a risk prediction model training method provided in an embodiment of this specification.
  • Fig. 4b is a schematic diagram of a model training process in a risk prediction model training method provided in another embodiment of this specification.
  • Fig. 5 is a schematic structural diagram of a risk prediction device provided by an embodiment of this specification.
  • Fig. 6 is a schematic structural diagram of a training device for a risk prediction model provided by an embodiment of this specification.
  • Fig. 7 is a schematic structural diagram of an electronic device provided by an embodiment of this specification.
  • Fig. 8 is a schematic structural diagram of another electronic device provided by an embodiment of this specification.
  • one or more embodiments of this specification provide a risk prediction method that predicts the target user's risk level with the first risk prediction model, obtained through isomorphic migration training based on the private data of the users of the enabling organization; makes a secondary prediction of the target user's risk level with the second risk prediction model, obtained through longitudinal federated learning training based on the private data and corresponding labels of the target group users shared by the enabling organization and the target organization; and combines the results of these two predictions to determine the risk level of the target user.
  • the first risk prediction model and the second risk prediction model are trained to jointly evaluate the risk level of the target user.
  • the second risk prediction model is trained with the fitting error of the first risk prediction model as the target.
  • the prediction results of the first risk prediction model and the second risk prediction model are integrated, which greatly improves the accuracy of predicting the user's risk level.
  • the execution subject of the risk prediction method provided in the embodiment of this specification may be, but is not limited to, a server or the like that can be configured to execute at least one of the method and apparatus provided in the embodiment of this specification.
  • the implementation of the method is introduced below by taking a server capable of executing the method as an example. It can be understood that the fact that the execution subject of the method is the server is only an exemplary description, and should not be understood as a limitation of the method.
  • Fig. 1 is a schematic diagram of an implementation process of a risk prediction method provided by an embodiment of this specification.
  • the method of FIG. 1 may include step S110 to step S140.
  • S110 Obtain the first private data corresponding to the user identification of the target user from the user database of the enabling organization, and acquire the second private data corresponding to the user identification of the target user from the user database of the target organization.
  • the first risk prediction model and the second risk prediction model are intended for the target organization, and the target organization and the enabling organization often have some users in common.
  • the first private data and the second private data may specifically include the target user's transaction data, identity data, account data, and registration data, as well as the target user's occupation, age, income, and so on.
  • the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization.
  • the first risk level may specifically be a risk score, and the value range may be [0,1].
  • the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group of users and the corresponding label, and the label corresponding to the private data of the target group of users is the fitting error corresponding to the target group of users in the first risk prediction model.
  • the target group users are the common users of the enabling organization and the target organization
  • the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • the second risk level prediction model may be specifically trained with the fitting error of the first risk level prediction model as the prediction target during training.
  • the fitting error of the first risk level prediction model is: error = true value Y - predicted value Y1.
  • S140 Predict the risk level of the target user based on the first risk level and the second risk level.
  • determining the risk level of the target user includes: determining the risk level of the target user from the first risk level and the second risk level through an additive model.
  • the additive model is formed by adding multiple base models.
  • the prediction target of the first risk prediction model is f1(x)
  • the prediction target of the second risk prediction model is Y-f1(x)
  • Y is the true value
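The additive relationship between the two models can be written out as follows. This is a sketch of the notation implied by the text; the symbols $f_2$ and $\hat{Y}$ are introduced here for illustration and do not appear in the original.

```latex
\begin{aligned}
\hat{Y} &= f_1(x) + f_2(x), \\
f_2(x) &\approx Y - f_1(x) \quad \text{(the fitting error of the first model)}, \\
\Rightarrow\quad \hat{Y} &\approx f_1(x) + \bigl(Y - f_1(x)\bigr) = Y .
\end{aligned}
```

Under this reading, the closer the second model comes to the first model's fitting error, the closer the summed prediction comes to the true value.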
  • the first risk prediction model is obtained through isomorphic migration training, where the users of the enabling organization include users of some target organizations.
  • the private data of the users of the enabling organization described herein may specifically be the private data of all users of the enabling organization.
  • S22 Obtain the first private data corresponding to the user ID of the target user from the user database of the enabling organization, and input the first private data into the first risk prediction model, so that the first risk prediction model predicts the risk level of the target user and outputs the predicted value Y1 of the target user's first risk level.
  • the target group users are the common users of the enabling organization and the target organization
  • the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • the second risk prediction model is trained by using the fitting error of the first risk prediction model as the prediction target.
  • S26 Obtain the risk level of the target user based on the additive model, and output a predicted value Y1+Y2 of the risk level of the target user.
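The prediction flow ending in S26 can be sketched in Python as below. This is a minimal illustration, not the patented implementation: the function `predict_risk`, the toy models, and the feature names `overdue_ratio` and `chargeback_ratio` are all hypothetical, and the models are plain callables standing in for trained ones.

```python
from typing import Callable, Mapping

Features = Mapping[str, float]


def predict_risk(
    first_private_data: Features,
    second_private_data: Features,
    first_model: Callable[[Features], float],
    second_model: Callable[[Features, Features], float],
) -> float:
    """Combine the two predictions additively, i.e. return Y1 + Y2."""
    # Y1: first risk level, predicted from the enabling organization's data only.
    y1 = first_model(first_private_data)
    # Y2: predicted fitting error of the first model, from both data sets.
    y2 = second_model(first_private_data, second_private_data)
    return y1 + y2


# Hypothetical stand-in models; real ones would come from training.
def toy_first_model(a: Features) -> float:
    return 0.5 * a["overdue_ratio"]


def toy_second_model(a: Features, b: Features) -> float:
    return 0.25 * b["chargeback_ratio"]


risk = predict_risk(
    {"overdue_ratio": 0.5},
    {"chargeback_ratio": 0.5},
    toy_first_model,
    toy_second_model,
)
```

Passing the models in as callables keeps the additive combination separate from how each model was trained.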
  • the one or more embodiments provided in this specification can predict the risk level of the target user with the first risk prediction model, which is obtained through isomorphic migration training based on the private data of the users of the enabling organization.
  • they can then make a secondary prediction of the target user's risk level with the second risk prediction model, which is obtained through longitudinal federated learning training based on the private data and corresponding labels of the target group users shared by the enabling organization and the target organization, and combine the two prediction results to determine the target user's risk level. Because the private data of the users of the enabling organization and the private data of the target group users shared by the enabling organization and the target organization are both fully used, the first risk prediction model and the second risk prediction model are trained to jointly predict the target user's risk level, which improves the accuracy of risk level prediction.
  • Fig. 3 is a schematic diagram of an implementation process of a method for training a risk prediction model provided by an embodiment of the present specification, including step S310 to step S330.
  • S310 Obtain private data of users of the enabling organization and private data of the target group users, where the target group users are the common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • the users of the enabling organization include the target group users.
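The notion of target group users, meaning the users common to both organizations, can be illustrated with a short Python sketch. It is a plain-text simplification under stated assumptions: the function `select_target_group`, the user IDs, and the field names are hypothetical, and a real deployment would presumably match IDs with a privacy-preserving technique such as private set intersection rather than comparing them in the clear.

```python
from typing import Dict, Mapping, Tuple

Record = Mapping[str, float]


def select_target_group(
    enabling_db: Mapping[str, Record],
    target_db: Mapping[str, Record],
) -> Dict[str, Tuple[Record, Record]]:
    """Return, for each user ID present in both databases, the pair
    (private data in the enabling organization, private data in the target organization)."""
    common_ids = sorted(enabling_db.keys() & target_db.keys())
    return {uid: (enabling_db[uid], target_db[uid]) for uid in common_ids}


# Hypothetical toy databases keyed by user ID.
enabling_db = {"u1": {"income": 1.0}, "u2": {"income": 2.0}, "u3": {"income": 3.0}}
target_db = {"u2": {"txn_count": 5.0}, "u3": {"txn_count": 7.0}, "u9": {"txn_count": 1.0}}

target_group = select_target_group(enabling_db, target_db)
```

Here `u2` and `u3` form the target group, while `u1` belongs only to the enabling organization and `u9` only to the target organization.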
  • the enabling organization expects to jointly use the private data of its own users and the private data of the target organization, while protecting the private data of both organizations, to complete the training of the risk prediction models.
  • Based on this, the embodiment of this specification uses isomorphic migration (that is, homogeneous transfer learning) and longitudinal (vertical) federated learning to obtain the first risk prediction model and the second risk prediction model respectively, and combines the two models to predict the user's risk level.
  • isomorphic migration uses only the enabling organization's own data: the private data of all its users, combined with the private data held by the enabling organization for the target group users (the users it shares with the target organization), is used to adapt the model to the target organization.
  • the private data and the private data of the target group users that can be provided by the target institution improve the accuracy of risk prediction.
  • the gray area is all the user IDs owned by the enabling organization and the corresponding private data (i.e., the source domain + target domain as shown in the figure).
  • the private data of the users contained in the target domain is the private data, held by the enabling organization, of the users shared by the enabling organization and the target organization; that is, it is the part of the data where the enabling organization and the target organization overlap.
  • the first risk prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization.
  • the common users of the enabling organization and the target organization should be identified first; the data used for them at this stage is the private data of the target group users in the enabling organization.
  • the first risk prediction model is obtained through isomorphic migration training, including: obtaining the private data of the target group users in the enabling organization; and obtaining the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization and the private data of the target group users in the enabling organization.
  • a schematic diagram of the first risk prediction model obtained through isomorphic migration training provided in the embodiment of this specification.
  • the specific process includes: first, train a neural network model on the private data of the users contained in the source domain; the embodiment of this specification does not limit the specific training method.
  • then, for each layer of the model network, calculate the mean $\mu_1$ and standard deviation $\sigma_1$ of that layer's output for the private data of the users contained in the source domain, and the mean $\mu_2$ and standard deviation $\sigma_2$ of that layer's output for the private data of the users contained in the target domain.
  • finally, use the trained model to predict on the private data of the users contained in the target domain to obtain the predicted value $U$, and unify the data distribution of the predicted value to obtain $[(U - \mu_2)/\sigma_2] \cdot \sigma_1 + \mu_1$, so that the prediction results of the first risk prediction model fall in the same range for the private data of users in the source domain and in the target domain.
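The distribution-unification step, which standardizes a target-domain value with the target-domain mean and standard deviation and then rescales it with the source-domain ones, can be sketched in Python as below. This is a minimal sketch under assumptions: the function name `align_to_source` is hypothetical, the population standard deviation is used because the text does not say which estimator is meant, and the statistics are taken over one list of outputs rather than per network layer.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def align_to_source(
    target_outputs: Sequence[float],
    source_outputs: Sequence[float],
) -> List[float]:
    """Map each target-domain output U to ((U - mu2) / sigma2) * sigma1 + mu1."""
    mu1, sigma1 = mean(source_outputs), pstdev(source_outputs)  # source domain
    mu2, sigma2 = mean(target_outputs), pstdev(target_outputs)  # target domain
    if sigma2 == 0:
        raise ValueError("target-domain outputs have zero variance")
    return [((u - mu2) / sigma2) * sigma1 + mu1 for u in target_outputs]


# Toy outputs: the source spans [0, 1], the target spans [2, 4].
source_outputs = [0.0, 1.0]
target_outputs = [2.0, 4.0]
aligned = align_to_source(target_outputs, source_outputs)
```

After alignment, the target-domain outputs have the same mean and standard deviation as the source-domain outputs.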
  • the second risk prediction model is obtained through longitudinal federated learning training, and the label corresponding to the private data of the target group of users is the fitting error corresponding to the target group of users in the first risk prediction model.
  • the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of the user.
  • one or more embodiments of this specification may also obtain a second risk prediction model through longitudinal federated learning training.
  • the second risk prediction model is obtained through longitudinal federated learning training, including: obtaining the fitting error of the first risk prediction model based on the predicted value of the first risk prediction model for the test data and the true value corresponding to the test data; and, based on the private data of the target group of users, training the second risk prediction model through longitudinal federated learning until the predicted value of the second risk prediction model approaches the fitting error of the first risk prediction model.
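The idea of training the second model on the first model's fitting error can be sketched in Python as below. This is a centralized stand-in, not vertical federated learning: in a federated setting each organization would keep its features locally and exchange only protected intermediate results. The names `fitting_errors` and `fit_line`, the single-feature least-squares model, and the toy numbers are all assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def fitting_errors(y_true: Sequence[float], y1_pred: Sequence[float]) -> List[float]:
    """Labels for the second model: true value Y minus first-model prediction Y1."""
    return [y - p for y, p in zip(y_true, y1_pred)]


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has zero variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data for the target group users (hypothetical values).
y_true = [0.0, 0.5, 1.0]          # true risk values Y
y1_pred = [0.25, 0.5, 0.75]       # first-model predictions Y1
target_feature = [0.0, 1.0, 2.0]  # a feature held by the target organization

labels = fitting_errors(y_true, y1_pred)             # Y - Y1
slope, intercept = fit_line(target_feature, labels)  # second model
y2_pred = [slope * x + intercept for x in target_feature]
combined = [y1 + y2 for y1, y2 in zip(y1_pred, y2_pred)]  # Y1 + Y2
```

In this toy case the second model fits the residuals exactly, so `combined` reproduces `y_true`; with real data the fit would only be approximate.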
  • Fig. 4b is a schematic diagram of the second risk prediction model obtained through longitudinal federated learning training provided in the embodiment of this specification.
  • the second risk prediction model is trained through longitudinal federated learning on the data of the common users of the enabling organization and the target organization, that is, the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, in such a way that the enabling organization cannot learn or reverse-engineer the target organization's data and the target organization cannot learn or reverse-engineer the enabling organization's data.
  • the one or more embodiments provided in this specification can predict the user's risk level with the first risk prediction model, which is obtained through isomorphic migration training based on the private data of the users of the enabling organization; they can then make a secondary prediction of the user's risk level with the second risk prediction model, which is obtained through longitudinal federated learning training based on the private data and corresponding labels of the target group users shared by the enabling organization and the target organization, and combine the two prediction results to determine the user's risk level. Because the private data of the users of the enabling organization and the private data of the target group users shared by the enabling organization and the target organization are both fully used, the first risk prediction model and the second risk prediction model are trained to jointly predict the user's risk level, which improves the accuracy of risk level prediction.
  • FIG. 5 is a schematic structural diagram of a risk prediction device 500 provided by an embodiment of this specification.
  • the risk prediction apparatus 500 may include the following units.
  • the acquiring unit 501 acquires the first private data corresponding to the user identification of the target user from the user database of the enabling organization, and acquires the second private data corresponding to the user identification of the target user from the user database of the target organization.
  • the first prediction unit 502 inputs the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of the users of the enabling organization.
  • the second prediction unit 503 inputs the first private data and the second private data into a second risk level prediction model to predict the second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of the target group of users and the corresponding label, and the label corresponding to the private data of the target group of users is the fitting error corresponding to the target group of users in the first risk prediction model.
  • the third prediction unit 504 predicts the risk level of the target user based on the first risk level and the second risk level.
  • the target group users are the common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • the third prediction unit 504 is configured to determine the risk level of the target user based on the first risk level and the second risk level through an additive model.
  • the risk prediction apparatus 500 can implement the methods of the method embodiments shown in FIGS. 1 to 2. For details, reference may be made to the risk prediction method of the embodiments shown in FIGS. 1 to 2, and details are not described herein again.
  • FIG. 6 is a schematic structural diagram of a training device 600 for a risk prediction model provided by an embodiment of this specification. Please refer to FIG. 6, in a software implementation, the training device 600 for a risk prediction model may include the following units.
  • the data acquiring unit 601 acquires private data of users of the enabling organization and private data of users of a target group, where the target group users are common users of the enabling organization and the target organization, and the private data of the users of the target group Including the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users.
  • the first training unit 602 obtains the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization.
  • the second training unit 603 obtains a second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users.
  • the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of the user.
  • the first training unit 602 is configured to: obtain the private data of the target group users in the enabling organization; and obtain the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization and the private data of the target group users in the enabling organization.
  • the second training unit 603 is configured to: obtain the fitting error of the first risk prediction model based on the predicted value of the first risk prediction model on the test data and the true value corresponding to the test data; and obtain a second risk prediction model through longitudinal federated learning training based on the private data of the target group users, until the predicted value of the second risk prediction model approaches the fitting error of the first risk prediction model.
  • the risk prediction model training device 600 can implement the methods of the method embodiments in FIGS. 3, 4a, and 4b. For details, please refer to the training method of the risk prediction model in the embodiments shown in FIGS. 3, 4a, and 4b, which will not be repeated here.
  • Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present specification. Please refer to FIG. 7.
  • the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory.
  • the memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk storage.
  • the electronic device may also include hardware required by other services.
  • the processor, network interface, and memory can be connected to each other through an internal bus.
  • the internal bus can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, etc.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one bidirectional arrow is used in FIG. 7, but it does not mean that there is only one bus or one type of bus.
  • the program may include program code, and the program code includes computer operation instructions.
  • the memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
  • the processor reads the corresponding computer program from the non-volatile memory to the memory and then runs it, forming a risk prediction device on a logical level.
  • the processor executes the program stored in the memory, and is specifically configured to perform the following operations: acquiring first private data corresponding to the user identifier of the target user from the user database of the enabling organization, and acquiring second private data corresponding to the user identifier of the target user from the user database of the target organization; inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of users of the enabling organization; inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting the risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • the method performed by the risk prediction apparatus disclosed in the embodiments shown in FIGS. 1 to 2 of this specification can be applied to a processor or implemented by a processor.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above method can be completed by an integrated logic circuit of hardware in the processor or instructions in the form of software.
  • the above-mentioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of this specification can be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
  • the electronic device can also execute the methods shown in FIGS. 1 to 2 and realize the functions of the embodiments of the risk prediction device shown in FIGS. 1 to 2, which are not repeated here in the embodiments of this specification.
  • the embodiments of this specification also propose a computer-readable storage medium that stores one or more programs, the one or more programs including instructions which, when executed by a portable electronic device that includes multiple application programs, enable the portable electronic device to execute the method of the embodiments shown in FIGS. 1 to 2, and specifically to perform the following operations: acquiring first private data corresponding to the user identifier of the target user from the user database of the enabling organization, and acquiring second private data corresponding to the user identifier of the target user from the user database of the target organization; inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of users of the enabling organization; inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting the risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
  • the electronic equipment in this specification does not exclude other implementations, such as logic devices or a combination of software and hardware, etc. That is to say, the execution body of the following processing flow is not limited to each logic unit. It can also be a hardware or logic device.
  • Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present specification. Please refer to FIG. 8.
  • the electronic device includes a processor, and optionally an internal bus, a network interface, and a memory.
  • the memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk storage.
  • the electronic device may also include hardware required by other services.
  • the processor, network interface, and memory can be connected to each other through an internal bus.
  • the internal bus can be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, etc.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one bidirectional arrow is used in FIG. 8, but it does not mean that there is only one bus or one type of bus.
  • the program may include program code, and the program code includes computer operation instructions.
  • the memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.
  • the processor reads the corresponding computer program from the non-volatile memory to the memory and then runs it to form a training device for the risk prediction model on the logical level.
  • the processor executes the program stored in the memory, and is specifically configured to perform the following operations: acquiring private data of users of the enabling organization and private data of target group users, where the target group users are common users of the enabling organization and the target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users; obtaining a first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization; and obtaining a second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; where the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.
  • the method performed by the risk prediction model training device disclosed in the embodiments shown in FIG. 3, FIG. 4a, and FIG. 4b of this specification can be applied to a processor or implemented by a processor.
  • the processor may be an integrated circuit chip with signal processing capabilities.
  • each step of the above method can be completed by an integrated logic circuit of hardware in the processor or instructions in the form of software.
  • the above-mentioned processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of this specification can be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
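  • the electronic device can also execute the methods of FIG. 3, FIG. 4a, and FIG. 4b, and implement the functions of the training device for the risk prediction model in the embodiments shown in FIG. 3, FIG. 4a, and FIG. 4b, which will not be repeated here.
  • the embodiments of this specification also propose a computer-readable storage medium that stores one or more programs, the one or more programs including instructions which, when executed by a portable electronic device that includes multiple application programs, enable the portable electronic device to execute the method of the embodiments shown in FIG. 3, FIG. 4a, and FIG. 4b, and specifically to perform the following operations: acquiring private data of users of the enabling organization and private data of target group users, where the target group users are common users of the enabling organization and the target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users; obtaining a first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization; and obtaining a second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; where the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.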
  • a typical implementation device is a computer.
  • the computer may be, for example, a personal computer, a laptop computer, a cell phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or any combination of these devices.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be realized by any method or technology.
  • the information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

Abstract

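A risk prediction method, a training method for a risk prediction model, corresponding apparatuses, and an electronic device. The method includes: acquiring first private data and second private data corresponding to a user identifier of a target user (S110); inputting the first private data into a first risk level prediction model to predict a first risk level of the target user (S120); inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user (S130), where the second risk level prediction model is obtained through longitudinal federated learning training based on private data of target group users and corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting the risk level of the target user based on the first risk level and the second risk level (S140).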

Description

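Risk prediction method, training method for a risk prediction model, apparatuses, and electronic device
Technical Field
This document relates to the field of computer software technology, and in particular to a risk prediction method, a training method for a risk prediction model, corresponding apparatuses, and an electronic device.
Background
At present, an enterprise with mature risk identification capabilities, when empowering external parties, often wishes to help other organizations such as banks and independent software vendors (ISVs) accurately identify the risk level of users or merchants in risk scenarios. Existing solutions typically use the big data of such an enterprise to train a single risk prediction model over all of its users and merchants, and output that model's identification results for another organization's users or merchants to that organization.
However, because this approach lacks the other organization's own individualized data, its identification accuracy for that organization's users or merchants may be low. How to make full use of existing data to train a risk prediction model with higher identification accuracy therefore still requires a further solution.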
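Summary
The purpose of the embodiments of this specification is to provide a risk prediction method, a training method for a risk prediction model, corresponding apparatuses, and an electronic device, so as to improve the identification accuracy of the risk prediction model.
To solve the above technical problem, the embodiments of this specification are implemented through the following aspects.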
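In a first aspect, a risk prediction method is provided, including: acquiring first private data corresponding to a user identifier of a target user from a user database of an enabling organization, and acquiring second private data corresponding to the user identifier of the target user from a user database of a target organization; inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on private data of users of the enabling organization; inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on private data of target group users and corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting a risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.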
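In a second aspect, a training method for a risk prediction model is provided, including: acquiring private data of users of an enabling organization and private data of target group users, where the target group users are common users of the enabling organization and a target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users; obtaining a first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization; and obtaining a second risk prediction model through longitudinal federated learning training based on the private data of the target group users and corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; where the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.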
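In a third aspect, a risk prediction apparatus is provided, including: an acquiring unit, which acquires first private data corresponding to a user identifier of a target user from a user database of an enabling organization, and acquires second private data corresponding to the user identifier of the target user from a user database of a target organization; a first prediction unit, which inputs the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on private data of users of the enabling organization; a second prediction unit, which inputs the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on private data of target group users and corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and a third prediction unit, which predicts a risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.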
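In a fourth aspect, a training apparatus for a risk prediction model is provided, including: a data acquiring unit, which acquires private data of users of an enabling organization and private data of target group users, where the target group users are common users of the enabling organization and a target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users; a first training unit, which obtains a first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization; and a second training unit, which obtains a second risk prediction model through longitudinal federated learning training based on the private data of the target group users and corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; where the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.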
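In a fifth aspect, an electronic device is provided, including: a processor; and a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the following operations: acquiring first private data corresponding to a user identifier of a target user from a user database of an enabling organization, and acquiring second private data corresponding to the user identifier of the target user from a user database of a target organization; inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on private data of users of the enabling organization; inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on private data of target group users and corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting a risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.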
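In a sixth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores one or more programs which, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations: acquiring first private data corresponding to a user identifier of a target user from a user database of an enabling organization, and acquiring second private data corresponding to the user identifier of the target user from a user database of a target organization; inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on private data of users of the enabling organization; inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on private data of target group users and corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting a risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.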
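In a seventh aspect, an electronic device is provided, including: a processor; and a memory arranged to store computer-executable instructions which, when executed, cause the processor to perform the following operations: acquiring private data corresponding to a user identifier of a target user; inputting the private data corresponding to the user identifier of the target user into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on private data of users of an enabling organization; inputting the private data corresponding to the user identifier of the target user into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on private data of target group users and corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting a risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.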
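In an eighth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores one or more programs which, when executed by an electronic device that includes multiple application programs, cause the electronic device to perform the following operations: acquiring private data corresponding to a user identifier of a target user; inputting the private data corresponding to the user identifier of the target user into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on private data of users of an enabling organization; inputting the private data corresponding to the user identifier of the target user into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on private data of target group users and corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting a risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.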
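As can be seen from the technical solutions provided by the above embodiments of this specification, the solutions have at least the following technical effect. One or more embodiments provided in this specification can predict the risk level of a target user with a first risk prediction model obtained through isomorphic migration training based on the private data of users of the enabling organization; can obtain a second risk prediction model through longitudinal federated learning training based on the private data of the target group users common to the enabling organization and the target organization and the corresponding labels, and use it to make a second prediction of the target user's risk level; and can combine the two prediction results to determine the target user's risk level. Because the private data of the users of the enabling organization and the private data of the target group users common to both organizations are fully used, and the resulting first and second risk prediction models jointly predict the target user's risk level, the prediction accuracy of the risk level is improved.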
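Brief Description of the Drawings
To describe the technical solutions in the embodiments of this specification more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some of the embodiments recorded in this specification; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of an implementation of a risk prediction method provided by an embodiment of this specification.
FIG. 2 is a schematic diagram of the risk prediction method provided by an embodiment of this specification applied in a practical scenario.
FIG. 3 is a schematic flowchart of a training method for a risk prediction model provided by an embodiment of this specification.
FIG. 4a is a schematic diagram of the training process of one model in the training method for a risk prediction model provided by an embodiment of this specification.
FIG. 4b is a schematic diagram of the training process of one model in the training method for a risk prediction model provided by another embodiment of this specification.
FIG. 5 is a schematic structural diagram of a risk prediction apparatus provided by an embodiment of this specification.
FIG. 6 is a schematic structural diagram of a training device for a risk prediction model provided by an embodiment of this specification.
FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of this specification.
FIG. 8 is a schematic structural diagram of another electronic device provided by an embodiment of this specification.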
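Detailed Description
To make the objectives, technical solutions, and advantages of this specification clearer, the technical solutions are described clearly and completely below with reference to specific embodiments of this specification and the corresponding drawings. Obviously, the described embodiments are only some, not all, of the embodiments of this document. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this document without creative effort fall within the scope of protection of this document.
The technical solutions provided by the embodiments of this specification are described in detail below with reference to the drawings.
To improve the identification accuracy of the risk prediction model, one or more embodiments of this specification provide a risk prediction method. The method can predict the risk level of a target user with a first risk prediction model obtained through isomorphic migration training based on the private data of users of the enabling organization; can obtain a second risk prediction model through longitudinal federated learning training (also commonly called vertical federated learning) based on the private data of the target group users common to the enabling organization and the target organization and the corresponding labels, and use it to make a second prediction of the target user's risk level; and can combine the two prediction results to determine the target user's risk level.
Because the private data of the users of the enabling organization and the private data of the target group users common to both organizations are fully used, because the second risk prediction model is trained with the fitting error of the first risk prediction model as its target, and because the prediction results of the two models are finally combined, the prediction accuracy of the user's risk level is greatly improved.
It should be understood that the execution subject of the risk prediction method provided by the embodiments of this specification may be, but is not limited to, at least one of a server or other apparatus that can be configured to execute the method.
For ease of description, the implementation of the method is introduced below taking a server capable of executing the method as the execution subject. It can be understood that using a server as the execution subject is only an illustrative example and should not be understood as a limitation on the method.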
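FIG. 1 is a schematic flowchart of an implementation of a risk prediction method provided by an embodiment of this specification. The method of FIG. 1 may include steps S110 to S140.
S110: acquire first private data corresponding to the user identifier of the target user from the user database of the enabling organization, and acquire second private data corresponding to the user identifier of the target user from the user database of the target organization.
It should be understood that the enabling organization and the target organization may have a direct cooperative relationship or an indirect one (that is, one established through an intermediary organization), and that the first and second risk prediction models described below both serve the target organization. Because the target organization and the enabling organization often share some common users, when predicting the risk level of such a user, the first private data corresponding to the user identifier of the target user can be acquired from the user database of the enabling organization, and the second private data corresponding to the same user identifier can be acquired from the user database of the target organization.
The first private data and the second private data may include, for example, the target user's transaction data, identity data, account data, registration data, occupation, age, and income.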
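The lookup in S110 can be pictured as two keyed reads that share one user identifier. The following is a minimal sketch, assuming each organization's user database can be modeled as an in-memory dictionary keyed by user identifier; the function name `fetch_private_data`, the field names, and the sample records are hypothetical and do not come from the specification.

```python
from typing import Dict, Optional, Tuple

# A private-data record is modeled as a flat mapping of feature name to value.
Record = Dict[str, float]


def fetch_private_data(
    user_id: str,
    enabling_db: Dict[str, Record],
    target_db: Dict[str, Record],
) -> Tuple[Optional[Record], Optional[Record]]:
    """Return (first_private_data, second_private_data) for one user identifier.

    Either element is None when that organization holds no record for the user.
    """
    first = enabling_db.get(user_id)   # record held by the enabling organization
    second = target_db.get(user_id)    # record held by the target organization
    return first, second


# Hypothetical sample databases (illustrative values only).
enabling_db = {"u1": {"txn_count": 12.0, "age": 31.0}}
target_db = {"u1": {"income": 5.5}, "u2": {"income": 3.0}}

first, second = fetch_private_data("u1", enabling_db, target_db)
```

In a real deployment each read would stay inside its own organization's environment; the sketch only shows that both records are selected by the same identifier.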
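S120: input the first private data into the first risk level prediction model to predict the first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of users of the enabling organization.
The first risk level may specifically be a risk score in the range [0, 1].
S130: input the first private data and the second private data into the second risk level prediction model to predict the second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users.
The target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
It should be noted that the second risk level prediction model may be trained with the fitting error of the first risk level prediction model as its prediction target, where the fitting error of the first risk level prediction model is error = (true value Y) - (predicted value Y1).
S140: predict the risk level of the target user based on the first risk level and the second risk level.
Optionally, to better fuse the prediction results of the first risk prediction model and the second risk prediction model, one or more embodiments of this specification may fuse them through an additive model. Specifically, determining the risk level of the target user based on the first risk level and the second risk level includes:
determining the risk level of the target user based on the first risk level and the second risk level through an additive model.
An additive model is the sum of multiple base models. In the embodiments of this specification, the additive model is the sum of the first risk prediction model and the second risk prediction model, that is, the predicted value is F(x) = f1(x) + f2(x), where f1(x) is the prediction result of the first risk prediction model (the first risk level) and f2(x) is the prediction result of the second risk prediction model (the second risk level). Specifically, if the first risk prediction model predicts f1(x), then the prediction target of the second risk prediction model is Y - f1(x), where Y is the true value; the predicted value obtained through the additive model is then F(x) = f1(x) + f2(x) = Y, so the range of the predicted value is still [0, 1].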
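The additive-model argument above can be written out in one line. The notation follows the text (f1, f2, Y, F); the symbol e for the fitting error and the use of an approximation sign are added here as assumptions, since a trained second model generally only approximates the fitting error rather than reproducing it exactly.

```latex
\begin{aligned}
e &= Y - f_1(x), \\
f_2(x) &\approx e, \\
F(x) &= f_1(x) + f_2(x) \approx f_1(x) + \bigl(Y - f_1(x)\bigr) = Y .
\end{aligned}
```

The statement that F(x) stays within [0, 1] holds exactly in the ideal case where f2(x) equals e; with an imperfect second model the sum can deviate slightly from Y.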
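The implementation of the risk prediction method is described in detail below with reference to FIG. 2, a schematic diagram of the method applied in a practical scenario. It includes steps S21 to S26.
S21: obtain the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization, where the users of the enabling organization include some users of the target organization.
It should be understood that, to make full use of the private data of the users of the enabling organization, the private data referred to here may specifically be the private data of all users of the enabling organization.
S22: acquire the first private data corresponding to the user ID of the target user from the user database of the enabling organization, and input the first private data into the first risk prediction model, so that the first risk prediction model predicts the risk level of the target user and outputs the predicted value Y1 of the target user's first risk level.
S23: obtain the fitting error, error, of the first risk prediction model, that is, the difference between the target user's true risk level value Y and Y1: error = Y - Y1.
S24: obtain the second risk prediction model through longitudinal federated learning training based on the private data of the target group users, with the fitting error, error, of the first risk prediction model as the corresponding label.
The target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization. The second risk prediction model is trained with the fitting error, error, of the first risk prediction model as its prediction target.
S25: acquire the second private data corresponding to the ID of the target user from the user database of the target organization, and input the first private data acquired in S22 together with the second private data into the second risk prediction model, so that the second risk prediction model predicts the risk level of the target user and outputs the predicted value Y2 of the target user's second risk level.
S26: obtain the risk level of the target user based on the additive model, and output the predicted value Y1 + Y2 of the target user's risk level.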
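The prediction-time steps S22, S25, and S26 can be sketched as follows. This is only an illustration under stated assumptions: the two models are passed in as plain callables, and `toy_first_model`, `toy_second_model`, the feature names, and the coefficients are hypothetical stand-ins, not the trained models described in the specification.

```python
from typing import Callable, Dict, Tuple

Record = Dict[str, float]


def predict_risk(
    first_private: Record,
    second_private: Record,
    first_model: Callable[[Record], float],
    second_model: Callable[[Record, Record], float],
) -> Tuple[float, float, float]:
    """Return (Y1, Y2, Y1 + Y2) for one target user."""
    y1 = first_model(first_private)                    # S22: first risk level
    y2 = second_model(first_private, second_private)   # S25: second risk level
    return y1, y2, y1 + y2                             # S26: additive fusion


# Hypothetical stand-in models with made-up coefficients.
def toy_first_model(first_private: Record) -> float:
    return 0.1 + 0.05 * first_private["txn_risk"]


def toy_second_model(first_private: Record, second_private: Record) -> float:
    # Plays the role of the model trained on the fitting error (S24).
    return 0.02 * second_private["overdue_count"]


y1, y2, risk = predict_risk(
    {"txn_risk": 4.0}, {"overdue_count": 5.0}, toy_first_model, toy_second_model
)
```

The second model receives both records because S25 feeds the first and second private data into it together; the toy version simply ignores the first record.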
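One or more embodiments provided in this specification can predict the risk level of a target user with a first risk prediction model obtained through isomorphic migration training based on the private data of users of the enabling organization; can obtain a second risk prediction model through longitudinal federated learning training based on the private data of the target group users common to the enabling organization and the target organization and the corresponding labels, and use it to make a second prediction of the target user's risk level; and can combine the two prediction results to determine the target user's risk level. Because the private data of the users of the enabling organization and the private data of the target group users common to both organizations are fully used, and the resulting first and second risk prediction models jointly predict the target user's risk level, the prediction accuracy of the risk level is improved.
FIG. 3 is a schematic flowchart of an implementation of a training method for a risk prediction model provided by an embodiment of this specification, including steps S310 to S330.
S310: acquire private data of users of the enabling organization and private data of target group users, where the target group users are common users of the enabling organization and the target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users.
Here, the enabling organization wishes to jointly use the private data of its own users and the private data of the target organization, while protecting the private data of both organizations, to complete the training of the first and second risk prediction models. To this end, the embodiments of this specification train the first risk prediction model through isomorphic migration and the second risk prediction model through longitudinal federated learning, and then combine the two models to predict a user's risk level.
Isomorphic migration needs only the private data of all of the enabling organization's own users, combined with the private data that the target group users common to both organizations hold in the enabling organization, to adapt the model once to the target organization and obtain the first risk prediction model. Longitudinal federated learning then uses the target group users' private data in the enabling organization and in the target organization to train the second risk prediction model. This makes full use of the private data of all users of the enabling organization and of the target group users' private data that the target organization can provide, improving the accuracy of risk prediction.
FIG. 4a and FIG. 4b are schematic diagrams of model training through isomorphic migration and longitudinal federated learning provided by the embodiments of this specification. In FIG. 4a, the gray area represents all user IDs held by the enabling organization and the corresponding private data (the source domain plus target domain portions in the figure). The private data of the users in the target domain is the private data, within the enabling organization, of the users common to the enabling organization and the target organization; in other words, it is the portion of data where the enabling organization and the target organization overlap.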
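The source/target partition described for FIG. 4a can be sketched with set operations on user IDs. This is a minimal sketch under two assumptions that the text does not settle: that the source domain is the enabling organization's users outside the overlap, and that the overlap may be computed by a plain set intersection. The function name `split_domains` and the sample IDs are hypothetical. In practice the two organizations would need a privacy-preserving way to find their common users, which this sketch does not attempt.

```python
from typing import Set, Tuple


def split_domains(
    enabling_ids: Set[str], target_org_ids: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Split the enabling organization's user IDs into (source, target) domains.

    The target domain is the set of users common to both organizations;
    the source domain is assumed here to be the remaining enabling-organization users.
    """
    target_domain = enabling_ids & target_org_ids   # common (target group) users
    source_domain = enabling_ids - target_domain    # enabling-only users (assumption)
    return source_domain, target_domain


# Hypothetical sample IDs.
source_domain, target_domain = split_domains({"u1", "u2", "u3"}, {"u2", "u3", "u9"})
```

Only IDs held by the enabling organization appear in either domain, which matches the statement that the gray area covers the source domain plus the target domain.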
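S320: obtain the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization.
It should be understood that, to obtain a first risk prediction model adapted to the target organization, one or more embodiments of this specification should first acquire the private data, within the enabling organization, of the users common to the enabling organization and the target organization, that is, the target group users. Specifically, obtaining the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization includes: acquiring the private data of the target group users in the enabling organization; and obtaining the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization and the private data of the target group users in the enabling organization.
FIG. 4a is a schematic diagram of obtaining the first risk prediction model through isomorphic migration training provided by an embodiment of this specification. The process is as follows. First, a neural-network-type model is trained on the private data of the users in the source domain; the embodiments of this specification do not limit the specific training method. Then, for each layer of the model network, the mean μ1 and standard deviation σ1 of that layer's output for the private data of the users in the source domain are computed, along with the mean μ2 and standard deviation σ2 of that layer's output for the private data of the users in the target domain. The trained model is then used to predict on the private data of the users in the target domain, giving a predicted value U, and the distribution of this predicted value is unified to give the predicted value [(U-μ2)/σ2]*σ1+μ1. This unifies the range of the first risk prediction model's prediction results for the private data of users in the source domain and in the target domain.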
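The rescaling [(U-μ2)/σ2]*σ1+μ1 can be illustrated on a single list of output values. The following is a sketch under stated assumptions: it applies the formula to one set of outputs rather than layer by layer through a network, it uses the population standard deviation (the text does not say which estimator is intended), and the function name `align_to_source` and the sample values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def align_to_source(
    target_outputs: Sequence[float], source_outputs: Sequence[float]
) -> List[float]:
    """Map target-domain outputs U onto the source-domain mean and spread.

    Implements ((U - mu2) / sigma2) * sigma1 + mu1, where (mu1, sigma1) are the
    source-domain statistics and (mu2, sigma2) the target-domain statistics.
    """
    mu1, sigma1 = mean(source_outputs), pstdev(source_outputs)
    mu2, sigma2 = mean(target_outputs), pstdev(target_outputs)
    if sigma2 == 0:
        raise ValueError("target-domain outputs have zero variance")
    return [((u - mu2) / sigma2) * sigma1 + mu1 for u in target_outputs]


# Hypothetical outputs: the target domain is shifted upward by 0.3.
source_outputs = [0.2, 0.4, 0.6]
target_outputs = [0.5, 0.7, 0.9]
aligned = align_to_source(target_outputs, source_outputs)
```

After alignment the target-domain values share the source-domain mean and standard deviation, which is the sense in which the prediction ranges of the two domains are unified.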
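S330: obtain the second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users.
The first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.
It should be understood that, to improve the prediction accuracy of the risk prediction model and thereby better serve the target organization, one or more embodiments of this specification may also obtain a second risk prediction model through longitudinal federated learning training. Specifically, obtaining the second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels includes: obtaining the fitting error of the first risk prediction model based on the predicted value of the first risk prediction model on test data and the true value corresponding to the test data; and obtaining the second risk prediction model through longitudinal federated learning training based on the private data of the target group users, until the predicted value of the second risk prediction model approaches the fitting error of the first risk prediction model.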
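The idea of using the first model's fitting error as the label for the second model can be shown with a deliberately simple stand-in. Note the substitution: the sketch fits a one-feature least-squares line on centrally held data, which is not longitudinal federated learning and offers none of its privacy properties; it only illustrates how the labels are formed and what "approaches the fitting error" means. The function names and sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def fitting_errors(y_true: Sequence[float], y1: Sequence[float]) -> List[float]:
    """Label for the second model: error = Y - Y1 for each user."""
    return [y - p for y, p in zip(y_true, y1)]


def fit_line(xs: Sequence[float], targets: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for targets ~ slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_t = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has zero variance")
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(xs, targets))
    slope = sxt / sxx
    return slope, mean_t - slope * mean_x


# Hypothetical data for three target group users.
y_true = [0.2, 0.5, 0.9]         # true risk values Y
y1 = [0.1, 0.3, 0.6]             # first-model predictions Y1
target_feature = [1.0, 2.0, 3.0] # one feature held by the target organization

labels = fitting_errors(y_true, y1)               # labels for the second model
slope, intercept = fit_line(target_feature, labels)
y2 = [slope * x + intercept for x in target_feature]
combined = [a + b for a, b in zip(y1, y2)]        # additive fusion Y1 + Y2
```

In this toy data the fitting error is exactly linear in the target-organization feature, so the combined prediction recovers the true values; real data would leave a residual.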
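FIG. 4b is a schematic diagram of obtaining the second risk prediction model through longitudinal federated learning training provided by an embodiment of this specification. The second risk prediction model is obtained through longitudinal federated learning training based on the private data, in the enabling organization and in the target organization, of the users common to both organizations (the target group users), under the premise that the enabling organization cannot learn or infer the target organization's data and the target organization cannot learn or infer the enabling organization's data.
One or more embodiments provided in this specification can predict the risk level of a user with a first risk prediction model obtained through isomorphic migration training based on the private data of users of the enabling organization; can obtain a second risk prediction model through longitudinal federated learning training based on the private data of the target group users common to the enabling organization and the target organization and the corresponding labels, and use it to make a second prediction of the user's risk level; and can combine the two prediction results to determine the user's risk level. Because the private data of the users of the enabling organization and the private data of the target group users common to both organizations are fully used, and the resulting first and second risk prediction models jointly predict the user's risk level, the prediction accuracy of the risk level is improved.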
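FIG. 5 is a schematic structural diagram of a risk prediction apparatus 500 provided by an embodiment of this specification. Referring to FIG. 5, in a software implementation, the risk prediction apparatus 500 may include the following units.
The acquiring unit 501 acquires first private data corresponding to the user identifier of the target user from the user database of the enabling organization, and acquires second private data corresponding to the user identifier of the target user from the user database of the target organization.
The first prediction unit 502 inputs the first private data into the first risk level prediction model to predict the first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of users of the enabling organization.
The second prediction unit 503 inputs the first private data and the second private data into the second risk level prediction model to predict the second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users.
The third prediction unit 504 predicts the risk level of the target user based on the first risk level and the second risk level.
The target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
Optionally, in one implementation, the third prediction unit 504 is configured to determine the risk level of the target user based on the first risk level and the second risk level through an additive model.
The risk prediction apparatus 500 can implement the methods of the method embodiments of FIGS. 1 to 2. For details, refer to the risk prediction method of the embodiments shown in FIGS. 1 to 2, which is not repeated here.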
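FIG. 6 is a schematic structural diagram of a training device 600 for a risk prediction model provided by an embodiment of this specification. Referring to FIG. 6, in a software implementation, the training device 600 for a risk prediction model may include the following units.
The data acquiring unit 601 acquires private data of users of the enabling organization and private data of target group users, where the target group users are common users of the enabling organization and the target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users.
The first training unit 602 obtains the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization.
The second training unit 603 obtains the second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users.
The first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.
Optionally, in one implementation, the first training unit 602 is configured to: acquire the private data of the target group users in the enabling organization; and obtain the first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization and the private data of the target group users in the enabling organization.
Optionally, in one implementation, the second training unit 603 is configured to: obtain the fitting error of the first risk prediction model based on the predicted value of the first risk prediction model on test data and the true value corresponding to the test data; and obtain the second risk prediction model through longitudinal federated learning training based on the private data of the target group users, until the predicted value of the second risk prediction model approaches the fitting error of the first risk prediction model.
The training device 600 for a risk prediction model can implement the methods of the method embodiments of FIG. 3, FIG. 4a, and FIG. 4b. For details, refer to the training method of the risk prediction model of the embodiments shown in FIG. 3, FIG. 4a, and FIG. 4b, which is not repeated here.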
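FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of this specification. Referring to FIG. 7, at the hardware level, the electronic device includes a processor, and optionally also an internal bus, a network interface, and a memory. The memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk storage. Of course, the electronic device may also include hardware required by other services.
The processor, the network interface, and the memory may be connected to each other through the internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one bidirectional arrow is used in FIG. 7, but this does not mean that there is only one bus or one type of bus.
The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.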
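The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the risk prediction apparatus at the logical level. The processor executes the program stored in the memory and is specifically configured to perform the following operations: acquiring first private data corresponding to the user identifier of the target user from the user database of the enabling organization, and acquiring second private data corresponding to the user identifier of the target user from the user database of the target organization; inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of users of the enabling organization; inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting the risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.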
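The method performed by the risk prediction apparatus disclosed in the embodiments shown in FIGS. 1 to 2 of this specification may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this specification. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of this specification may be directly embodied as being executed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device can also execute the methods of FIGS. 1 to 2 and implement the functions of the risk prediction apparatus in the embodiments shown in FIGS. 1 to 2, which are not repeated here.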
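The embodiments of this specification also propose a computer-readable storage medium that stores one or more programs, the one or more programs including instructions which, when executed by a portable electronic device that includes multiple application programs, enable the portable electronic device to execute the method of the embodiments shown in FIGS. 1 to 2, and specifically to perform the following operations: acquiring first private data corresponding to the user identifier of the target user from the user database of the enabling organization, and acquiring second private data corresponding to the user identifier of the target user from the user database of the target organization; inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, where the first risk level prediction model is obtained through isomorphic migration training based on the private data of users of the enabling organization; inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, where the second risk level prediction model is obtained through longitudinal federated learning training based on the private data of target group users and the corresponding labels, and the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; and predicting the risk level of the target user based on the first risk level and the second risk level; where the target group users are common users of the enabling organization and the target organization, and the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization.
Of course, in addition to a software implementation, the electronic device of this specification does not exclude other implementations, such as logic devices or a combination of software and hardware. That is, the execution subject of the above processing flow is not limited to the logical units and may also be hardware or a logic device.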
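FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of this specification. Referring to FIG. 8, at the hardware level, the electronic device includes a processor, and optionally also an internal bus, a network interface, and a memory. The memory may include internal memory, such as high-speed random-access memory (RAM), and may also include non-volatile memory, such as at least one disk storage. Of course, the electronic device may also include hardware required by other services.
The processor, the network interface, and the memory may be connected to each other through the internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one bidirectional arrow is used in FIG. 8, but this does not mean that there is only one bus or one type of bus.
The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operation instructions. The memory may include internal memory and non-volatile memory, and provides instructions and data to the processor.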
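The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the training device for the risk prediction model at the logical level. The processor executes the program stored in the memory and is specifically configured to perform the following operations: acquiring private data of users of the enabling organization and private data of target group users, where the target group users are common users of the enabling organization and the target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users; obtaining a first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization; and obtaining a second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; where the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.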
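The method performed by the training device for the risk prediction model disclosed in the embodiments shown in FIG. 3, FIG. 4a, and FIG. 4b of this specification may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip with signal processing capabilities. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of this specification. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in the embodiments of this specification may be directly embodied as being executed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
The electronic device can also execute the methods of FIG. 3, FIG. 4a, and FIG. 4b and implement the functions of the training device for the risk prediction model in the embodiments shown in FIG. 3, FIG. 4a, and FIG. 4b, which are not repeated here.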
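The embodiments of this specification also propose a computer-readable storage medium that stores one or more programs, the one or more programs including instructions which, when executed by a portable electronic device that includes multiple application programs, enable the portable electronic device to execute the method of the embodiments shown in FIG. 3, FIG. 4a, and FIG. 4b, and specifically to perform the following operations: acquiring private data of users of the enabling organization and private data of target group users, where the target group users are common users of the enabling organization and the target organization, the private data of the target group users includes the private data of the target group users in the enabling organization and the private data of the target group users in the target organization, and the users of the enabling organization include the target group users; obtaining a first risk prediction model through isomorphic migration training based on the private data of the users of the enabling organization; and obtaining a second risk prediction model through longitudinal federated learning training based on the private data of the target group users and the corresponding labels, where the label corresponding to the private data of the target group users is the fitting error of the first risk prediction model for the target group users; where the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.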
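Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired results. In some implementations, multitasking and parallel processing are also possible or may be advantageous.
In short, the above are merely preferred embodiments of this specification and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of this specification shall fall within its scope of protection.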
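The systems, apparatuses, modules, or units described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, a cellular phone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and information storage can be implemented by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.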
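It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, commodity, or device that includes the element.
The embodiments in this specification are described in a progressive manner; for the same or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is basically similar to the method embodiment, its description is relatively brief, and for relevant details reference may be made to the corresponding description of the method embodiment.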

Claims (11)

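  1. A risk prediction method, comprising:
    acquiring, from a user database of an enabling institution, first private data corresponding to a user identifier of a target user, and acquiring, from a user database of a target institution, second private data corresponding to the user identifier of the target user;
    inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, the first risk level prediction model being obtained through homogeneous transfer training based on private data of users of the enabling institution;
    inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, the second risk level prediction model being obtained through vertical federated learning training based on private data of target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    predicting the risk level of the target user based on the first risk level and the second risk level;
    wherein the target-group users are common users of the enabling institution and the target institution, and the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution.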
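  2. The method of claim 1, wherein determining the risk level of the target user based on the first risk level and the second risk level comprises:
    determining the risk level of the target user through an additive model based on the first risk level and the second risk level.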
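  3. A method for training a risk prediction model, comprising:
    acquiring private data of users of an enabling institution and private data of target-group users, wherein the target-group users are common users of the enabling institution and a target institution, the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution, and the users of the enabling institution include the target-group users;
    obtaining a first risk prediction model through homogeneous transfer training based on the private data of the users of the enabling institution;
    obtaining a second risk prediction model through vertical federated learning training based on the private data of the target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    wherein the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.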
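  4. The method of claim 3, wherein obtaining the first risk prediction model through homogeneous transfer training based on the private data of the users of the enabling institution comprises:
    acquiring the private data of the target-group users in the enabling institution;
    obtaining the first risk prediction model through homogeneous transfer training based on the private data of the users of the enabling institution and the private data of the target-group users in the enabling institution.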
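  5. The method of claim 4, wherein obtaining the second risk prediction model through vertical federated learning training based on the private data of the target-group users and the corresponding labels comprises:
    obtaining the fitting error of the first risk prediction model based on predicted values of the first risk prediction model for test data and true values corresponding to the test data;
    training the second risk prediction model through vertical federated learning based on the private data of the target-group users until the predicted values of the second risk prediction model approximate the fitting error of the first risk prediction model.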
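  6. A risk prediction apparatus, comprising:
    an acquisition unit that acquires, from a user database of an enabling institution, first private data corresponding to a user identifier of a target user, and acquires, from a user database of a target institution, second private data corresponding to the user identifier of the target user;
    a first prediction unit that inputs the first private data into a first risk level prediction model to predict a first risk level of the target user, the first risk level prediction model being obtained through homogeneous transfer training based on private data of users of the enabling institution;
    a second prediction unit that inputs the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, the second risk level prediction model being obtained through vertical federated learning training based on private data of target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    a third prediction unit that predicts the risk level of the target user based on the first risk level and the second risk level;
    wherein the target-group users are common users of the enabling institution and the target institution, and the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution.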
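  7. An apparatus for training a risk prediction model, comprising:
    a data acquisition unit that acquires private data of users of an enabling institution and private data of target-group users, wherein the target-group users are common users of the enabling institution and a target institution, the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution, and the users of the enabling institution include the target-group users;
    a first training unit that obtains a first risk prediction model through homogeneous transfer training based on the private data of the users of the enabling institution;
    a second training unit that obtains a second risk prediction model through vertical federated learning training based on the private data of the target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    wherein the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.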
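  8. An electronic device, comprising:
    a processor; and
    a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
    acquiring, from a user database of an enabling institution, first private data corresponding to a user identifier of a target user, and acquiring, from a user database of a target institution, second private data corresponding to the user identifier of the target user;
    inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, the first risk level prediction model being obtained through homogeneous transfer training based on private data of users of the enabling institution;
    inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, the second risk level prediction model being obtained through vertical federated learning training based on private data of target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    predicting the risk level of the target user based on the first risk level and the second risk level;
    wherein the target-group users are common users of the enabling institution and the target institution, and the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution.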
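  9. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including multiple application programs, cause the electronic device to perform the following operations:
    acquiring, from a user database of an enabling institution, first private data corresponding to a user identifier of a target user, and acquiring, from a user database of a target institution, second private data corresponding to the user identifier of the target user;
    inputting the first private data into a first risk level prediction model to predict a first risk level of the target user, the first risk level prediction model being obtained through homogeneous transfer training based on private data of users of the enabling institution;
    inputting the first private data and the second private data into a second risk level prediction model to predict a second risk level of the target user, the second risk level prediction model being obtained through vertical federated learning training based on private data of target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    predicting the risk level of the target user based on the first risk level and the second risk level;
    wherein the target-group users are common users of the enabling institution and the target institution, and the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution.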
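  10. An electronic device, comprising:
    a processor; and
    a memory arranged to store computer-executable instructions that, when executed, cause the processor to perform the following operations:
    acquiring private data of users of an enabling institution and private data of target-group users, wherein the target-group users are common users of the enabling institution and a target institution, the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution, and the users of the enabling institution include the target-group users;
    obtaining a first risk prediction model through homogeneous transfer training based on the private data of the users of the enabling institution;
    obtaining a second risk prediction model through vertical federated learning training based on the private data of the target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    wherein the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.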
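  11. A computer-readable storage medium storing one or more programs that, when executed by an electronic device including multiple application programs, cause the electronic device to perform the following operations:
    acquiring private data of users of an enabling institution and private data of target-group users, wherein the target-group users are common users of the enabling institution and a target institution, the private data of the target-group users comprises the private data of the target-group users in the enabling institution and the private data of the target-group users in the target institution, and the users of the enabling institution include the target-group users;
    obtaining a first risk prediction model through homogeneous transfer training based on the private data of the users of the enabling institution;
    obtaining a second risk prediction model through vertical federated learning training based on the private data of the target-group users and corresponding labels, the label corresponding to the private data of the target-group users being the fitting error of the target-group users under the first risk prediction model;
    wherein the first risk prediction model and the second risk prediction model are used to jointly identify the risk level of a user.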
PCT/CN2020/124718 2019-11-27 2020-10-29 Method, apparatus, and electronic device for risk prediction and for training a risk prediction model WO2021103909A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911179340.4 2019-11-27
CN201911179340.4A CN110956275B (zh) 2019-11-27 2019-11-27 Method, apparatus, and electronic device for risk prediction and for training a risk prediction model

Publications (1)

Publication Number Publication Date
WO2021103909A1 true WO2021103909A1 (zh) 2021-06-03

Family

ID=69976974

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124718 WO2021103909A1 (zh) 2019-11-27 2020-10-29 Method, apparatus, and electronic device for risk prediction and for training a risk prediction model

Country Status (3)

Country Link
CN (1) CN110956275B (zh)
TW (1) TWI764148B (zh)
WO (1) WO2021103909A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115018318A (zh) * 2022-06-01 2022-09-06 航天神舟智慧系统技术有限公司 一种社会区域风险预测分析方法与系统
CN115545216A (zh) * 2022-10-19 2022-12-30 上海零数众合信息科技有限公司 一种业务指标预测方法、装置、设备和存储介质

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110956275B (zh) * 2019-11-27 2021-04-02 支付宝(杭州)信息技术有限公司 风险预测和风险预测模型的训练方法、装置及电子设备
US11429301B2 (en) * 2020-04-22 2022-08-30 Dell Products L.P. Data contextual migration in an information handling system
CN111582565A (zh) * 2020-04-26 2020-08-25 支付宝(杭州)信息技术有限公司 数据融合方法、装置和电子设备
CN111738440B (zh) * 2020-07-31 2020-11-24 支付宝(杭州)信息技术有限公司 一种基于领域自适应与联邦学习的模型训练方法及系统
CN112309529B (zh) * 2020-11-02 2022-11-04 常州市第一人民医院 一种基于人工智能的感染控制管理方法及系统
CN113011632B (zh) * 2021-01-29 2023-04-07 招商银行股份有限公司 企业风险评估方法、装置、设备及计算机可读存储介质
CN113269431B (zh) * 2021-05-20 2023-12-05 深圳易财信息技术有限公司 库存风险预测方法、设备、介质及计算机程序产品

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040260703A1 (en) * 2003-06-20 2004-12-23 Elkins Debra A. Quantitative property loss risk model and decision analysis framework
CN108984613A (zh) * 2018-06-12 2018-12-11 北京航空航天大学 一种基于迁移学习的缺陷报告跨项目分类方法
CN109002861A (zh) * 2018-08-10 2018-12-14 深圳前海微众银行股份有限公司 联邦建模方法、设备及存储介质
CN110033120A (zh) * 2019-03-06 2019-07-19 阿里巴巴集团控股有限公司 用于为商户提供风险预测赋能服务的方法及装置
CN110245510A (zh) * 2019-06-19 2019-09-17 北京百度网讯科技有限公司 用于预测信息的方法和装置
CN110399742A (zh) * 2019-07-29 2019-11-01 深圳前海微众银行股份有限公司 一种联邦迁移学习模型的训练、预测方法及装置
CN110956275A (zh) * 2019-11-27 2020-04-03 支付宝(杭州)信息技术有限公司 风险预测和风险预测模型的训练方法、装置及电子设备

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10468137B1 (en) * 2013-06-14 2019-11-05 Cerner Innovation, Inc. Managing treatment of children with Tetralogy of Fallot
CN105321028B (zh) * 2014-08-05 2019-03-15 瑞昶科技股份有限公司 工厂环境风险筛检方法
CN106408282A (zh) * 2016-08-25 2017-02-15 深圳怡化电脑股份有限公司 一种自助终端的转账业务处理方法及系统
CN106530078A (zh) * 2016-11-29 2017-03-22 流量海科技成都有限公司 基于跨行业数据的贷款风险预警方法及系统
CN107450426B (zh) * 2017-08-01 2019-07-30 中国建筑第八工程局有限公司 基坑施工变形统计分析方法及系统
US20190057320A1 (en) * 2017-08-16 2019-02-21 ODH, Inc. Data processing apparatus for accessing shared memory in processing structured data for modifying a parameter vector data structure
US20190114546A1 (en) * 2017-10-12 2019-04-18 Nvidia Corporation Refining labeling of time-associated data
CN108520343B (zh) * 2018-03-26 2022-07-19 平安科技(深圳)有限公司 风险模型训练方法、风险识别方法、装置、设备及介质
CN108895532B (zh) * 2018-04-20 2020-06-30 太原理工大学 基于随机分布控制算法的区域供热节能控制方法
CN109255444B (zh) * 2018-08-10 2022-03-29 深圳前海微众银行股份有限公司 基于迁移学习的联邦建模方法、设备及可读存储介质
CN109165840B (zh) * 2018-08-20 2022-06-21 平安科技(深圳)有限公司 风险预测处理方法、装置、计算机设备和介质
CN109544166B (zh) * 2018-11-05 2023-05-30 创新先进技术有限公司 一种风险识别方法和装置
CN109598414B (zh) * 2018-11-13 2023-04-21 创新先进技术有限公司 风险评估模型训练、风险评估方法、装置及电子设备
CN110245954B (zh) * 2019-05-27 2023-06-27 创新先进技术有限公司 用于风险控制的方法和装置
CN110246032A (zh) * 2019-06-21 2019-09-17 深圳前海微众银行股份有限公司 贷后风险监控方法、装置及计算机可读存储介质
CN110348721A (zh) * 2019-06-29 2019-10-18 北京淇瑀信息科技有限公司 基于gbst的金融违约风险预测方法、装置和电子设备
CN110442457A (zh) * 2019-08-12 2019-11-12 北京大学深圳研究生院 基于联邦学习的模型训练方法、装置及服务器

Also Published As

Publication number Publication date
CN110956275A (zh) 2020-04-03
CN110956275B (zh) 2021-04-02
TWI764148B (zh) 2022-05-11
TW202121268A (zh) 2021-06-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20893532

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20893532

Country of ref document: EP

Kind code of ref document: A1