CN111583011A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Info

Publication number
CN111583011A
Authority
CN
China
Prior art keywords
event
sample data
data
user
user characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910124234.XA
Other languages
Chinese (zh)
Inventor
董健
常富洋
颜水成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qihoo Technology Co Ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd
Priority to CN201910124234.XA
Publication of CN111583011A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Evolutionary Biology (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • Algebra (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of this specification provide a data processing method, apparatus, device and storage medium. The method comprises: training an event overdue probability model using first user characteristic sample data; acquiring a plurality of second user characteristic sample data, and using the event overdue probability model to obtain the event overdue probability corresponding to each second user characteristic sample data; taking a plurality of second user characteristic sample data whose event overdue probability is greater than a set overdue probability threshold as third user characteristic sample data, and grouping the third user characteristic sample data; using a reinforcement learning model to obtain an estimated reward value and an estimated uncertainty value for each group of third user characteristic sample data; selecting the event overdue probability of the third user characteristic sample data for which the sum of the estimated reward value and the estimated uncertainty value is largest; and adjusting the event overdue probability threshold using the selected event overdue probability. The embodiments of the invention allow the overdue probability threshold to be adjusted accurately.

Description

Data processing method, device, equipment and storage medium
Technical Field
The embodiments of the present disclosure relate to the field of data processing technologies, and in particular, to a data processing method, an apparatus, a device, and a storage medium.
Background
In recent years, internet finance has developed rapidly. Lending is one of the core businesses of internet finance, and its most important decision is whether to grant a loan to a user. At present, the mainstream loan overdue model is usually obtained by supervised learning on users who already have a loan history, using their user characteristic data as samples. A model trained in this way has difficulty accurately evaluating users who were refused a loan or who have no loan history. As a result, the samples later used to optimize the model are all positive samples, the model over-converges, and its accuracy declines.
Disclosure of Invention
The embodiments of this specification provide a data processing method, apparatus, device and storage medium. By intelligently adjusting the event overdue probability threshold, they allow effective downward exploration of users (that is, exploring users who fall on the rejection side of the threshold), enrich the samples available for model optimization, and improve model accuracy.
In a first aspect, an embodiment of the present specification provides a data processing method, including:
acquiring a plurality of first user characteristic sample data, wherein the first user characteristic sample data are labeled sample data, and training by using the first user characteristic sample data to obtain an event overdue probability model;
acquiring a plurality of second user characteristic sample data, wherein the second user characteristic sample data is label-free sample data, and acquiring event overdue probability corresponding to each second user characteristic sample data by using an event overdue probability model;
acquiring, as third user characteristic sample data, a plurality of second user characteristic sample data whose event overdue probability is greater than a set overdue probability threshold;
respectively acquiring an estimated reward value and an estimated uncertainty value of each third user characteristic sample data by using a reinforcement learning model;
selecting the event overdue probability of the third user characteristic sample data for which the sum of the estimated reward value and the estimated uncertainty value is largest;
adjusting the event overdue probability threshold with the selected event overdue probability.
With reference to the first aspect, in a first implementation manner of the first aspect of the embodiments of the present invention, the reinforcement learning model includes a linear model and a contextual bandit, and the obtaining, by using the reinforcement learning model, the estimated reward value and the estimated uncertainty value of each third user characteristic sample data respectively includes:
respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model;
and respectively acquiring the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit.
With reference to the first implementation manner of the first aspect, in a second implementation manner of the first aspect of the embodiment of the present invention, the obtaining, by using the linear model, the estimated reward value corresponding to each third user characteristic sample data includes:
acquiring event characteristic data of each third user characteristic sample data;
and respectively acquiring the pre-estimated rewards corresponding to the feature sample data of each third user by using the linear model by taking the respective event feature data and the respective event overdue probability of the feature sample data of each third user as input values.
With reference to the second implementation manner of the first aspect, in a third implementation manner of the first aspect of the embodiment of the present invention, the event characteristic data includes at least one of:
the data of the area where the target object is located, the income data of the target object and the educational background data of the target object.
With reference to the first implementation manner of the first aspect, in a fourth implementation manner of the first aspect of the embodiment of the present invention, the respectively obtaining, by using the contextual bandit, estimated uncertainty values corresponding to each third user characteristic sample data includes:
acquiring event state data corresponding to each third user characteristic sample data;
and respectively acquiring the estimated uncertainty value corresponding to each adjustment coefficient interval by using the contextual bandit, taking the respective event state data and the respective event overdue probability of the third user characteristic sample data as input values.
With reference to the first aspect, the first implementation manner of the first aspect, the second implementation manner of the first aspect, the third implementation manner of the first aspect, or the fourth implementation manner of the first aspect, in a fifth implementation manner of the first aspect of an embodiment of the present invention, the first user characteristic sample data and/or the second user characteristic sample data are obtained by using a user portrait model, and the user portrait model is obtained by training using at least one of the following feature samples: People's Bank of China data, consumption data, call record data, geographic location data and app usage data.
With reference to the first aspect, the first implementation manner of the first aspect, the second implementation manner of the first aspect, the third implementation manner of the first aspect, or the fourth implementation manner of the first aspect, in a sixth implementation manner of the first aspect of the embodiment of the present invention, the method further includes:
receiving an event evaluation request message sent by a target user terminal, wherein the event evaluation request message carries identification information of a target user;
searching user characteristic data of the target user by using the identification information;
the user characteristic data is used as an input value of the event overdue probability model, and the event overdue probability of the target user is obtained by using the event overdue probability model;
comparing the event overdue probability with the adjusted event overdue probability threshold to obtain a comparison result;
and sending an event evaluation response message to the target user terminal, wherein the event evaluation response message carries information representing the comparison result.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, including:
the model training module is used for acquiring a plurality of first user characteristic sample data, wherein the first user characteristic sample data are labeled sample data, and training by utilizing the first user characteristic sample data to obtain an event overdue probability model;
the probability pre-estimation module is used for acquiring a plurality of second user characteristic sample data, wherein the second user characteristic sample data are label-free sample data, and acquiring the event overdue probability corresponding to each second user characteristic sample data by using the event overdue probability model;
the sample acquisition module is used for acquiring, as third user characteristic sample data, a plurality of second user characteristic sample data whose event overdue probability is greater than a set overdue probability threshold;
the reinforcement learning module is used for respectively acquiring the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by utilizing a reinforcement learning model;
the probability selection module is used for selecting the event overdue probability of the third user characteristic sample data for which the sum of the estimated reward value and the estimated uncertainty value is largest;
and the probability adjusting module is used for adjusting the event overdue probability threshold by using the selected event overdue probability.
With reference to the second aspect, in a first implementation manner of the second aspect, the reinforcement learning model includes a linear model and a contextual bandit, and the reinforcement learning module includes:
the linear model module is used for respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by utilizing the linear model;
and the contextual bandit module is used for respectively acquiring the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit.
With reference to the first implementation manner of the second aspect, in a second implementation manner of the second aspect of the embodiment of the present invention, the linear model module is configured to:
acquiring event characteristic data of each third user characteristic sample data;
and respectively acquiring the pre-estimated rewards corresponding to the feature sample data of each third user by using the linear model by taking the respective event feature data and the respective event overdue probability of the feature sample data of each third user as input values.
With reference to the second implementation manner of the second aspect, in a third implementation manner of the second aspect of the embodiment of the present invention, the event characteristic data includes at least one of:
the data of the area where the target object is located, the income data of the target object and the educational background data of the target object.
With reference to the first implementation manner of the second aspect, in a fourth implementation manner of the second aspect of the embodiment of the present invention, the contextual bandit module is configured to:
acquiring event state data corresponding to each third user characteristic sample data;
and respectively acquiring the estimated uncertainty value corresponding to each adjustment coefficient interval by using the contextual bandit, taking the respective event state data and the respective event overdue probability of the third user characteristic sample data as input values.
With reference to the second aspect, the first implementation manner of the second aspect, the second implementation manner of the second aspect, the third implementation manner of the second aspect, or the fourth implementation manner of the second aspect, in a fifth implementation manner of the second aspect of the present invention, the first user characteristic sample data and/or the second user characteristic sample data are obtained by using a user portrait model, and the user portrait model is trained by using at least one of the following feature samples: People's Bank of China data, consumption data, call record data, geographic location data and app usage data.
With reference to the second aspect, the first implementation manner of the second aspect, the second implementation manner of the second aspect, the third implementation manner of the second aspect, or the fourth implementation manner of the second aspect, in a sixth implementation manner of the second aspect of the embodiment of the present invention, the apparatus further includes an event prediction module, configured to:
receiving an event evaluation request message sent by a target user terminal, wherein the event evaluation request message carries identification information of a target user;
searching user characteristic data of the target user by using the identification information;
the user characteristic data is used as an input value of the event overdue probability model, and the event overdue probability of the target user is obtained by using the event overdue probability model;
comparing the event overdue probability with the adjusted event overdue probability threshold to obtain a comparison result;
and sending an event evaluation response message to the target user terminal, wherein the event evaluation response message carries information representing the comparison result.
In a third aspect, an embodiment of the present invention further provides a computer device, including a processor and a memory:
the memory is used for storing a program for executing the method according to each implementation of the first aspect,
the processor is configured to execute programs stored in the memory.
In a fourth aspect, an embodiment of the present invention further provides a computer storage medium for storing computer software instructions for the computer device according to the third aspect.
The embodiment of the specification has the following beneficial effects:
In the embodiments of the invention, an event overdue probability model is trained; the model is used to obtain the event overdue probabilities of a plurality of second user characteristic sample data; the user characteristic sample data whose event overdue probability exceeds the event overdue probability threshold are selected for reinforcement learning; and the event overdue probability with the largest sum of estimated reward value and estimated uncertainty value obtained by reinforcement learning is used to adjust the event overdue probability threshold. This achieves intelligent adjustment of the event overdue probability threshold with fast convergence, effectively supplements the samples of the overdue probability model, improves the prediction accuracy of the overdue probability model, and provides accurate assessment for users with no loan history and users who were refused a loan.
Drawings
Fig. 1 is a schematic view of a scenario in which the method of the first aspect of the embodiment of the present invention is applied;
Fig. 2 is a flow chart of a method according to an embodiment of the first aspect of the present invention;
Fig. 3 is a flow chart of a method according to another embodiment of the first aspect of the present invention;
Fig. 4 is a schematic structural diagram of an apparatus according to the second aspect of the embodiment of the present invention.
Detailed Description
To aid understanding, the technical solutions of the embodiments of this specification are described in detail below with reference to the drawings and specific embodiments. It should be understood that the embodiments and the specific features in them are detailed illustrations of the technical solutions of this specification and do not limit them, and that, where no conflict arises, the embodiments and their technical features may be combined with one another.
The embodiments of this specification may be implemented on the credit granting system shown in Fig. 1. In Fig. 1, a client application of the credit granting system is installed on a user terminal 101; after the user launches the client application, the user terminal 101 communicates with a server 102 to complete the corresponding task. For example, to implement the method provided by the embodiment of the present invention, the client application sends an event evaluation request message to the server 102 through the user terminal 101. After receiving the message, the server 102 looks up the user characteristic data of the target user according to the identification information of the target user carried in the message, uses the user characteristic data as the input of the event overdue probability model to obtain the event overdue probability of the target user, compares the event overdue probability with the event overdue probability threshold to obtain a comparison result, and sends an event evaluation response message carrying information representing the comparison result to the user terminal 101.
The event overdue probability threshold is dynamically adjusted, and the adjustment method will be described in detail in the following embodiments.
In a first aspect, an embodiment of the present disclosure provides a data processing method which, referring to Fig. 2, includes:
Step 201, obtaining a plurality of first user characteristic sample data, wherein the first user characteristic sample data are labeled sample data, and training by using the first user characteristic sample data to obtain an event overdue probability model.
In the embodiment of the present invention, the first user characteristic sample data may be positive sample data or negative sample data. Taking the decision of whether to grant a loan in a credit granting event as an example, a positive sample is one for which the loan was granted, and a negative sample is one for which the loan was refused.
In this embodiment of the present invention, the first user characteristic sample data includes a plurality of user characteristics, which may be, but is not limited to, a vector composed of a plurality of user characteristics.
Step 202, obtaining a plurality of second user characteristic sample data, wherein the second user characteristic sample data are non-label sample data, and obtaining an event overdue probability corresponding to each second user characteristic sample data by using an event overdue probability model.
In this embodiment of the present invention, the second user characteristic sample data includes a plurality of user characteristics, which may be, but is not limited to, a vector composed of a plurality of user characteristics.
In this embodiment of the present invention, the second user characteristic sample data may be characteristic data of users who were refused a loan, or characteristic data of users with no loan history. "Label-free" here does not mean that the samples must lack labels, only that labels are not needed in the subsequent processing.
Step 203, obtaining, as third user characteristic sample data, a plurality of second user characteristic sample data whose event overdue probability is greater than the set overdue probability threshold.
In the embodiment of the invention, the overdue probability threshold is preset; it can be set manually, or determined by simulation, fitting or similar means. The overdue probability threshold is adjustable, and it is adjusted using the method provided in the embodiment of the present invention.
An event overdue probability greater than the overdue probability threshold means that the corresponding user would ordinarily be refused a loan. The method provided by the embodiment of the invention, however, uses a reinforcement learning algorithm to keep learning from these samples and to explore users downward, thereby adjusting the threshold.
In the embodiment of the present invention, all the second user characteristic sample data that meet the above condition may be selected as the third user characteristic sample data. However, to improve the accuracy and convergence speed of the algorithm, only part of the sample data may be selected. The embodiment does not limit how the selection is made; for example, a predetermined number of second user characteristic sample data may be selected at random as third user characteristic sample data, or second user characteristic sample data in which a particular feature (e.g., income) satisfies a given condition may be selected as third user characteristic sample data.
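Steps 202 and 203, together with the optional random sub-selection described above, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `select_third_samples`, the `max_count` and `seed` parameters, and the idea of passing the trained model as a plain callable are all assumptions made for the example.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Features = Sequence[float]


def select_third_samples(
    second_samples: Sequence[Features],
    overdue_model: Callable[[Features], float],  # hypothetical: trained model as a callable
    threshold: float,
    max_count: Optional[int] = None,  # hypothetical cap for random sub-selection
    seed: int = 0,
) -> List[Tuple[Features, float]]:
    """Return (sample, overdue probability) pairs scored above the threshold."""
    # Step 202: score every unlabeled sample with the overdue probability model.
    scored = [(x, overdue_model(x)) for x in second_samples]
    # Step 203: keep only samples whose probability exceeds the threshold.
    above = [(x, p) for x, p in scored if p > threshold]
    # Optional: randomly keep a predetermined number of the qualifying samples.
    if max_count is not None and len(above) > max_count:
        above = random.Random(seed).sample(above, max_count)
    return above
```

A feature-based condition (for example on income) could be applied in the same place as the random sub-selection.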
Step 204, respectively acquiring the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by using the reinforcement learning model.
"Reward" is a term of art in reinforcement learning. In the embodiment of the invention, the reward value may represent, but is not limited to, the benefit of granting the loan.
In the embodiment of the invention, the benefit is directly proportional to the credit line utilization rate and inversely proportional to the bad debt amount.
In the embodiment of the invention, the uncertainty value represents the likelihood of bad debt and/or the likelihood of drawdown: the smaller the uncertainty value, the lower the likelihood of bad debt and/or the higher the likelihood of drawdown; the larger the uncertainty value, the higher the likelihood of bad debt and/or the lower the likelihood of drawdown.
Step 205, selecting the event overdue probability of the third user characteristic sample data for which the sum of the estimated reward value and the estimated uncertainty value is largest.
The largest sum of the estimated reward value and the estimated uncertainty value indicates that a loan may be granted to users at that event overdue probability.
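The selection in step 205 amounts to an argmax over the sum of the two estimates. The sketch below assumes each candidate is given as a tuple of (event overdue probability, estimated reward value, estimated uncertainty value); the tuple layout and the function name are illustrative only.

```python
from typing import Sequence, Tuple

# (event overdue probability, estimated reward value, estimated uncertainty value)
Candidate = Tuple[float, float, float]


def pick_exploration_probability(candidates: Sequence[Candidate]) -> float:
    """Return the overdue probability whose reward + uncertainty sum is largest."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    best = max(candidates, key=lambda c: c[1] + c[2])
    return best[0]
```

Adding the uncertainty value to the reward favours candidates that are either promising or poorly explored, which is what lets the threshold move toward users the model has seen little of.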
Step 206, adjusting the event overdue probability threshold by using the selected event overdue probability.
In the embodiment of the present invention, step 206 can be implemented in various ways. For example, the event overdue probability threshold may be replaced by the selected event overdue probability; as another example, the mean of the selected event overdue probability and the pre-adjustment event overdue probability threshold may be used as the adjusted event overdue probability threshold.
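The two adjustment options named for step 206 can be written down directly. The function name and the `mode` strings below are assumptions for illustration; the description leaves other adjustment rules open.

```python
def adjust_threshold(current: float, selected: float, mode: str = "replace") -> float:
    """Adjust the event overdue probability threshold using the selected probability."""
    if mode == "replace":
        # Option 1: the selected probability becomes the new threshold.
        return selected
    if mode == "mean":
        # Option 2: average the selected probability with the old threshold.
        return (current + selected) / 2.0
    raise ValueError(f"unknown mode: {mode!r}")
```

The averaging variant moves the threshold more cautiously, since each adjustment covers only half the distance to the selected probability.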
In the embodiments of the invention, an event overdue probability model is trained; the model is used to obtain the event overdue probabilities of a plurality of second user characteristic sample data; the user characteristic sample data whose event overdue probability exceeds the event overdue probability threshold are selected for reinforcement learning; and the event overdue probability with the largest sum of estimated reward value and estimated uncertainty value obtained by reinforcement learning is used to adjust the event overdue probability threshold. This achieves intelligent adjustment of the event overdue probability threshold with fast convergence, effectively supplements the samples of the overdue probability model, improves the prediction accuracy of the overdue probability model, and provides accurate assessment for users with no loan history and users who were refused a loan.
In the method provided by the embodiment of the present invention, step 204 can be implemented in various ways, that is, with various reinforcement learning models. Preferably, the reinforcement learning model includes a linear model and a contextual bandit, and step 204 is accordingly implemented as follows: respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model; and respectively acquiring the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit.
The embodiment of the invention does not limit the specific model structures or training methods of the linear model and the contextual bandit.
Any linear model trained by reinforcement-learning linear fitting on sample data (including credit granting characteristic data and the like) from completed credit granting evaluations (that is, decisions on whether to grant a loan) can be used in the method provided by the embodiment of the invention. The more sample data there is, the more accurate the training result.
Any contextual bandit trained on sample data (including credit granting state data) from completed credit granting evaluations (that is, decisions on whether to grant a loan) can be used in the method provided by the embodiment of the invention. The more sample data there is, the smaller the uncertainty.
In the embodiment of the present invention, the implementation manner of respectively obtaining the pre-estimated rewards corresponding to each third user feature sample data by using the linear model may be: acquiring event characteristic data of each third user characteristic sample data; and respectively acquiring the pre-estimated rewards corresponding to the feature sample data of each third user by using the linear model by taking the respective event feature data and the respective event overdue probability of the feature sample data of each third user as input values.
The event characteristic data of the third user characteristic sample data may be obtained as follows: look up the event characteristic data in the local database; if it is found, use it; if it is not found, look it up in and retrieve it from a third-party database (such as the People's Bank of China database) and store the retrieved event characteristic data in the local database.
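The lookup order described above is a read-through cache. The sketch below stands in a plain dictionary for the local database and an arbitrary callable for the third-party query; both, along with the function name, are placeholders chosen for the example and not part of the described system.

```python
from typing import Callable, Dict, List


def get_event_features(
    user_id: str,
    local_db: Dict[str, List[float]],                 # hypothetical stand-in for the local database
    fetch_third_party: Callable[[str], List[float]],  # hypothetical third-party query
) -> List[float]:
    """Return event characteristic data, consulting the local store first."""
    found = local_db.get(user_id)
    if found is not None:
        return found
    # Not stored locally: query the third-party source, then store the result.
    fetched = fetch_third_party(user_id)
    local_db[user_id] = fetched
    return fetched
```

After the first miss the data is served locally, so the third-party source is queried at most once per user in this sketch.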
In an embodiment of the present invention, the estimated uncertainty value corresponding to each third user characteristic sample data may be obtained with the contextual bandit as follows: acquiring event state data corresponding to each third user characteristic sample data; and respectively acquiring the estimated uncertainty value corresponding to each adjustment coefficient interval by using the contextual bandit, taking the respective event state data and the respective event overdue probability of the third user characteristic sample data as input values.
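Because the description leaves the model structures open, the following is only one plausible pairing: a plain linear model for the estimated reward, and a LinUCB-style confidence width standing in for the uncertainty produced by the context gambling machine (that is, a contextual bandit). LinUCB is a substitution chosen for illustration, not something the source names. The class name, `alpha`, and the identity initialisation are assumptions. The input vector would hold the event characteristic or event state data plus the event overdue probability.

```python
import math
from typing import List, Sequence


def linear_reward(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Estimated reward from a linear model: bias + w . x."""
    return bias + sum(w * x for w, x in zip(weights, features))


class LinUcbUncertainty:
    """LinUCB-style uncertainty alpha * sqrt(x^T A^-1 x), with A = I + sum of x x^T."""

    def __init__(self, dim: int, alpha: float = 1.0) -> None:
        self.alpha = alpha  # assumed exploration weight
        # A^-1 starts as the identity matrix (A = I before any observation).
        self.a_inv: List[List[float]] = [
            [1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)
        ]

    def _a_inv_times(self, x: Sequence[float]) -> List[float]:
        return [sum(row[j] * x[j] for j in range(len(x))) for row in self.a_inv]

    def uncertainty(self, x: Sequence[float]) -> float:
        v = self._a_inv_times(x)
        return self.alpha * math.sqrt(sum(xi * vi for xi, vi in zip(x, v)))

    def observe(self, x: Sequence[float]) -> None:
        """Sherman-Morrison rank-one update of A^-1 after seeing context x."""
        v = self._a_inv_times(x)  # A^-1 is symmetric, so x^T A^-1 equals v^T
        denom = 1.0 + sum(xi * vi for xi, vi in zip(x, v))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= v[i] * v[j] / denom
```

Each observed context shrinks the width in that direction, which matches the statement that more sample data means less uncertainty.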
In an embodiment of the present invention, the event characteristic data includes at least one of the following: data on the area where the user is located, user income data and user educational background data.
The data on the area where the user is located may be encoded data obtained by encoding that area, and the area may be, but is not limited to, the city where the user is located;
the user income data may be, but is not limited to, the user's total income over a predetermined time period;
the user educational background data may be encoded data obtained by encoding the user's educational background.
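As a small illustration of the three kinds of event characteristic data, the sketch below encodes a city and an education level with lookup tables and sums income over a period. The code tables, category names, and function name are entirely hypothetical; the description does not specify an encoding scheme.

```python
from typing import List, Sequence

# Hypothetical code tables; a real system would define its own.
REGION_CODES = {"Beijing": 0, "Shanghai": 1, "Shenzhen": 2}
EDUCATION_CODES = {"high_school": 0, "bachelor": 1, "master": 2, "doctorate": 3}


def encode_event_features(
    region: str, incomes_in_period: Sequence[float], education: str
) -> List[float]:
    """Build an event feature vector: [region code, total income, education code]."""
    return [
        float(REGION_CODES[region]),        # encoded area where the user is located
        float(sum(incomes_in_period)),      # total income over the predetermined period
        float(EDUCATION_CODES[education]),  # encoded educational background
    ]
```

Integer codes are the simplest choice here; for a linear model, a one-hot encoding of the region might be more appropriate because city codes have no natural order.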
In the embodiment of the invention, the event state data is data that reflects the event state of the user; the invention does not limit the choice of such data.
In any of the above method embodiments, the user feature sample data may be obtained through, but is not limited to, a user profile model. The user profile model may be trained using, but is not limited to, at least one of the following user features: People's Bank of China (PBOC) credit data, consumption data, call record data, geographic location data and app usage data.
On the basis of any of the above method embodiments, a user's credit granting request may be evaluated by using the adjusted threshold. Specifically, as shown in fig. 3, the method includes the following operations:
step 301, receiving an event evaluation request message sent by a target user terminal, wherein the event evaluation request message carries identification information of a target user;
step 302, searching user characteristic data of the target user by using the identification information;
step 303, taking the user characteristic data as an input value of the event overdue probability model, and acquiring the event overdue probability of the target user by using the event overdue probability model;
step 304, comparing the event overdue probability with the adjusted event overdue probability threshold to obtain a comparison result;
step 305, sending an event evaluation response message to the target user terminal, where the event evaluation response message carries information indicating the comparison result.
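Steps 301 to 305 can be sketched as a single request handler. The sketch below is illustrative only: the `EvaluationResponse` type, the dictionary-based feature store and the toy probability model are assumptions, and treating a probability at or below the threshold as an approval is one assumed interpretation of the comparison result.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvaluationResponse:
    user_id: str
    overdue_probability: float
    approved: bool


def handle_evaluation_request(
    request: Dict[str, str],
    feature_store: Dict[str, List[float]],
    overdue_model: Callable[[List[float]], float],
    adjusted_threshold: float,
) -> EvaluationResponse:
    user_id = request["user_id"]            # step 301: identification information
    features = feature_store[user_id]       # step 302: look up user feature data
    probability = overdue_model(features)   # step 303: event overdue probability
    # Step 304: compare with the adjusted threshold (approval rule is an assumption).
    approved = probability <= adjusted_threshold
    # Step 305: the response carries the comparison result.
    return EvaluationResponse(user_id, probability, approved)


def toy_model(features: List[float]) -> float:
    """Stand-in for the trained event overdue probability model."""
    return min(1.0, max(0.0, 0.5 * features[0]))


response = handle_evaluation_request(
    {"user_id": "u42"}, {"u42": [0.4]}, toy_model, adjusted_threshold=0.3
)
```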
In the embodiment of the present invention, the identification information of the user is information that uniquely identifies the user, such as an identity card number, a passport number, or a combination of a name and a telephone number.
When the user opens the client application of the credit granting system on the user terminal and performs an operation that requires credit granting evaluation, the client application sends a credit granting evaluation request message (i.e., an event evaluation request message) to the server through the communication module of the user terminal.
Optionally, the credit granting evaluation request message may also carry the user's region data, income data, education data, and the like.
In the embodiment of the invention, the server can be an independent server or a cloud server. If the server is an independent server, the local database may be a database disposed on a disk storage space of the independent server, or a database disposed on a database server allocated to the independent server. If the cloud server is used, the local database can be a database arranged on any node on the cloud server.
In lending, no overdue label can be obtained for users who were never granted a loan, so supervised learning cannot be applied to them directly. To address this, the embodiment of the present invention provides a framework based on reinforcement learning. First, the user is modeled from various information such as the user's People's Bank of China (PBOC) credit data, consumption, call records, geographic location and app usage, and the user's overdue probability is inferred on that basis. For rejected users, the predicted overdue probability is partitioned into intervals, and the overdue rate and the uncertainty of each interval are predicted separately. The reward is the income within the interval and is given by a linear model over multiple features such as the user's city, income and education level. The uncertainty is predicted by the contextual bandit algorithm. This yields a theoretically guided lending method: loans are issued in the interval with the largest sum of reward and uncertainty, so that rejected users can be explored systematically, the model is thereby optimized, and changes in the population are adapted to more quickly while the risk remains controllable.
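The interval-level exploration described above can be sketched as follows. The equal-width partition, the function names and all numeric values are assumptions made for illustration; in the described framework the rewards would come from the linear model and the uncertainties from the contextual bandit.

```python
from typing import List, Tuple


def make_intervals(low: float, high: float, count: int) -> List[Tuple[float, float]]:
    """Split the overdue probability range [low, high) into equal-width intervals."""
    width = (high - low) / count
    return [(low + i * width, low + (i + 1) * width) for i in range(count)]


def select_interval(rewards: List[float], uncertainties: List[float]) -> int:
    """Return the index of the interval with the largest reward + uncertainty."""
    scores = [r + u for r, u in zip(rewards, uncertainties)]
    return max(range(len(scores)), key=scores.__getitem__)


# Rejected users are assumed here to have overdue probabilities between 0.2 and 0.5.
intervals = make_intervals(0.2, 0.5, 3)
rewards = [0.04, 0.01, -0.03]        # hypothetical estimated rewards per interval
uncertainties = [0.01, 0.06, 0.05]   # hypothetical estimated uncertainties per interval
best = select_interval(rewards, uncertainties)
```

With these made-up numbers the middle interval is selected: its reward is lower than that of the first interval, but its higher uncertainty makes it the more informative place to explore.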
In a second aspect, an embodiment of the present invention discloses a data processing apparatus, please refer to fig. 4, including:
the model training module 401 is configured to acquire a plurality of first user feature sample data, where the first user feature sample data are labeled sample data, and to train an event overdue probability model by using the first user feature sample data;
a probability pre-estimation module 402, configured to acquire a plurality of second user feature sample data, where the second user feature sample data are unlabeled sample data, and to obtain the event overdue probability corresponding to each second user feature sample data by using the event overdue probability model;
a sample obtaining module 403, configured to acquire, as third user feature sample data, a plurality of second user feature sample data whose event overdue probabilities are greater than a set event overdue probability threshold;
the reinforcement learning module 404 is configured to obtain an estimated reward value and an estimated uncertainty value of each third user feature sample data by using a reinforcement learning model;
a probability selection module 405, configured to select the event overdue probability of the third user feature sample data with the largest sum of the estimated reward value and the estimated uncertainty value;
a probability adjustment module 406 configured to adjust the event overdue probability threshold using the selected event overdue probability.
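The selection and adjustment performed by modules 403 to 406 can be written compactly. In the sketch below, $p_i$ is the event overdue probability of sample $i$, $\tau$ is the current threshold, and $\hat{r}_i$ and $\hat{u}_i$ are the estimated reward value and estimated uncertainty value; these symbols are introduced here for illustration. The last line, which sets the new threshold equal to the selected probability, is only one possible reading of "adjusting the threshold using the selected event overdue probability", since the source gives no explicit update formula.

```latex
\begin{aligned}
S_3 &= \{\, i : p_i > \tau \,\} \\
i^{*} &= \arg\max_{i \in S_3} \left( \hat{r}_i + \hat{u}_i \right) \\
\tau' &= p_{i^{*}} \quad \text{(assumed update rule)}
\end{aligned}
```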
Optionally, the reinforcement learning model includes a linear model and a contextual bandit, and the reinforcement learning module includes:
a linear model module, configured to respectively acquire the estimated reward value corresponding to each third user feature sample data by using the linear model;
and a contextual bandit module, configured to respectively acquire the estimated uncertainty value corresponding to each third user feature sample data by using the contextual bandit.
Optionally, the linear model module is configured to:
acquiring event characteristic data of each third user characteristic sample data;
and, taking the respective event feature data and the respective event overdue probability of each third user feature sample data as input values, respectively acquiring the estimated reward value corresponding to each third user feature sample data by using the linear model.
With reference to the second implementation manner of the second aspect, in a third implementation manner of the second aspect of the embodiment of the present invention, the event characteristic data includes at least one of:
region data of the target object, income data of the target object, and education data of the target object.
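The linear model module takes the event characteristic data together with the event overdue probability as input. A minimal sketch of such a linear reward estimate is shown below; the function name, the weights and the feature values are hypothetical, and in practice the weights would be fitted from observed income data.

```python
from typing import Sequence


def estimate_reward(
    weights: Sequence[float],
    bias: float,
    event_features: Sequence[float],
    overdue_probability: float,
) -> float:
    """Linear reward estimate over event features plus the overdue probability."""
    inputs = list(event_features) + [overdue_probability]
    if len(inputs) != len(weights):
        raise ValueError("weights must match event features plus overdue probability")
    return bias + sum(w * x for w, x in zip(weights, inputs))


# Hypothetical weights for [region code, income, education code, overdue probability].
weights = [0.02, 0.00001, 0.01, -0.5]
reward = estimate_reward(weights, 0.0, [1.0, 5000.0, 2.0], 0.3)
```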
Optionally, the contextual bandit module is configured to:
acquiring event state data corresponding to each third user characteristic data;
and, taking the respective credit granting state data and the respective event overdue probability of the third user characteristic data as input values, respectively acquiring the estimated uncertainty value corresponding to each adjustment coefficient interval by using the contextual bandit.
Optionally, the first user feature sample data and/or the second user feature sample data are obtained by using a user profile model, and the user profile model is obtained by training using at least one of the following feature samples: People's Bank of China (PBOC) credit data, consumption data, call record data, geographic location data and app usage data.
Optionally, the apparatus further includes an event prediction module, configured to:
receiving an event evaluation request message sent by a target user terminal, wherein the event evaluation request message carries identification information of a target user;
searching user characteristic data of the target user by using the identification information;
the user characteristic data is used as an input value of the event overdue probability model, and the event overdue probability of the target user is obtained by using the event overdue probability model;
comparing the event overdue probability with the adjusted event overdue probability threshold to obtain a comparison result;
and sending an event evaluation response message to the target user terminal, wherein the event evaluation response message carries information representing the comparison result.
In a third aspect, an embodiment of the present invention further provides a computer device, including a processor and a memory:
the memory is used for storing a program for executing the method according to each implementation of the first aspect,
the processor is configured to execute programs stored in the memory.
In a fourth aspect, an embodiment of the present invention further provides a computer storage medium for storing computer software instructions for the computer device according to the third aspect.
The description has been presented with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present specification have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all changes and modifications that fall within the scope of the specification.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present specification without departing from the spirit and scope of the specification. Thus, if such modifications and variations of the present specification fall within the scope of the claims of the present specification and their equivalents, the specification is intended to include such modifications and variations.
The embodiment of the invention discloses:
A1. A data processing method, comprising:
obtaining a plurality of first user characteristic sample data, wherein the first user characteristic sample data are labeled sample data, and training by using the first user characteristic sample data to obtain an event overdue probability model;
acquiring a plurality of second user characteristic sample data, wherein the second user characteristic sample data are unlabeled sample data, and acquiring an event overdue probability corresponding to each second user characteristic sample data by using the event overdue probability model;
acquiring, as third user characteristic sample data, a plurality of second user characteristic sample data whose event overdue probabilities are greater than a set event overdue probability threshold;
respectively acquiring the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by using a reinforcement learning model;
selecting the event overdue probability of the third user characteristic sample data with the maximum sum of the estimated reward value and the estimated uncertainty value;
adjusting the event overdue probability threshold with the selected event overdue probability.
A2. The method according to A1, wherein the reinforcement learning model includes a linear model and a contextual bandit, and the respectively acquiring the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by using the reinforcement learning model includes:
respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model;
and respectively acquiring the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit.
A3. The method according to A2, wherein the respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model includes:
acquiring event characteristic data of each third user characteristic sample data;
and, taking the respective event characteristic data and the respective event overdue probability of each third user characteristic sample data as input values, respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model.
A4. The method according to A3, wherein the event characteristic data comprises at least one of:
region data of the target object, income data of the target object, and education data of the target object.
A5. The method according to A2, wherein the respectively acquiring the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit comprises:
acquiring event state data corresponding to each third user characteristic data;
and, taking the respective credit granting state data and the respective event overdue probability of the third user characteristic data as input values, respectively acquiring the estimated uncertainty value corresponding to each adjustment coefficient interval by using the contextual bandit.
A6. The method according to any one of A1 to A5, wherein the first user characteristic sample data and/or the second user characteristic sample data are obtained by using a user profile model, and the user profile model is trained using at least one of the following characteristic samples: People's Bank of China (PBOC) credit data, consumption data, call record data, geographic location data and app usage data.
A7. The method according to any one of A1 to A5, further comprising:
receiving an event evaluation request message sent by a target user terminal, wherein the event evaluation request message carries identification information of a target user;
searching user characteristic data of the target user by using the identification information;
the user characteristic data is used as an input value of the event overdue probability model, and the event overdue probability of the target user is obtained by using the event overdue probability model;
comparing the event overdue probability with the adjusted event overdue probability threshold to obtain a comparison result;
and sending an event evaluation response message to the target user terminal, wherein the event evaluation response message carries information representing the comparison result.
B8. A data processing apparatus, comprising:
a model training module, configured to acquire a plurality of first user characteristic sample data, wherein the first user characteristic sample data are labeled sample data, and to train by using the first user characteristic sample data to obtain an event overdue probability model;
a probability pre-estimation module, configured to acquire a plurality of second user characteristic sample data, wherein the second user characteristic sample data are unlabeled sample data, and to acquire the event overdue probability corresponding to each second user characteristic sample data by using the event overdue probability model;
a sample acquisition module, configured to acquire, as third user characteristic sample data, a plurality of second user characteristic sample data whose event overdue probabilities are greater than a set event overdue probability threshold;
a reinforcement learning module, configured to respectively acquire the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by using a reinforcement learning model;
a probability selection module, configured to select the event overdue probability of the third user characteristic sample data with the maximum sum of the estimated reward value and the estimated uncertainty value;
and a probability adjusting module, configured to adjust the event overdue probability threshold by using the selected event overdue probability.
B9. The apparatus according to B8, wherein the reinforcement learning model comprises a linear model and a contextual bandit, and the reinforcement learning module comprises:
a linear model module, configured to respectively acquire the estimated reward value corresponding to each third user characteristic sample data by using the linear model;
and a contextual bandit module, configured to respectively acquire the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit.
B10. The apparatus according to B9, wherein the linear model module is configured to perform:
acquiring event characteristic data of each third user characteristic sample data;
and, taking the respective event characteristic data and the respective event overdue probability of each third user characteristic sample data as input values, respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model.
B11. The apparatus according to B10, wherein the event characteristic data comprises at least one of:
region data of the target object, income data of the target object, and education data of the target object.
B12. The apparatus according to B9, wherein the contextual bandit module is configured to perform:
acquiring event state data corresponding to each third user characteristic data;
and, taking the respective credit granting state data and the respective event overdue probability of the third user characteristic data as input values, respectively acquiring the estimated uncertainty value corresponding to each adjustment coefficient interval by using the contextual bandit.
B13. The apparatus according to any one of B8 to B12, wherein the first user characteristic sample data and/or the second user characteristic sample data are obtained by using a user profile model, and the user profile model is trained using at least one of the following characteristic samples: People's Bank of China (PBOC) credit data, consumption data, call record data, geographic location data and app usage data.
B14. The apparatus according to any one of B8 to B12, further comprising an event pre-estimation module configured to perform:
receiving an event evaluation request message sent by a target user terminal, wherein the event evaluation request message carries identification information of a target user;
searching user characteristic data of the target user by using the identification information;
the user characteristic data is used as an input value of the event overdue probability model, and the event overdue probability of the target user is obtained by using the event overdue probability model;
comparing the event overdue probability with the adjusted event overdue probability threshold to obtain a comparison result;
and sending an event evaluation response message to the target user terminal, wherein the event evaluation response message carries information representing the comparison result.
C15. A computer device, comprising a processor and a memory:
the memory is configured to store a program for executing the method of any one of A1 to A7,
the processor is configured to execute the program stored in the memory.
D16. A computer storage medium storing computer software instructions for use with the computer device of C15 described above.

Claims (10)

1. A data processing method, comprising:
obtaining a plurality of first user characteristic sample data, wherein the first user characteristic sample data are labeled sample data, and training by using the first user characteristic sample data to obtain an event overdue probability model;
acquiring a plurality of second user characteristic sample data, wherein the second user characteristic sample data are unlabeled sample data, and acquiring an event overdue probability corresponding to each second user characteristic sample data by using the event overdue probability model;
acquiring, as third user characteristic sample data, a plurality of second user characteristic sample data whose event overdue probabilities are greater than a set event overdue probability threshold;
respectively acquiring the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by using a reinforcement learning model;
selecting the event overdue probability of the third user characteristic sample data with the maximum sum of the estimated reward value and the estimated uncertainty value;
adjusting the event overdue probability threshold with the selected event overdue probability.
2. The method of claim 1, wherein the reinforcement learning model includes a linear model and a contextual bandit, and the respectively acquiring the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by using the reinforcement learning model comprises:
respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model;
and respectively acquiring the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit.
3. The method according to claim 2, wherein the respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model comprises:
acquiring event characteristic data of each third user characteristic sample data;
and, taking the respective event characteristic data and the respective event overdue probability of each third user characteristic sample data as input values, respectively acquiring the estimated reward value corresponding to each third user characteristic sample data by using the linear model.
4. The method of claim 3, wherein the event characteristic data comprises at least one of:
region data of the target object, income data of the target object, and education data of the target object.
5. The method of claim 2, wherein the respectively acquiring the estimated uncertainty value corresponding to each third user characteristic sample data by using the contextual bandit comprises:
acquiring event state data corresponding to each third user characteristic data;
and, taking the respective credit granting state data and the respective event overdue probability of the third user characteristic data as input values, respectively acquiring the estimated uncertainty value corresponding to each adjustment coefficient interval by using the contextual bandit.
6. The method according to any one of claims 1 to 5, wherein the first user characteristic sample data and/or the second user characteristic sample data are obtained by using a user profile model, and the user profile model is trained using at least one of the following characteristic samples: People's Bank of China (PBOC) credit data, consumption data, call record data, geographic location data and app usage data.
7. The method according to any one of claims 1 to 5, further comprising:
receiving an event evaluation request message sent by a target user terminal, wherein the event evaluation request message carries identification information of a target user;
searching user characteristic data of the target user by using the identification information;
the user characteristic data is used as an input value of the event overdue probability model, and the event overdue probability of the target user is obtained by using the event overdue probability model;
comparing the event overdue probability with the adjusted event overdue probability threshold to obtain a comparison result;
and sending an event evaluation response message to the target user terminal, wherein the event evaluation response message carries information representing the comparison result.
8. A data processing apparatus, comprising:
a model training module, configured to acquire a plurality of first user characteristic sample data, wherein the first user characteristic sample data are labeled sample data, and to train by using the first user characteristic sample data to obtain an event overdue probability model;
a probability pre-estimation module, configured to acquire a plurality of second user characteristic sample data, wherein the second user characteristic sample data are unlabeled sample data, and to acquire the event overdue probability corresponding to each second user characteristic sample data by using the event overdue probability model;
a sample acquisition module, configured to acquire, as third user characteristic sample data, a plurality of second user characteristic sample data whose event overdue probabilities are greater than a set event overdue probability threshold;
a reinforcement learning module, configured to respectively acquire the estimated reward value and the estimated uncertainty value of each third user characteristic sample data by using a reinforcement learning model;
a probability selection module, configured to select the event overdue probability of the third user characteristic sample data with the maximum sum of the estimated reward value and the estimated uncertainty value;
and a probability adjusting module, configured to adjust the event overdue probability threshold by using the selected event overdue probability.
9. A computer device, comprising a processor and a memory:
the memory for storing a program for performing the method of any one of claims 1 to 7,
the processor is configured to execute programs stored in the memory.
10. A computer storage medium storing computer software instructions for use by the computer apparatus of claim 9.
CN201910124234.XA 2019-02-18 2019-02-18 Data processing method, device, equipment and storage medium Pending CN111583011A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910124234.XA CN111583011A (en) 2019-02-18 2019-02-18 Data processing method, device, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910124234.XA CN111583011A (en) 2019-02-18 2019-02-18 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN111583011A true CN111583011A (en) 2020-08-25

Family

ID=72122516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910124234.XA Pending CN111583011A (en) 2019-02-18 2019-02-18 Data processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111583011A (en)

Similar Documents

Publication Publication Date Title
CN110033314B (en) Advertisement data processing method and device
CN111681091B (en) Financial risk prediction method and device based on time domain information and storage medium
CN111275491A (en) Data processing method and device
CN113011895B (en) Associated account sample screening method, device and equipment and computer storage medium
CN108416619B (en) Consumption interval time prediction method and device and readable storage medium
CN110827143A (en) Method, device and equipment for training credit scoring model
CN113362852A (en) User attribute identification method and device
CN116468444A (en) Consumption early warning method, system, equipment and storage medium
CN111583010A (en) Data processing method, device, equipment and storage medium
CN117036001A (en) Risk identification processing method, device and equipment for transaction service and storage medium
EP4012642A2 (en) Blockchain-based outlet site selection method and apparatus, device and storage medium
CN111583011A (en) Data processing method, device, equipment and storage medium
CN114723455A (en) Service processing method and device, electronic equipment and storage medium
CN116823264A (en) Risk identification method, risk identification device, electronic equipment, medium and program product
CN116344064A (en) Method and device for acquiring access quantity prediction model and predicting access quantity of journey information query service, electronic equipment and storage medium
CN110009159A (en) Financial loan demand prediction method and system based on network big data
CN112966968B (en) List distribution method based on artificial intelligence and related equipment
CN114298825A (en) Method and device for extremely evaluating repayment volume
KR102177392B1 (en) User authentication system and method based on context data
CN114239985A (en) Exchange rate prediction method and device, electronic equipment and storage medium
CN113723795B (en) Information delivery strategy testing method and device, storage medium and electronic equipment
CN113935826B (en) Credit account management method and system based on user privacy
CN113391923B (en) System resource data allocation method and device
CN110334351B (en) Method and device for recommending network credit based on short message reading
CN117876089A (en) Asset processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination