CN114549174A - User behavior prediction method and device, computer equipment and storage medium - Google Patents

Publication number
CN114549174A
Authority
CN
China
Prior art keywords
user
user information
support vector
vector machine
prediction model
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210141838.7A
Other languages
Chinese (zh)
Inventor
汪志艺
王伟权
杨俊勉
吴佳文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202210141838.7A
Publication of CN114549174A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06Q: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00: Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03: Credit; Loans; Processing thereof
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/10: Machine learning using kernel methods, e.g. support vector machines [SVM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Technology Law (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a user behavior prediction method, a user behavior prediction device, computer equipment, a storage medium and a computer program product, and relates to the technical field of big data. The method comprises: receiving a service application request sent by a user to be predicted and obtaining the user information in the request; inputting the user information of multiple dimensions into a pre-trained target support vector machine prediction model; mapping, through the target support vector machine prediction model, the user information of the multiple dimensions from a low-dimensional space to a high-dimensional space corresponding to the dimensions of the user information, performing nonlinear classification, and outputting a predicted value; and determining, according to the predicted value, whether the user to be predicted will exhibit a set risk behavior associated with the service. Compared with traditional analysis based on decision trees and the like, the method and device classify and predict the user's information with a support vector machine prediction model, determine from the predicted value output by the model whether the user is likely to exhibit the set risk behavior, and thereby improve the efficiency of user behavior prediction.

Description

User behavior prediction method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of big data technologies, and in particular, to a method, an apparatus, a computer device, a storage medium, and a computer program product for predicting user behavior.
Background
When a user applies for credit-related services, a corresponding risk assessment must be carried out on the user to help the platform avoid risk. In this business risk assessment, it is necessary to predict whether the user will exhibit a set risk behavior. At present, user risk behavior is usually predicted with decision trees; however, decision-tree analysis easily makes the model overly complex, which in turn reduces prediction efficiency.
Therefore, current methods for predicting user risk behavior suffer from low prediction efficiency.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a user behavior prediction method, apparatus, computer device, computer-readable storage medium, and computer program product capable of improving prediction efficiency.
In a first aspect, the present application provides a user behavior prediction method, including:
in response to a service application request sent by a user to be predicted, acquiring user information in the service application request; the user information comprises user information of multiple dimensions of the user to be predicted; the multiple dimensions represent multiple types of application conditions that the service corresponding to the service application request imposes on the user to be predicted;
inputting the user information of the multiple dimensions into a pre-trained target support vector machine prediction model, mapping the user information of the multiple dimensions from a low-dimensional space to a high-dimensional space through the target support vector machine prediction model, and carrying out nonlinear classification based on the user information in the high-dimensional space to obtain a predicted value output by the target support vector machine prediction model based on a classification result of the nonlinear classification;
and determining a prediction result of the set risk behavior associated with the service of the user to be predicted according to the prediction value.
In one embodiment, the obtaining user information in a service application request in response to the service application request sent by a user to be predicted includes:
receiving a service application request sent by a user to be predicted, and acquiring associated asset information, associated service information and position information of the user to be predicted in the service application request as user information of the user to be predicted.
In one embodiment, before inputting the user information of the multiple dimensions into the pre-trained target support vector machine prediction model, the method further includes:
preprocessing the user information of the multiple dimensions, and converting the user information of the multiple dimensions into numerical user information of the multiple dimensions;
and inputting the numerical user information into a pre-trained target support vector machine prediction model.
In one embodiment, the converting the user information of the plurality of dimensions into the numerical user information of the plurality of dimensions includes:
determining a first label value corresponding to the associated asset information according to the asset quantity of the associated asset information; the magnitude of the first label value is positively correlated with the asset quantity;
determining a second label value corresponding to the associated service information according to the risk level to which the associated service information belongs; the magnitude of the second label value is positively correlated with the risk level;
determining a third label value corresponding to the position information according to the regional economic indicator corresponding to the position information; the magnitude of the third label value is positively correlated with the regional economic indicator;
and obtaining the numerical user information of the multiple dimensions according to at least two of the first label value, the second label value and the third label value.
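The conversion described above can be sketched in Python as follows. This is a minimal illustration only: the bucket boundaries, the risk-level names, the region names and their indicator values (`ASSET_BOUNDS`, `RISK_LEVELS`, `REGION_INDICATOR`) are hypothetical, since the application does not give concrete values. The only property taken from the text is that each label value grows with the quantity it encodes.

```python
from bisect import bisect_right

# Hypothetical lookup data; the application does not specify any of these values.
ASSET_BOUNDS = [50_000, 200_000, 1_000_000]             # asset-quantity cut points
RISK_LEVELS = {"low": 1, "medium": 2, "high": 3}         # service risk level -> label
REGION_INDICATOR = {"region_a": 0.4, "region_b": 0.9}    # regional economic indicator

def asset_label(asset_quantity: float) -> int:
    """First label value: non-decreasing in the asset quantity (bucket index)."""
    return bisect_right(ASSET_BOUNDS, asset_quantity)

def service_label(risk_level: str) -> int:
    """Second label value: larger for a higher risk level."""
    return RISK_LEVELS[risk_level]

def location_label(region: str) -> float:
    """Third label value: larger for a higher regional economic indicator."""
    return REGION_INDICATOR[region]

def to_numeric(asset_quantity: float, risk_level: str, region: str) -> list[float]:
    """Combine the three label values into one numeric feature vector."""
    return [
        float(asset_label(asset_quantity)),
        float(service_label(risk_level)),
        location_label(region),
    ]
```

The sketch combines all three label values; the text only requires at least two of them, so a caller could equally drop one component.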
In one embodiment, the result output by the target support vector machine prediction model based on the classification result of the nonlinear classification lies within a set numerical range, and the determining, according to the predicted value, of a prediction result of the set risk behavior associated with the service of the user to be predicted includes:
if the predicted value is smaller than or equal to a prediction threshold value, predicting that the user to be predicted does not have set risk behaviors;
if the predicted value is larger than a prediction threshold value, predicting that the user to be predicted has set risk behaviors;
wherein the prediction threshold is within the set range of values.
In one embodiment, the target support vector machine prediction model is trained by:
obtaining a plurality of sample user information, and dividing the plurality of sample user information into a training set and a test set; the sample user information comprises sample user information of multiple dimensions and corresponding real behavior states;
inputting multi-dimensional sample user information in the training set into a support vector machine prediction model to be trained, mapping the multi-dimensional sample user information to a multi-dimensional space through the support vector machine prediction model to be trained, and outputting a sample prediction value based on a classification result of the multi-dimensional space;
repeating the step of inputting the multi-dimensional data coordinates into the support vector machine prediction model to be trained until the sample set risk behavior state corresponding to the sample prediction value is consistent with the real behavior state, and taking the current support vector machine prediction model to be trained as a support vector machine prediction model to be verified;
and inputting the sample user information in the test set into the prediction model of the support vector machine to be verified, and obtaining a target prediction model of the support vector machine according to the output result of the prediction model of the support vector machine to be verified.
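The split-and-repeat procedure above can be illustrated with a short Python sketch. Note the substitution: the loop below trains a kernel perceptron, not a C-type support vector classifier, because it makes the "repeat until the predictions agree with the real states" stopping rule easy to see in a few lines. The function names, the `gamma` value, the `max_epochs` cap and the {-1, +1} label encoding are all assumptions of this sketch.

```python
import math
import random

def rbf(x, z, gamma=0.5):
    """Gaussian radial basis kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def split(samples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and return (training set, test set)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def score(alphas, train, x):
    """Kernel expansion sum_i alpha_i * y_i * K(x_i, x); its sign is the predicted state."""
    return sum(a * y * rbf(xi, x) for a, (xi, y) in zip(alphas, train))

def train_until_consistent(train, max_epochs=100):
    """Kernel perceptron (stand-in for SVM training): repeat passes over the
    training set until every prediction matches its real state, or until
    max_epochs passes have been made."""
    alphas = [0.0] * len(train)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (x, y) in enumerate(train):
            if y * score(alphas, train, x) <= 0:   # prediction disagrees with real state
                alphas[i] += 1.0
                mistakes += 1
        if mistakes == 0:                          # all training predictions consistent
            break
    return alphas
```

A real implementation would replace `train_until_consistent` with an SVM solver; the epoch cap is there because perfect agreement on the training set is not guaranteed for noisy data.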
In one embodiment, the method further comprises:
and constructing a prediction model of the support vector machine to be trained based on the C-type support vector classifier and the radial basis kernel function.
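For reference, the textbook formulation of a C-type support vector classifier with a Gaussian radial basis kernel is given below. The notation (w, b, ξ_i, φ, γ, α_i, m training samples, labels y_i ∈ {−1, +1}) is standard and is not taken from the application; the Gaussian kernel is one common choice of radial basis kernel and is assumed here.

```latex
\begin{aligned}
&\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{m}\xi_i \\
&\text{s.t.}\quad y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,\qquad \xi_i\ge 0,\quad i=1,\dots,m, \\
&K(x, x_c) = \phi(x)^{\top}\phi(x_c) = \exp\!\bigl(-\gamma\,\lVert x-x_c\rVert^{2}\bigr),\qquad \gamma>0, \\
&f(x) = \sum_{i=1}^{m}\alpha_i\,y_i\,K(x_i, x) + b .
\end{aligned}
```

The penalty factor C trades margin width against the slack ξ_i granted to outliers, and the kernel K performs the mapping to the high-dimensional space implicitly, so φ never has to be computed.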
In one embodiment, the inputting the sample user information in the test set into the to-be-verified support vector machine prediction model, and obtaining the target support vector machine prediction model according to the output result of the to-be-verified support vector machine prediction model includes:
inputting the sample user information in the test set into the support vector machine prediction model to be verified, and acquiring a test behavior state corresponding to a test prediction value output by the support vector machine prediction model to be verified; obtaining a comparison result of the plurality of test behavior states and the corresponding real behavior states;
acquiring a first ratio of the number of consistent first sub-comparison results among the first comparison results to the total number of the first comparison results; a first comparison result is a comparison result whose real behavior state is the state of no set risk behavior;
acquiring a second ratio of the number of inconsistent second sub-comparison results among the second comparison results to the total number of the second comparison results; a second comparison result is a comparison result whose real behavior state is the set risk behavior state;
constructing a receiver operating characteristic curve according to the first ratios and the second ratios;
and obtaining the area of a polygon formed by the receiver operating characteristic curve and the abscissa axis, and if the area is larger than a preset area threshold, determining the support vector machine prediction model to be verified as a target support vector machine prediction model.
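The acceptance test above can be sketched in Python as follows. The sketch uses the conventional ROC axes, false positive rate and true positive rate; under that convention the false positive rate is one minus the first ratio and the true positive rate is one minus the second ratio, which is an interpretation rather than something the text states. The function names and the default `area_threshold` of 0.8 are hypothetical, and labels are assumed to be 1 for the set risk behavior and 0 otherwise, with at least one sample of each.

```python
def roc_points(scores, labels):
    """Sweep the decision threshold over every distinct score and return the
    resulting (false positive rate, true positive rate) points, starting at (0, 0)."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area between the piecewise-linear curve and the abscissa (trapezoid rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def accept(scores, labels, area_threshold=0.8):
    """Promote the model to be verified only if the area exceeds the threshold."""
    return auc(roc_points(scores, labels)) > area_threshold
```

Because the curve is a sequence of straight segments, the region under it is a polygon, which matches the wording of the step; the trapezoid sum is its exact area.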
In a second aspect, the present application provides a user behavior prediction apparatus, the apparatus comprising:
a receiving module, configured to respond to a service application request sent by a user to be predicted and acquire user information in the service application request; the user information comprises user information of multiple dimensions of the user to be predicted; the multiple dimensions represent multiple types of application conditions that the service corresponding to the service application request imposes on the user to be predicted;
the input module is used for inputting the user information of the multiple dimensions into a pre-trained target support vector machine prediction model, mapping the user information of the multiple dimensions from a low-dimensional space to a high-dimensional space through the target support vector machine prediction model, and carrying out nonlinear classification based on the user information in the high-dimensional space to obtain a predicted value output by the target support vector machine prediction model based on a classification result of the nonlinear classification; the number of dimensions of the high-dimensional space corresponds to the number of dimensions of the user information;
and the prediction module is used for determining the prediction result of the set risk behavior of the user to be predicted, which is associated with the service, according to the prediction value.
In a third aspect, the present application provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method described above when the processor executes the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method described above.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the method described above.
According to the user behavior prediction method, device, computer equipment, storage medium and computer program product, a service application request sent by a user to be predicted is received and the user information in the request is obtained; the user information of multiple dimensions is input into the pre-trained target support vector machine prediction model; through the target support vector machine prediction model, the user information of multiple dimensions is mapped from the low-dimensional space to the high-dimensional space corresponding to the dimensions of the user information and nonlinear classification is carried out; the predicted value output by the model based on the nonlinear classification result is obtained; and whether the user to be predicted will exhibit the set risk behavior associated with the service is determined according to the predicted value. Compared with traditional analysis based on decision trees and the like, the scheme classifies and predicts the user's information with a support vector machine prediction model, so that whether the user is likely to exhibit the set risk behavior can be determined from the predicted value output by the model, which improves the efficiency of user behavior prediction.
Drawings
FIG. 1 is a diagram of an application environment of a method for predicting user behavior in one embodiment;
FIG. 2 is a flow diagram that illustrates a methodology for user behavior prediction in one embodiment;
FIG. 3 is a schematic flow chart diagram of the predictive model training step in one embodiment;
FIG. 4 is a schematic flow chart of the predictive model training step in another embodiment;
FIG. 5 is a diagram illustrating the model accuracy verification step in one embodiment;
FIG. 6 is a block diagram of an apparatus for predicting user behavior in one embodiment;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The user behavior prediction method provided by the embodiments of the application relates to the technical field of big data and can be applied to the application environment shown in FIG. 1. The terminal 102 may communicate with the user side of a user. The user to be predicted may send a service application request to the terminal 102 through the user side. After receiving the request, the terminal 102 may obtain multi-dimensional user information of the user to be predicted from it, where the user information may be information authorized by the user. The terminal 102 may then input the multi-dimensional user information into a pre-trained target support vector machine prediction model and predict, from the output value of the model, whether the user will exhibit a set risk behavior associated with the service. In some embodiments, a server 104 is also included, and the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process; it may be integrated on the server 104 or located on the cloud or another network server. The terminal 102 may store the prediction result of the user to be predicted in the server 104. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smart phone, tablet computer, or portable wearable device. The portable wearable device may be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
In an embodiment, as shown in FIG. 2, a user behavior prediction method is provided. Taking application of the method to the terminal in FIG. 1 as an example, the method includes the following steps:
Step S202, responding to a service application request sent by a user to be predicted, and acquiring user information in the service application request; the user information comprises user information of multiple dimensions of the user to be predicted; the multiple dimensions represent multiple types of application conditions that the service corresponding to the service application request imposes on the user to be predicted.
The user to be predicted may be a user who needs to apply for a service, for example, a loan service. When a user applies for the service, the operating entity of the terminal 102 needs to predict whether that user will exhibit a set risk behavior associated with the service. The user can apply for a service, such as a loan service, through the user side: the user fills in the relevant user information on the application page to form a service application request, which is sent to the terminal 102 through the user side. The user information filled in by the user may be information that the user has authorized for use. The terminal 102 may receive the service application request sent by the user to be predicted and obtain the user information of the user to be predicted from it. The user information may include user information of multiple dimensions, and each dimension may represent an assessment of the user's strength from a different angle relevant to the service. For example, the service may be a loan service, and the terminal 102 may assess the user from several angles to predict whether the user will become overdue after the loan is granted.
Step S204, inputting multi-dimensional user information into a pre-trained target support vector machine prediction model, mapping the multi-dimensional user information from a low-dimensional space to a high-dimensional space through the target support vector machine prediction model, and carrying out nonlinear classification based on the user information in the high-dimensional space to obtain a predicted value output by the target support vector machine prediction model based on a classification result of the nonlinear classification; the number of dimensions of the high-dimensional space corresponds to the number of dimensions of the user information.
The user information may be the multi-dimensional user information sent to the terminal 102 when the user to be predicted applies for the service. The terminal 102 may input the multi-dimensional user information into a pre-trained target support vector machine prediction model, map the multi-dimensional user information from a low-dimensional space to a high-dimensional space corresponding to the user information dimensions through the target support vector machine prediction model, and perform nonlinear classification, so that the terminal 102 can obtain the predicted value output by the target support vector machine prediction model based on the nonlinear classification result. The predicted value relates to whether the user will exhibit a set risk behavior associated with the service.
The terminal 102 may pre-construct a support vector machine prediction model to be trained and train it with multi-dimensional sample user information. For example, in one embodiment, the terminal 102 may construct the support vector machine prediction model to be trained based on a C-type support vector classifier and a radial basis kernel function. In this embodiment, the terminal 102 selects a specific SVM type and SVM kernel type to construct the model. The SVM model calculates the empirical risk using a hinge loss function and adds a regularization term to the solving system to optimize the structural risk; it is a classifier with sparsity and robustness. The SVM model has two important parameters: the SVM type and the SVM kernel type. Because the C-type support vector classifier has the best anti-interference capability, the terminal 102 may use it for modeling; and because the radial basis kernel function has advantages such as being able to implement nonlinear mapping and having more parameters, the terminal 102 may use it as the kernel. Here, the C-type support vector classifier classifies n classes (n ≥ 2) and allows imperfect classification by means of an outlier penalty factor C. A radial basis function is a scalar function that is symmetric along the radial direction and is usually defined as a monotonic function of the Euclidean distance from any point x in space to a certain center x_c. The farther the point is from the center, the smaller the value of the function.
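A minimal Python sketch of the radial basis kernel and of how a trained kernel classifier turns an input into a predicted value is shown below. The support vectors, coefficients, bias and `gamma` are hypothetical placeholders, not values from the application. Squashing the raw decision value into (0, 1) with a logistic function is also an assumption: the text states only that the output lies in a set numerical range, not how it gets there.

```python
import math

def rbf_kernel(x, center, gamma=0.1):
    """Gaussian radial basis kernel: depends only on the Euclidean distance
    between x and the center, and shrinks as that distance grows."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-gamma * squared_distance)

def decision_value(x, support_vectors, dual_coefs, bias, gamma=0.1):
    """Kernel expansion sum_i coef_i * K(sv_i, x) + bias, where each coef_i
    already folds in the class label of its support vector."""
    return sum(c * rbf_kernel(x, sv, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

def predicted_value(x, support_vectors, dual_coefs, bias, gamma=0.1):
    """Map the raw decision value into (0, 1) with a logistic function
    (an assumption of this sketch, not a step stated in the application)."""
    return 1.0 / (1.0 + math.exp(-decision_value(x, support_vectors, dual_coefs, bias, gamma)))
```

With two hypothetical support vectors `[(0.0, 0.0), (2.0, 2.0)]`, coefficients `[-1.0, 1.0]` and zero bias, inputs near `(2.0, 2.0)` yield values above 0.5 and inputs near `(0.0, 0.0)` yield values below 0.5. The kernel is never asked for the high-dimensional coordinates themselves, only for similarities between points.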
And step S206, determining a prediction result of the set risk behavior associated with the service of the user to be predicted according to the prediction value.
The predicted value may be a numerical value output by the target support vector machine prediction model, and the terminal 102 may determine, according to this value, the prediction result of the set risk behavior associated with the service for the user to be predicted. For example, in one embodiment, determining the prediction result of the set risk behavior of the user to be predicted according to the predicted value includes: if the predicted value is smaller than or equal to a prediction threshold, predicting that the user to be predicted will not exhibit the set risk behavior; if the predicted value is larger than the prediction threshold, predicting that the user to be predicted will exhibit the set risk behavior; wherein the prediction threshold is within the set numerical range. In this embodiment, the result output by the target support vector machine prediction model based on the classification result of the nonlinear classification lies within a set numerical range. When analyzing the predicted value, the terminal 102 may determine from its magnitude whether the user will exhibit the set risk behavior. The terminal 102 may first set a prediction threshold within the set numerical range. The terminal 102 may then check whether the predicted value is smaller than or equal to the prediction threshold; if so, the terminal 102 predicts that the user to be predicted will not exhibit the set risk behavior. If the terminal 102 detects that the predicted value is larger than the prediction threshold, it predicts that the user to be predicted may exhibit the set risk behavior.
The user information may include user information of multiple dimensions, and the set numerical range may be [0, 1]. Taking a loan service as an example, when the user applies for a loan, the terminal 102 needs to predict whether the user will exhibit overdue behavior. The terminal 102 may input the user information of multiple dimensions into the target support vector machine prediction model. For example, the terminal 102 constructs 15 variables into a multidimensional variable, substitutes it into the SVM model of the C-type support vector classifier, maps the coordinates from the two-dimensional space into the multidimensional space, and classifies them there. Suppose the coordinates of the original data are (X_a, X_b, X_c, X_d, X_e, X_f, X_g, X_h, X_i, X_j, X_k, X_l, X_m, X_n); they are mapped to a new multidimensional space and classified. The output of the target support vector machine prediction model lies in [0, 1]: if the output value is less than 0.5, the user is judged not to be a user whose loan will become overdue, and if the output value is greater than 0.5, the user is judged to be a user whose loan will become overdue. The terminal 102 can thus determine whether the user applying for the loan is likely to become overdue and, based on the prediction, whether to grant the loan.
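The threshold rule can be written as a small Python function. The function name and the range check are illustrative additions; the 0.5 default and the [0, 1] range come from the loan example, and a value exactly equal to the threshold is treated as "no set risk behavior", following the "smaller than or equal to" wording of the embodiment.

```python
def predict_risk(predicted_value: float,
                 threshold: float = 0.5,
                 value_range: tuple[float, float] = (0.0, 1.0)) -> bool:
    """Return True if the user is predicted to exhibit the set risk behavior."""
    low, high = value_range
    if not low <= threshold <= high:
        raise ValueError("prediction threshold must lie within the set numerical range")
    if not low <= predicted_value <= high:
        raise ValueError("predicted value must lie within the set numerical range")
    # Values at or below the threshold mean no set risk behavior is predicted.
    return predicted_value > threshold
```

In the loan scenario, a `True` result corresponds to a user predicted to become overdue, which the terminal could use when deciding whether to grant the loan.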
According to the user behavior prediction method, a service application request sent by a user to be predicted is received and the user information in the request is obtained; the user information of multiple dimensions is input into a pre-trained target support vector machine prediction model; through the target support vector machine prediction model, the user information of multiple dimensions is mapped from a low-dimensional space to a high-dimensional space corresponding to the dimensions of the user information and nonlinear classification is carried out; the predicted value output by the model based on the nonlinear classification result is obtained; and whether the user to be predicted will exhibit the set risk behavior associated with the service is determined according to the predicted value. Compared with traditional analysis based on decision trees and the like, the scheme classifies and predicts the user's information with a support vector machine prediction model, so that whether the user is likely to exhibit the set risk behavior can be determined from the predicted value output by the model, which improves the efficiency of user behavior prediction.
In one embodiment, in response to a service application request sent by a user to be predicted, acquiring user information in the service application request, including: and receiving a service application request sent by a user to be predicted, and acquiring associated asset information, associated service information and position information of the user to be predicted in the service application request as user information of the user to be predicted.
In this embodiment, the terminal 102 may obtain a service application request sent by a user to be predicted, and obtain multi-dimensional user information of the user to be predicted from the service application request. For example, after receiving a service application request sent by a user to be predicted, the terminal 102 may obtain associated asset information, associated service information, and location information of the user to be predicted in the service application request, as user information of the user to be predicted.
Specifically, when the service is a loan, for example, the user may send the user information of each dimension to the terminal 102 through the user side when applying for the loan. The terminal 102 may obtain the employment age, the borrower's annual income level, the borrower's annual income, and the borrower's house ownership status entered by the user as the associated asset information; these dimensions reflect the user's economic strength. The terminal 102 may further obtain, as the associated service information, the loan amount, repayment period, loan interest payment level, loan interest level, loan purpose, the borrower's total debt and liability ratio, the total principal already repaid by the borrower, the amount recovered after the borrower's default, and the payment amount in each installment period. Of these, the loan amount, repayment period, loan interest payment level, and loan interest level reflect the loan itself, while the loan purpose, total debt and liability ratio, total repaid principal, amount recovered after default, and per-period payment amount reflect the relationship between the borrower and the loan. The terminal 102 may further obtain the region where the borrower is located, entered by the user, as the location information; this dimension reflects the user's region. The terminal 102 thus covers the factors influencing loan overdue from four aspects (economic strength, loan information, the relationship between borrower and loan, and region), and a model trained on these dimensions has relatively high precision.
The user information of each dimension is obtained only after the user has been informed and has authorized its acquisition.
Through this embodiment, the terminal 102 can acquire user information whose dimensions include the associated asset information, associated service information, location information and the like, so that the terminal 102 can predict the occurrence of the service risk behavior based on user information of these dimensions, which improves the efficiency of user behavior prediction.
In one embodiment, before inputting the user information of multiple dimensions into the pre-trained target support vector machine prediction model, the method further includes: preprocessing the user information of multiple dimensions, and converting the user information of multiple dimensions into numerical user information of multiple dimensions; and inputting the numerical user information into a pre-trained target support vector machine prediction model.
In this embodiment, since the target support vector machine prediction model has a format requirement on input data, after acquiring the user information of the user to be predicted in multiple dimensions, the terminal 102 needs to preprocess the user information and convert it into data conforming to the input format of the target support vector machine prediction model. For example, the terminal 102 may preprocess the user information of multiple dimensions and convert it into numerical user information of multiple dimensions, and then input the converted numerical user information into the pre-trained target support vector machine prediction model. The target support vector machine prediction model may then make predictions based on the input numerical user information.
Through the embodiment, the terminal 102 can convert the multi-dimensional user information into the information of the numerical type conforming to the input format of the target support vector machine prediction model, so that the prediction efficiency of whether the set risk behavior occurs to the user to be predicted is improved.
In one embodiment, converting the multi-dimensional user information into the multi-dimensional numerical user information includes: determining a first tag numerical value corresponding to the associated asset information according to the asset quantity of the associated asset information; the size of the first label numerical value is positively correlated with the quantity of the assets; determining a second label numerical value corresponding to the associated service information according to the risk level to which the associated service information belongs; the magnitude of the second label value is positively correlated with the risk level; determining a third label numerical value corresponding to the position information according to the regional economic index corresponding to the position information; the size of the third label numerical value is positively correlated with the regional economic index; and obtaining the numerical user information of multiple dimensions according to at least two of the first label numerical value, the second label numerical value and the third label numerical value.
In this embodiment, the terminal 102 may convert the user information of multiple dimensions into corresponding numerical user information based on the dimension to which the user information belongs and the attributes of the user information itself. For the associated asset information, the terminal 102 may determine a first tag value corresponding to the associated asset information according to the asset quantity of the associated asset information, where the size of the first tag value is positively correlated with the asset quantity. The terminal 102 can then obtain the numerical user information of multiple dimensions according to at least two of the first tag value, the second tag value and the third tag value. For example, for the associated asset information, the house ownership status of the borrower is divided into three cases, namely private, lease and mortgage, and the terminal 102 may convert private into 0, lease into 1 and mortgage into 2; the annual income level of the borrower is divided into three cases, namely low income, medium income and high income, and the terminal 102 may convert low income into 0, medium income into 1 and high income into 2.
For the associated service information, the terminal 102 may determine a second tag value corresponding to the associated service information according to the risk level to which the associated service information belongs, where the size of the second tag value is positively correlated with the risk level. For example, taking the loan service as an example, the repayment term is divided into two cases, a three-year term and a five-year term, and the terminal 102 may convert the three-year term into 0 and the five-year term into 1. The loan purpose falls into 14 cases: credit card repayment, car purchase, debt repayment, education, home improvement, house purchase, major purchase, medical expenditure, travel, other, renewable energy, small business, vacation, and wedding. The terminal 102 may convert credit card repayment into 0, car purchase into 1, debt repayment into 2, education into 3, home improvement into 4, house purchase into 5, major purchase into 6, medical expenditure into 7, travel into 8, other into 9, renewable energy into 10, small business into 11, vacation into 12, and wedding into 13. The loan interest payment level is divided into two cases, low interest and high interest, and the terminal 102 may convert low interest into 1 and high interest into 2. The loan interest level is divided into seven grades, from A to G, and the terminal 102 may convert A into 0, B into 1, C into 2, D into 3, E into 4, F into 5, and G into 6.
For the location information, the terminal 102 may determine a third tag value corresponding to the location information according to the regional economic indicator corresponding to the location information, where the size of the third tag value is positively correlated with the regional economic indicator. For example, for the location information of the user, taking the overdue loan data set of the peer-to-peer lending bank of Ireland as an example, the region is divided into five places, cannught (Connacht, Ireland), leinster (Leinster, Ireland), munster (Munster, Ireland), Northern-Irl (Northern Ireland), and ulster (Ulster, Ireland), and the terminal 102 may convert cannught into 0, leinster into 1, munster into 2, Northern-Irl into 3, and ulster into 4.
Therefore, the terminal 102 may input each of the first tag value, the second tag value, and the third tag value into the target support vector machine prediction model to obtain the predicted value output by the target support vector machine prediction model. For example, in the case of the loan service, the predicted value obtained by the terminal 102 represents one of two loan statuses: a non-overdue user or an overdue user. The terminal 102 may represent a non-overdue user as 0 and an overdue user as 1.
Through the embodiment, the terminal 102 can convert the user information into corresponding numerical value user information based on the difference between the dimensionality and the attribute of the user information, and the prediction is carried out through the target support vector machine prediction model, so that the prediction efficiency of user behavior prediction is improved.
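The example conversions in this embodiment amount to a lookup from category names to integer codes. The following is a minimal sketch under stated assumptions: the field names (`home_ownership`, `income_level`, `term`, `interest_payment`, `loan_grade`, `region`) and the function `encode_user_info` are hypothetical and do not appear in the source; only the code values are taken from the examples above. Fields without a lookup table are assumed to be numeric already.

```python
# Hypothetical lookup tables; the integer codes follow the examples in the text.
CATEGORY_CODES = {
    "home_ownership": {"private": 0, "lease": 1, "mortgage": 2},
    "income_level": {"low": 0, "medium": 1, "high": 2},
    "term": {"three-year": 0, "five-year": 1},
    "interest_payment": {"low": 1, "high": 2},
    "loan_grade": {grade: code for code, grade in enumerate("ABCDEFG")},
    "region": {"cannught": 0, "leinster": 1, "munster": 2,
               "Northern-Irl": 3, "ulster": 4},
}


def encode_user_info(record):
    """Convert one user record (dict) into numerical user information.

    Categorical fields are replaced by their integer code; every other
    field is assumed to hold a number and is converted to float.
    An unknown category raises KeyError rather than being guessed.
    """
    encoded = {}
    for field, value in record.items():
        codes = CATEGORY_CODES.get(field)
        if codes is None:
            encoded[field] = float(value)
        else:
            encoded[field] = codes[value]
    return encoded


if __name__ == "__main__":
    sample = {"home_ownership": "lease", "income_level": "low",
              "loan_grade": "B", "region": "munster", "annual_income": 24000}
    print(encode_user_info(sample))
```

Keeping the tables in one dictionary means a new categorical dimension can be added without touching the conversion function.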
In one embodiment, the target support vector machine prediction model is trained by: obtaining a plurality of sample user information, and dividing the plurality of sample user information into a training set and a test set; the sample user information comprises sample user information of multiple dimensions and corresponding real behavior states; inputting multi-dimensional sample user information in a training set into a support vector machine prediction model to be trained, mapping the multi-dimensional sample user information to a multi-dimensional space through the support vector machine prediction model to be trained, and outputting a sample prediction value based on a classification result of the multi-dimensional space; circularly executing the step of inputting the multi-dimensional data coordinates into the support vector machine prediction model to be trained until the set risk behavior state of the sample corresponding to the sample prediction value is consistent with the real behavior state, and taking the current support vector machine prediction model to be trained as the support vector machine prediction model to be verified; and inputting the sample user information in the test set into the prediction model of the support vector machine to be verified, and obtaining the prediction model of the target support vector machine according to the output result of the prediction model of the support vector machine to be verified.
In this embodiment, the terminal 102 may train the support vector machine prediction model to be trained in advance to obtain the target support vector machine prediction model. Before training, the terminal 102 may obtain a plurality of sample user information and divide it into a training set and a test set. The sample user information includes sample user information of multiple dimensions and the corresponding real behavior state, where the real behavior state indicates whether the user corresponding to the sample user information actually exhibited the set risk behavior associated with the service. For example, as shown in FIG. 3, FIG. 3 is a flow chart illustrating the training steps of the prediction model in one embodiment. Taking the loan service as an example, the terminal 102 may adopt the overdue loan data set of the peer-to-peer lending bank of Ireland, first preprocess the data set, divide it into a training set and a test set at a ratio of 4:1, then build an SVM model and calculate the model accuracy by methods such as the confusion matrix and the AUC. The overdue loan data set of the peer-to-peer lending bank of Ireland includes the length of employment, the house ownership status of the borrower, the annual income level of the borrower, the annual income of the borrower, the loan amount, the repayment term, the loan purpose, the loan interest payment level, the loan interest level, the total debt ratio of the borrower, the total amount of interest paid by the borrower, the principal paid by the borrower, the amount recovered after default of the borrower, the payment amount of each installment, the region where the borrower is located, and the loan status.
The information contained in the data set includes employment information, user information, loan information and the like, and covers essentially all factors influencing the loan status, so predicting whether a bank user's loan will be overdue from this data set achieves relatively high precision.
The terminal 102 may input the multi-dimensional sample user information in the training set into the prediction model of the support vector machine to be trained, map the multi-dimensional sample user information to the multidimensional space through the prediction model of the support vector machine to be trained, and output a sample prediction value based on the classification result of the multidimensional space. The terminal 102 may execute the above training steps in a loop, and each loop may output a corresponding sample prediction value until the terminal 102 detects that a sample set risk behavior state corresponding to the sample prediction value is consistent with the real behavior state, and the terminal 102 may use the current support vector machine prediction model to be trained as the support vector machine prediction model to be verified. The terminal 102 may further input the sample user information in the test set into the prediction model of the support vector machine to be verified, obtain a plurality of prediction results output by the prediction model of the support vector machine to be verified based on the test set data, and the terminal 102 may perform relevant verification on the output results of the prediction model of the support vector machine to be verified, and determine the prediction model of the support vector machine to be verified as the target prediction model when the verification is passed.
For example, to verify the effect of the model, the terminal 102 may divide the data set into a training set and a test set at a ratio of 4:1. The purpose of dividing the data set is to first train the model on the training set, then apply the trained model to the test set and calculate the precision and AUC value of the model, so as to judge the quality of the model. If the calculated model precision is more than 80% and the AUC value is more than 70%, the model is preliminarily judged to have use value. The AUC (Area Under the Curve) is defined as the area enclosed by the ROC (receiver operating characteristic) curve and the coordinate axis; this area is not greater than 1, and since the ROC curve generally lies above the line y = x, the value of the AUC ranges from 0.5 to 1. The closer the AUC is to 1, the more reliable the detection method; when the AUC equals 0.5, the reliability is lowest and the method has no application value. The SVM model, also called a support vector machine, is regarded as one of the most robust and accurate methods among known data mining algorithms; it is a binary classification algorithm and supports both linear and nonlinear classification. The support vector machine solves the two-class problem by finding a separating hyperplane in the sample space that separates samples of different classes while maximizing the minimum distance from the two point sets to the plane, that is, the distance from the edge points of the two point sets to the plane.
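The 4:1 division and the preliminary acceptance rule described above can be sketched as follows. This is an illustration under assumptions, not the applicant's implementation: the source does not say how samples are assigned to each set, so the random shuffle and its `seed` are assumptions, and the function names `split_4_to_1` and `has_use_value` are hypothetical.

```python
import random


def split_4_to_1(samples, seed=0):
    """Shuffle a copy of the samples and split it 4:1 into (train, test).

    Shuffling with a fixed seed is an assumption made for reproducibility.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5  # first four fifths go to the training set
    return shuffled[:cut], shuffled[cut:]


def has_use_value(precision, auc, min_precision=0.80, min_auc=0.70):
    """Preliminary check from the text: precision > 80% and AUC > 70%."""
    return precision > min_precision and auc > min_auc


if __name__ == "__main__":
    train, test = split_4_to_1(range(10))
    print(len(train), len(test))
    print(has_use_value(0.9675, 0.8984))
```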
Because the data in this scheme is multidimensional, a nonlinear support vector machine is adopted to predict whether the loan user will be overdue. For low-dimensional data, linear separability can be checked visually; for high-dimensional data, it is difficult to judge linear separability, and forcing a linear support vector machine onto linearly inseparable data leads to poor prediction, so high-dimensional data is generally modeled with a nonlinear support vector machine. Taking the loan service as an example, the terminal 102 may select the loan status as the dependent variable, and the length of employment, the housing ownership status of the borrower, the annual income level of the borrower, the annual income of the borrower, the loan amount, the repayment term, the loan purpose, the loan interest payment level, the loan interest level, the total debt ratio of the borrower, the total amount of interest paid by the borrower, the amount recovered after default of the borrower, the payment amount of each installment, and the region where the borrower is located as independent variables. The training set is used as the data set for training the model, a C-type support vector classifier is selected as the type of the SVM model, a radial basis kernel function is selected as the kernel type of the SVM, and the corresponding SVM model is established as the support vector machine prediction model to be trained.
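To make the role of the radial basis kernel concrete, the sketch below evaluates the kernel and a kernel-based decision value in plain Python. It uses the standard textbook form K(x, z) = exp(-gamma * ||x - z||^2) and f(x) = sum_i c_i K(s_i, x) + b, which is an assumption about the intended model rather than something stated in the source; `gamma`, the support vectors, the coefficients and the bias are all made-up illustration values, not trained parameters.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Radial basis kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, coeffs, bias, gamma=0.5):
    """Kernel SVM decision value f(x) = sum_i coeff_i * K(sv_i, x) + bias.

    Each coeff_i stands for the product of a multiplier and a class
    label (+1 or -1); the sign of f(x) gives the predicted class.
    """
    total = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, coeffs))
    return total + bias


if __name__ == "__main__":
    # Illustrative values only: one support vector per class.
    svs = [[0.0, 0.0], [2.0, 2.0]]
    coeffs = [1.0, -1.0]
    print(decision_value([0.1, 0.1], svs, coeffs, bias=0.0))
    print(decision_value([1.9, 2.0], svs, coeffs, bias=0.0))
```

A point near the first support vector receives a positive value and a point near the second a negative one, which is how the kernel yields a nonlinear boundary in the original input space.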
In the training process, the support vector machine prediction model to be trained needs to map input multi-dimensional user information into a multi-dimensional space, that is, the terminal 102 may map each coordinate in the high-dimensional data from a low-dimensional space to a high-dimensional space in the support vector machine prediction model, so as to obtain an equation corresponding to each dimension of the high-dimensional data in the high-dimensional space, and an output of each equation may be between [0,1], and the terminal 102 may perform nonlinear classification on the outputs. The mapping formula can be as follows:
Figure BDA0003506670270000141
where $0 \le \lambda_i \le C$, and $\Phi(x_i)$ denotes the mapping of the low-dimensional coordinates of a data point to high-dimensional coordinates, so that a classification hyperplane can be constructed to realize classification; before the terminal 102 predicts on the data, the coordinates in the two-dimensional space are mapped to the multi-dimensional space and then classified. For example, taking a circle as an example, in two-dimensional space the equation of the circle is $a_0 + a_1 X_1 + a_2 X_1^2 + a_3 X_2 + a_4 X_2^2 + a_5 X_1 X_2 = 0$. To map the data from the two-dimensional space to a five-dimensional space, the five-dimensional space is established and its coordinate axes are defined as $A_1 = X_1$, $A_2 = X_1^2$, $A_3 = X_2$, $A_4 = X_2^2$, $A_5 = X_1 X_2$. In the new five-dimensional space, the equation of the circle becomes $a_0 + a_1 A_1 + a_2 A_2 + a_3 A_3 + a_4 A_4 + a_5 A_5 = 0$, which is a hyperplane equation. When the coordinates $(X_i, X_j)$ of the original data are likewise mapped into the new five-dimensional space, the data become linearly separable. After the terminal 102 completes model training through this classification, the support vector machine prediction model to be verified can be verified through the confusion matrix and the AUC value.
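The circle example can be checked numerically. The sketch below maps a two-dimensional point to the five coordinates (X1, X1^2, X2, X2^2, X1*X2) and classifies it with a linear function in that space. The unit circle and the resulting weights (a0 = -1, a2 = a4 = 1, all others 0) are chosen here purely for illustration, and the function names are hypothetical.

```python
def feature_map(x1, x2):
    """Map a 2-D point to the five axes (X1, X1^2, X2, X2^2, X1*X2)."""
    return (x1, x1 ** 2, x2, x2 ** 2, x1 * x2)


def linear_score(features, weights, bias):
    """Evaluate a0 + a1*A1 + ... + a5*A5 in the five-dimensional space."""
    return bias + sum(w * f for w, f in zip(weights, features))


# Unit circle X1^2 + X2^2 = 1 written as a hyperplane: -1 + A2 + A4 = 0.
WEIGHTS = (0.0, 1.0, 0.0, 1.0, 0.0)
BIAS = -1.0


def inside_unit_circle(x1, x2):
    """A point is inside the circle when the linear score is negative."""
    return linear_score(feature_map(x1, x2), WEIGHTS, BIAS) < 0


if __name__ == "__main__":
    for point in [(0.1, 0.2), (0.5, -0.5), (2.0, 0.0), (-1.0, 1.0)]:
        print(point, inside_unit_circle(*point))
```

No straight line in the original plane separates the inside of a circle from the outside, but a single linear inequality does so after the mapping.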
By the embodiment, the terminal 102 can implement training of the prediction model by low-dimensional to high-dimensional mapping and classification based on the multi-dimensional sample user information, so that the prediction efficiency of predicting whether the set risk behavior of the user occurs is improved.
In one embodiment, inputting the sample user information in the test set into the prediction model of the support vector machine to be verified, and obtaining the prediction model of the target support vector machine according to the output result of the prediction model of the support vector machine to be verified, includes: inputting the sample user information in the test set into a support vector machine prediction model to be verified, and acquiring a test behavior state corresponding to a test prediction value output by the support vector machine prediction model to be verified; obtaining comparison results of a plurality of test behavior states and corresponding real behavior states; acquiring a first ratio of the number of the first sub-comparison results which are consistent in the first comparison results to the number of the first comparison results; the first comparison result represents that the real behavior state is a comparison result of a behavior state without set risk; acquiring a second ratio of the number of inconsistent second sub-comparison results in the second comparison result to the number of the second comparison results; the second comparison result represents that the real behavior state is a comparison result of a set risk behavior state; constructing a receiver operating characteristic curve according to the first ratios and the second ratios; and obtaining the area of a polygon formed by the receiver operating characteristic curve and the abscissa axis, and if the area is larger than a preset area threshold, determining the support vector machine prediction model to be verified as a target support vector machine prediction model.
In this embodiment, the terminal 102 may verify the support vector machine prediction model to be verified that was obtained through training on the training set. The terminal 102 may input the sample user information in the divided test set into the support vector machine prediction model to be verified and obtain the test behavior state corresponding to each test prediction value output by the model; the terminal 102 may thus obtain a plurality of test behavior states, compare them with the corresponding real behavior states, and obtain a plurality of comparison results. The terminal 102 may obtain a first ratio of the number of consistent first sub-comparison results among the first comparison results to the number of the first comparison results, where a first comparison result is a comparison result whose real behavior state is the state without the set risk behavior. The terminal 102 may further obtain a second ratio of the number of inconsistent second sub-comparison results among the second comparison results to the number of the second comparison results, where a second comparison result is a comparison result whose real behavior state is the state with the set risk behavior. The terminal 102 may construct a receiver operating characteristic curve, that is, an ROC curve, according to the first ratios and the second ratios, obtain the area of the polygon formed by the receiver operating characteristic curve and the abscissa axis, and determine the support vector machine prediction model to be verified as the target support vector machine prediction model if the area is greater than a preset area threshold.
The confusion matrix, also called an error matrix, is a standard format for expressing accuracy evaluation and takes the form of a matrix with n rows and n columns. Specific evaluation indexes include overall accuracy, producer's accuracy, user's accuracy and the like, which reflect the accuracy of image classification from different aspects. In artificial intelligence, the confusion matrix is a visualization tool used particularly in supervised learning; it makes the precision of the model clear and makes the model more convincing. Taking the loan service as an example, the terminal 102 may feed the test set data into the model trained on the training set for prediction, compare the predictions with whether the bank users were actually overdue, and establish a confusion matrix, as shown in the following table:
Whether the bank user is overdue    0       1
0                                   1687    1
1                                   63      248
From the table, the model accuracy of the support vector machine prediction model to be verified is 96.75%. The accuracy of correctly predicting overdue clients reaches 79.74%, the accuracy of correctly predicting non-overdue clients is 99.94%, and the overall accuracy is high.
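The figures quoted from the table can be recomputed from its four cells. The sketch below assumes that each row of the table corresponds to one class (0 for non-overdue, 1 for overdue) and each column to the other axis of the comparison; the source does not state which axis is the actual state and which is the prediction, so that orientation is an assumption, as are the function names.

```python
# Cells exactly as printed in the table: row 0 = [1687, 1], row 1 = [63, 248].
MATRIX = [[1687, 1], [63, 248]]


def overall_accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_rate(matrix, cls):
    """Share of row `cls` that lies on the diagonal."""
    return matrix[cls][cls] / sum(matrix[cls])


if __name__ == "__main__":
    print(f"overall:     {overall_accuracy(MATRIX):.4%}")
    print(f"non-overdue: {per_class_rate(MATRIX, 0):.4%}")
    print(f"overdue:     {per_class_rate(MATRIX, 1):.4%}")
```

The two per-row rates reproduce the quoted 99.94% and 79.74%. With the cells as printed, the overall figure evaluates to roughly 96.80%, close to but not identical with the quoted 96.75%.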
The terminal 102 can also verify the support vector machine prediction model to be verified through the AUC value. If the AUC value is more than 0.7, the model has a good effect and practical significance. As shown in FIG. 4, FIG. 4 is a schematic diagram of a model accuracy verification step in an embodiment. The terminal 102 may set up a coordinate system with TPR as the ordinate and FPR as the abscissa, plot the ROC curve, and obtain the AUC value from the curve. Specifically, TP is a true positive, i.e. the predicted value is the same as the true value and both are positive. FP is a false positive, i.e. the predicted value is opposite to the true value and the predicted value is positive (P). FN is a false negative, i.e. the predicted value is opposite to the true value and the predicted value is negative (N). TN is a true negative, i.e. the predicted value is the same as the true value and both are negative. This leads to the concepts of the true positive rate (TPR) and the false positive rate (FPR). TPR is the proportion of positive samples classified as positive: $\mathrm{TPR} = \mathrm{TP} / (\mathrm{TP} + \mathrm{FN})$. FPR is the proportion of negative samples classified as positive: $\mathrm{FPR} = \mathrm{FP} / (\mathrm{FP} + \mathrm{TN})$. As can be seen from FIG. 4, the AUC is 0.8984, which is greater than 0.7, indicating that the above support vector machine prediction model to be verified has a good effect and can be used as the target support vector machine prediction model.
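The TPR and FPR definitions above lead directly to an ROC curve and its area. The following sketch sweeps a decision threshold over a list of scores, records one (FPR, TPR) point per threshold, and integrates with the trapezoidal rule. The labels and scores are toy values invented for illustration, the function names are hypothetical, and the threshold sweep is one common way to build the curve rather than a procedure taken from the source.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), starting from (0, 0).

    labels: 1 for a positive sample, 0 for a negative sample.
    scores: higher score means more likely positive.
    Requires at least one positive and one negative sample.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))  # (FPR, TPR)
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    curve = roc_points(toy_labels, toy_scores)
    print(curve)
    print(auc(curve))
```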
Through the embodiment, the terminal 102 can verify the support vector machine prediction model in an AUC value mode, so that the prediction efficiency of user behavior prediction can be improved.
In one embodiment, as shown in FIG. 5, FIG. 5 is a flow chart illustrating the training step of the predictive model in another embodiment. Taking a bank loan as an example of the service, the terminal 102 may predict whether a user will be overdue on a loan based on the user information of different customers. Consider two customers. For the first customer, the length of employment, the housing ownership status of the borrower, the annual income level of the borrower, the annual income of the borrower, the loan amount, the repayment term, the loan purpose, the loan interest payment level, the loan interest level, the total debt ratio of the borrower, the total amount of interest paid by the borrower, the principal paid by the borrower, the amount recovered after default of the borrower, the payment amount of each installment, and the region where the borrower is located are respectively 10 years, lease, low, 24000, 5000, 36 months, credit card, low, B, 27.65, 5861.071, 5000, 0, and Munster province of Ireland, and the obtained result is 0, i.e. a non-overdue user. For the second customer, the same items are respectively 0.5 year, lease, low, 30000, 2500, 60 months, car purchase, high, C, 1, 1008.71, 456.46, 117.08, and the ethical province of Ireland, and the obtained result is 1, i.e. an overdue user.
Through the embodiment, the terminal 102 can predict whether the user will be overdue on a loan through the target support vector machine prediction model, so that the prediction efficiency of user behavior prediction is improved. In addition, the terminal 102 adopts the overdue loan data set of the peer-to-peer lending bank of Ireland, first preprocesses the data set, builds an SVM model on the training set, and uses the test set to verify the model accuracy. The terminal 102 can also obtain by calculation an AUC value of 0.8984, which is greater than 0.7, so the model precision is high and the model has practical significance. Through the target SVM prediction model, the terminal 102 can accurately judge whether a new user will exhibit overdue behavior in the future. For users who may exhibit overdue behavior, fund risk is avoided by reducing the loan amount. For users without overdue behavior, the total loan amount of the bank is increased by appropriately raising the loan amount, thereby increasing the bank's profit.
It should be understood that, although the steps in the flowcharts of the embodiments described above are displayed sequentially as indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited to the order shown, and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the embodiments described above may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the application also provides a user behavior prediction device for implementing the user behavior prediction method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the above method, so the specific limitations in one or more embodiments of the user behavior prediction device provided below may refer to the limitations on the user behavior prediction method in the above, and details are not repeated here.
In one embodiment, as shown in fig. 6, there is provided a user behavior prediction apparatus including: a receiving module 500, an input module 502, and a prediction module 504, wherein:
The receiving module 500 is configured to respond to a service application request sent by a user to be predicted and acquire the user information in the service application request; the user information includes user information of multiple dimensions of the user to be predicted; the multiple dimensions represent multiple types of application conditions of the user to be predicted for the service corresponding to the service application request.
The input module 502 is configured to input multi-dimensional user information into a pre-trained target support vector machine prediction model, map the multi-dimensional user information from a low-dimensional space to a high-dimensional space through the target support vector machine prediction model, perform nonlinear classification based on the user information in the high-dimensional space, and obtain a prediction value output by the target support vector machine prediction model based on a classification result of the nonlinear classification; the number of dimensions of the high-dimensional space corresponds to the number of dimensions of the user information.
The prediction module 504 is configured to determine, according to the prediction value, a prediction result of the set risk behavior associated with the service for the user to be predicted.
In an embodiment, the receiving module 500 is specifically configured to receive a service application request sent by a user to be predicted, and obtain associated asset information, associated service information, and location information of the user to be predicted in the service application request, as user information of the user to be predicted.
In one embodiment, the above apparatus further comprises: the preprocessing module is used for preprocessing the user information with multiple dimensions and converting the user information with multiple dimensions into numerical user information with multiple dimensions; and inputting the numerical user information into a pre-trained target support vector machine prediction model.
In an embodiment, the preprocessing module is specifically configured to determine, according to the asset quantity of the associated asset information, a first tag numerical value corresponding to the associated asset information; the size of the first label numerical value is positively correlated with the quantity of the assets; determining a second label numerical value corresponding to the associated service information according to the risk level to which the associated service information belongs; the magnitude of the second label value is positively correlated with the risk level; determining a third label numerical value corresponding to the position information according to the regional economic index corresponding to the position information; the size of the third label numerical value is positively correlated with the regional economic index; and obtaining numerical user information of multiple dimensions according to at least two of the first label numerical value, the second label numerical value and the third label numerical value.
In an embodiment, the prediction module 504 is specifically configured to: predict that the user to be predicted will not exhibit the set risk behavior if the predicted value is less than or equal to a prediction threshold; and predict that the user to be predicted will exhibit the set risk behavior if the predicted value is greater than the prediction threshold; wherein the prediction threshold lies within a set numerical range.
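The threshold rule above can be sketched in a few lines. The numerical range `(0.0, 1.0)`, the default threshold `0.5`, and the function name `predict_risk_behavior` are assumptions made for illustration; the description only states that the threshold lies within a set numerical range.

```python
def predict_risk_behavior(predicted_value: float,
                          threshold: float = 0.5,
                          value_range: tuple[float, float] = (0.0, 1.0)) -> bool:
    """Return True if the set risk behavior is predicted to occur.

    threshold and value_range are hypothetical defaults.
    """
    low, high = value_range
    if not low <= threshold <= high:
        raise ValueError("prediction threshold must lie within the set range")
    # A value equal to the threshold is treated as "no set risk behavior".
    return predicted_value > threshold
```

Note that the boundary case is assigned to the "no set risk behavior" side, which matches the "less than or equal to" wording of the embodiment.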
In one embodiment, the above apparatus further comprises a training module, configured to: obtain a plurality of pieces of sample user information and divide them into a training set and a test set, where each piece of sample user information includes sample user information of multiple dimensions and a corresponding real behavior state; input the multi-dimensional sample user information in the training set into a support vector machine prediction model to be trained, map the multi-dimensional sample user information to a multi-dimensional space through the support vector machine prediction model to be trained, and output a sample prediction value based on a classification result in the multi-dimensional space; repeat the step of inputting the multi-dimensional data coordinates into the support vector machine prediction model to be trained until the set risk behavior state of the sample corresponding to the sample prediction value is consistent with the real behavior state, and then take the current support vector machine prediction model to be trained as the support vector machine prediction model to be verified; and input the sample user information in the test set into the support vector machine prediction model to be verified, and obtain the target support vector machine prediction model according to the output result of the support vector machine prediction model to be verified.
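As a minimal sketch of the first training step only, the division of sample user information into a training set and a test set could look like the following. The split fraction, the seed, and the name `split_samples` are assumptions; the description does not state how the division is performed, and the fitting of the support vector machine and its stopping condition are omitted here.

```python
import random


def split_samples(samples, test_fraction=0.3, seed=0):
    """Shuffle the samples and split them into (training_set, test_set).

    Each sample is assumed to be a (features, real_behavior_state) pair.
    test_fraction and seed are hypothetical parameters.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```

Keeping the test set disjoint from the training set is what allows the later verification step to measure how the model behaves on samples it has not seen.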
In one embodiment, the above apparatus further comprises a construction module, configured to construct the support vector machine prediction model to be trained based on a C-type support vector classifier (C-SVC) and a radial basis function (RBF) kernel.
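To make the role of the radial basis function kernel concrete, the sketch below evaluates the standard RBF kernel, K(x, z) = exp(-gamma * ||x - z||^2), and the usual kernel decision value of a trained support vector classifier, f(x) = sum_i c_i * K(s_i, x) + b. The parameter `gamma`, the function names, and the example support vectors and coefficients are assumptions for illustration; they are not values from the present application, and the training that would produce them is not shown.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """Kernel decision value f(x) = sum_i c_i * K(s_i, x) + b.

    dual_coefs holds c_i, the product of each support vector's multiplier
    and its class label; all inputs here are hypothetical.
    """
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias
```

The kernel is evaluated directly on the original feature vectors, so the higher-dimensional mapping is never constructed explicitly. In a C-type classifier, the penalty parameter C bounds the multipliers during training and thereby trades margin width against training errors.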
In an embodiment, the training module is specifically configured to: input the sample user information in the test set into the support vector machine prediction model to be verified, and obtain the test behavior state corresponding to each test prediction value output by the support vector machine prediction model to be verified; obtain comparison results between the plurality of test behavior states and the corresponding real behavior states; obtain a first ratio of the number of consistent first sub-comparison results among the first comparison results to the number of the first comparison results, where a first comparison result is a comparison result whose real behavior state is the behavior state without set risk; obtain a second ratio of the number of inconsistent second sub-comparison results among the second comparison results to the number of the second comparison results, where a second comparison result is a comparison result whose real behavior state is the set risk behavior state; construct a receiver operating characteristic curve according to a plurality of first ratios and a plurality of second ratios; and obtain the area of the polygon enclosed by the receiver operating characteristic curve and the abscissa axis, and, if the area is greater than a preset area threshold, determine the support vector machine prediction model to be verified as the target support vector machine prediction model.
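The verification step above can be sketched as follows. Several points are assumptions rather than statements from the description: real behavior states are encoded as 0 (no set risk) and 1 (set risk), a sample is predicted as "no set risk" when its prediction value is at most the threshold, the second ratio is placed on the abscissa and the first ratio on the ordinate, the area is computed with the trapezoidal rule, and the default area threshold of 0.8 is hypothetical.

```python
def roc_points(scores, true_states, thresholds):
    """Return sorted (second_ratio, first_ratio) points, one per threshold."""
    no_risk = [s for s, y in zip(scores, true_states) if y == 0]
    risk = [s for s, y in zip(scores, true_states) if y == 1]
    if not no_risk or not risk:
        raise ValueError("both behavior states must be present")
    points = {(0.0, 0.0), (1.0, 1.0)}  # end points of the curve
    for t in thresholds:
        # First ratio: no-risk samples whose prediction is consistent.
        first_ratio = sum(s <= t for s in no_risk) / len(no_risk)
        # Second ratio: risk samples whose prediction is inconsistent.
        second_ratio = sum(s <= t for s in risk) / len(risk)
        points.add((second_ratio, first_ratio))
    return sorted(points)


def area_under_curve(points):
    """Trapezoidal area between the curve and the abscissa axis."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


def is_target_model(area, area_threshold=0.8):
    """Accept the model to be verified if the area exceeds the threshold."""
    return area > area_threshold
```

Because both ratios can only grow as the threshold increases, sorting the points reproduces the order in which the curve is traced. An area close to 1 indicates that the two behavior states are well separated by the prediction values, while an area near 0.5 indicates little better than random assignment.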
The modules in the user behavior prediction apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a terminal, and its internal structure may be as shown in FIG. 7. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication may be implemented through Wi-Fi, a mobile cellular network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements a user behavior prediction method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen; a key, trackball, or touchpad arranged on the housing of the computer device; or an external keyboard, touchpad, or mouse.
Those skilled in the art will appreciate that the architecture shown in FIG. 7 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the user behavior prediction method when executing the computer program.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, is adapted to carry out the above-mentioned user behavior prediction method.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the user behavior prediction method described above.
It should be noted that, the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
Those skilled in the art will understand that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include a Read-Only Memory (ROM), a magnetic tape, a floppy disk, a flash memory, an optical memory, a high-density embedded non-volatile memory, a Resistive Random Access Memory (ReRAM), a Magnetic Random Access Memory (MRAM), a Ferroelectric Random Access Memory (FRAM), a Phase Change Memory (PCM), a graphene memory, and the like. The volatile memory may include a Random Access Memory (RAM), an external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. The non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, any combination of them should be considered within the scope of this specification as long as it contains no contradiction.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and improvements without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the appended claims.

Claims (12)

1. A method for predicting user behavior, the method comprising:
responding to a service application request sent by a user to be predicted, and acquiring user information in the service application request; the user information comprises user information of multiple dimensions of the user to be predicted; the multiple dimensions represent the application conditions of the service corresponding to the service application request to multiple types of the user to be predicted;
inputting the user information of the multiple dimensions into a pre-trained target support vector machine prediction model, mapping the user information of the multiple dimensions from a low-dimensional space to a high-dimensional space through the target support vector machine prediction model, and carrying out nonlinear classification based on the user information in the high-dimensional space to obtain a predicted value output by the target support vector machine prediction model based on a classification result of the nonlinear classification; the number of dimensions of the high-dimensional space corresponds to the number of dimensions of the user information;
and determining a prediction result of the set risk behavior associated with the service of the user to be predicted according to the prediction value.
2. The method according to claim 1, wherein the obtaining user information in the service application request in response to the service application request sent by the user to be predicted comprises:
receiving a service application request sent by a user to be predicted, and acquiring associated asset information, associated service information and position information of the user to be predicted in the service application request as user information of the user to be predicted.
3. The method of claim 2, wherein before inputting the user information of the plurality of dimensions into a pre-trained target support vector machine prediction model, further comprising:
preprocessing the user information of the multiple dimensions, and converting the user information of the multiple dimensions into numerical user information of the multiple dimensions;
and inputting the numerical user information into a pre-trained target support vector machine prediction model.
4. The method of claim 3, wherein converting the multi-dimensional user information into multi-dimensional numerical user information comprises:
determining a first label numerical value corresponding to the associated asset information according to the asset quantity of the associated asset information; the magnitude of the first label numerical value is positively correlated with the asset quantity;
determining a second label numerical value corresponding to the associated service information according to the risk level to which the associated service information belongs; the magnitude of the second label numerical value is positively correlated with the risk level;
determining a third label numerical value corresponding to the position information according to the regional economic index corresponding to the position information; the magnitude of the third label numerical value is positively correlated with the regional economic indicator;
and obtaining the numerical user information of the multiple dimensions according to at least two of the first label numerical value, the second label numerical value and the third label numerical value.
5. The method of claim 1, wherein the target support vector machine prediction model outputs a result based on the classification result of the non-linear classification within a set range of values;
the determining a prediction result of the set risk behavior associated with the service of the user to be predicted according to the prediction value comprises:
if the predicted value is smaller than or equal to a prediction threshold value, predicting that the user to be predicted does not have set risk behaviors;
if the predicted value is larger than a prediction threshold value, predicting that the user to be predicted has set risk behaviors;
wherein the prediction threshold is within the set range of values.
6. The method of claim 1, wherein the target support vector machine prediction model is trained by:
obtaining a plurality of sample user information, and dividing the plurality of sample user information into a training set and a test set; the sample user information comprises sample user information of multiple dimensions and corresponding real behavior states;
inputting multi-dimensional sample user information in the training set into a support vector machine prediction model to be trained, mapping the multi-dimensional sample user information to a multi-dimensional space through the support vector machine prediction model to be trained, and outputting a sample prediction value based on a classification result of the multi-dimensional space;
circularly executing the step of inputting the multi-dimensional data coordinates into the support vector machine prediction model to be trained until the set risk behavior state of the sample corresponding to the sample prediction value is consistent with the real behavior state, and taking the current support vector machine prediction model to be trained as a support vector machine prediction model to be verified;
and inputting the sample user information in the test set into the support vector machine prediction model to be verified, and obtaining the target support vector machine prediction model according to the output result of the support vector machine prediction model to be verified.
7. The method of claim 1, further comprising:
and constructing the support vector machine prediction model to be trained based on a C-type support vector classifier and a radial basis function kernel.
8. The method according to claim 6, wherein the inputting the sample user information in the test set into the prediction model of the support vector machine to be verified, and obtaining the prediction model of the target support vector machine according to the output result of the prediction model of the support vector machine to be verified comprises:
inputting the sample user information in the test set into the support vector machine prediction model to be verified, and acquiring a test behavior state corresponding to a test prediction value output by the support vector machine prediction model to be verified; obtaining a comparison result of the plurality of test behavior states and the corresponding real behavior states;
acquiring a first ratio of the number of consistent first sub-comparison results in the first comparison results to the number of the first comparison results; the first comparison result represents that the real behavior state is a comparison result of a behavior state without set risk;
acquiring a second ratio of the number of inconsistent second sub-comparison results in the second comparison result to the number of the second comparison results; the second comparison result represents that the real behavior state is a comparison result of a set risk behavior state;
constructing a receiver operating characteristic curve according to the first ratios and the second ratios;
and obtaining the area of a polygon formed by the receiver operating characteristic curve and the abscissa axis, and if the area is larger than a preset area threshold, determining the support vector machine prediction model to be verified as a target support vector machine prediction model.
9. A user behavior prediction apparatus, the apparatus comprising:
a receiving module, configured to respond to a service application request sent by a user to be predicted and acquire user information in the service application request; the user information comprises user information of multiple dimensions of the user to be predicted; the multiple dimensions represent the application conditions of the service corresponding to the service application request to multiple types of the user to be predicted;
the input module is used for inputting the user information of the multiple dimensions into a pre-trained target support vector machine prediction model, mapping the user information of the multiple dimensions from a low-dimensional space to a high-dimensional space through the target support vector machine prediction model, and carrying out nonlinear classification based on the user information in the high-dimensional space to obtain a predicted value output by the target support vector machine prediction model based on a classification result of the nonlinear classification; the number of dimensions of the high-dimensional space corresponds to the number of dimensions of the user information;
and the prediction module is used for determining the prediction result of the set risk behavior associated with the business of the user to be predicted according to the prediction value.
10. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 8.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
12. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 8 when executed by a processor.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210141838.7A CN114549174A (en) 2022-02-16 2022-02-16 User behavior prediction method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210141838.7A CN114549174A (en) 2022-02-16 2022-02-16 User behavior prediction method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114549174A true CN114549174A (en) 2022-05-27

Family

ID=81675955

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210141838.7A Pending CN114549174A (en) 2022-02-16 2022-02-16 User behavior prediction method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114549174A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216803A (en) * 2023-11-09 2023-12-12 成都乐超人科技有限公司 Intelligent finance-oriented user information protection method and system
CN117216803B (en) * 2023-11-09 2024-02-09 成都乐超人科技有限公司 Intelligent finance-oriented user information protection method and system

Similar Documents

Publication Publication Date Title
TWI686760B (en) Data processing method, device, equipment and server for insurance fraud identification
Yeh et al. A hybrid KMV model, random forests and rough set theory approach for credit rating
Chaudhuri et al. Fuzzy support vector machine for bankruptcy prediction
CN110751557B (en) Abnormal fund transaction behavior analysis method and system based on sequence model
US20150269669A1 (en) Loan risk assessment using cluster-based classification for diagnostics
Chen Visualization and dynamic evaluation model of corporate financial structure with self-organizing map and support vector regression
CN113095408A (en) Risk determination method and device and server
Wu et al. Application analysis of credit scoring of financial institutions based on machine learning model
Zhu et al. Explainable prediction of loan default based on machine learning models
CN114549174A (en) User behavior prediction method and device, computer equipment and storage medium
CN113435900A (en) Transaction risk determination method and device and server
CN116894721A (en) Index prediction method and device and computer equipment
CN115063143A (en) Account data processing method and device, computer equipment and storage medium
CN115860802A (en) Product value prediction method, device, computer equipment and storage medium
CN115186890A (en) Early warning method, early warning device, computer equipment and storage medium
CN115471260A (en) Neural network-based sales prediction method, apparatus, device and medium
CN114170000A (en) Credit card user risk category identification method, device, computer equipment and medium
Khiem Tran et al. Towards Improved Bankruptcy Prediction: Utilizing Variational Autoencoder Latent Representations in a Norwegian Context
CN114529399A (en) User data processing method, device, computer equipment and storage medium
CN114219184A (en) Product transaction data prediction method, device, equipment, medium and program product
Caplescu et al. Will they repay their debt? Identification of borrowers likely to be charged off
CN117575772A (en) Abnormal user detection method and device, computer equipment and storage medium
CN115797048A (en) Account risk identification method and device, computer equipment and storage medium
CN113989012A (en) Method, device, medium and equipment for classifying borrowing object crowd of bad assets
CN116109406A (en) Method, device, computer equipment and storage medium for predicting lending willingness of user

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination