CN114529399A - User data processing method, device, computer equipment and storage medium - Google Patents

Publication number
CN114529399A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210152732.7A
Other languages
Chinese (zh)
Inventor
薛雨杉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Abstract

The present application relates to the field of artificial intelligence technologies, and in particular to a user data processing method and apparatus, a computer device, and a storage medium. The method comprises: acquiring user data to be processed, wherein the user data to be processed comprises initial user data at different time points; performing graph construction on the initial user data of each time point to obtain a plurality of static knowledge graphs; performing feature extraction on each of the static knowledge graphs to obtain the static features corresponding to the static knowledge graphs, wherein the static features are feature data representing the fusion of the node features in the static knowledge graphs; connecting the static features corresponding to the time points in series to obtain dynamic features; and obtaining, according to the dynamic features, target feature data corresponding to the user data to be processed. With this method, the user data to be processed can be evaluated accurately.

Description

User data processing method, device, computer equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for processing user data, a computer device, a storage medium, and a computer program product.
Background
A loan is a credit activity in which money is lent at a certain interest rate for a certain term, and lending is the most important business of commercial banks. The profit margin on a loan is directly related to the loan price: when the loan price is high, the profit is high, but the demand for loans falls accordingly. Conversely, when the loan price is low, the profit is low, but the demand for loans increases. It is therefore desirable to set appropriate loan interest rates for different borrowers.
In the related art, some banks in the United States began, as early as the 1940s, to study credit scoring methods for rapidly processing large numbers of credit applications. In 1956, the engineer Bill Fair and the mathematician Earl Isaac jointly invented the well-known FICO scoring method, which takes logistic regression as its technical core and is currently the most mature credit risk scoring model applied in the industry. From the 1960s to the 1980s, with the progress of information technology and the rapid growth of business, credit scoring models came to be widely applied to credit cards, consumer credit, home mortgage loans, and small-business loans.
However, these methods have many limitations in the big data era of data explosion, resulting in low efficiency and inaccurate evaluation results.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a user data processing method, an apparatus, a computer device and a storage medium capable of accurately evaluating user data to be processed.
In a first aspect, the present application provides a user data processing method, including:
acquiring user data to be processed, wherein the user data to be processed comprises initial user data at different time points;
performing graph construction on the initial user data of each time point to obtain a plurality of static knowledge graphs;
performing feature extraction on each of the plurality of static knowledge graphs to obtain the static features corresponding to the plurality of static knowledge graphs, wherein the static features are feature data representing the fusion of the node features in the static knowledge graphs;
connecting the static features corresponding to the time points in series to obtain dynamic features;
and obtaining target feature data corresponding to the user data to be processed according to the dynamic features.
In one embodiment, performing graph construction on the initial user data of each time point to obtain a plurality of static knowledge graphs includes:
extracting triple data from the initial user data of each time point;
and performing graph construction based on the triple data to obtain the plurality of static knowledge graphs.
In one embodiment, after performing graph construction based on the triple data to obtain the plurality of static knowledge graphs, the method includes:
when a node in a static knowledge graph is missing its label, reading the triple data in which the node is located, and supplementing the label of the node according to the labels in that triple data.
In one embodiment, performing feature extraction on each of the plurality of static knowledge graphs to obtain the static features corresponding to the plurality of static knowledge graphs includes:
obtaining, from the plurality of static knowledge graphs, the feature matrix corresponding to each static knowledge graph;
and performing feature fusion on each feature matrix to obtain the static features corresponding to the plurality of static knowledge graphs.
In one embodiment, the obtaining target feature data corresponding to the user data to be processed according to the dynamic feature includes:
carrying out feature extraction on the dynamic features to obtain retained features;
updating the retained features to obtain updated features;
and calculating to obtain target characteristic data according to the updated characteristics.
In one embodiment, performing feature extraction on each of the plurality of static knowledge graphs to obtain the static features corresponding to the plurality of static knowledge graphs is implemented by a pre-trained first model;
and obtaining target feature data corresponding to the user data to be processed according to the dynamic features is implemented by a pre-trained second model.
In one embodiment, the training process of the first model and the second model includes:
acquiring sample data, wherein the sample data carries labeled data and a feature label;
inputting the sample data into the first model, and performing feature extraction on the sample data through the first model to obtain sample features;
calculating a first target loss function according to the sample features and the feature label in the sample data, wherein the first target loss function is used for optimizing the first model until the first model completes training;
inputting the sample features into the second model, and performing prediction on the sample features through the second model to obtain sample target features;
and calculating a second target loss function according to the sample target features and the labeled data, wherein the second target loss function is used for optimizing the second model until the second model completes training.
In a second aspect, the present application further provides a risk processing method, including:
acquiring user data to be processed corresponding to a user to be predicted;
obtaining target characteristic data corresponding to user data to be processed according to the method in any one of the embodiments;
and determining the risk level of the user to be predicted according to the target characteristic data.
In one embodiment, determining the risk level of the user to be predicted according to the target feature data includes:
acquiring preset evaluation levels;
and obtaining the risk level of the user to be predicted according to the target feature data and the evaluation levels.
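As a minimal sketch of this embodiment, assume that the target feature data is a single risk score between 0 and 1 and that the preset evaluation levels are score bands. The thresholds, level names, and the function `risk_level` below are hypothetical and do not come from the application.

```python
from bisect import bisect_right

# Hypothetical preset evaluation levels: score thresholds and the level
# names of the bands they separate (one more name than thresholds).
THRESHOLDS = [0.2, 0.5, 0.8]
LEVELS = ["low", "medium", "high", "very high"]


def risk_level(score):
    """Map a predicted risk score to the evaluation level whose band contains it."""
    # bisect_right counts how many thresholds are <= score, which is
    # exactly the index of the band the score falls into.
    return LEVELS[bisect_right(THRESHOLDS, score)]


levels = [risk_level(s) for s in (0.1, 0.5, 0.95)]
```

A score that equals a threshold is assigned to the higher band here; the application does not state how boundaries are handled, so this is only one possible convention.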
In a third aspect, the present application also provides a risk assessment apparatus comprising:
the data acquisition module is used for acquiring user data to be processed, and the user data to be processed comprises initial user data at different time points;
the graph construction module is used for performing graph construction on the initial user data of each time point to obtain a plurality of static knowledge graphs;
the feature extraction module is used for performing feature extraction on each of the plurality of static knowledge graphs to obtain the static features corresponding to the plurality of static knowledge graphs, the static features being feature data representing the fusion of the node features in the static knowledge graphs;
the characteristic processing module is used for connecting the static characteristics corresponding to each time point in series to obtain dynamic characteristics;
and the target characteristic calculation module is used for obtaining target characteristic data corresponding to the user data to be processed according to the dynamic characteristics.
In a fourth aspect, the present application further provides a risk processing apparatus, including:
the to-be-processed user data acquisition module is used for acquiring user data to be processed corresponding to a user to be predicted;
the risk prediction module is used for obtaining target feature data corresponding to the user data to be processed according to the apparatus in any one of the above embodiments;
and the risk grade judging module is used for determining the risk grade of the user to be predicted according to the target characteristic data.
In a fifth aspect, the present application further provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the method in any of the above embodiments when executing the computer program.
In a sixth aspect, the present application further provides a computer-readable storage medium. The computer-readable storage medium stores a computer program which, when executed by a processor, implements the steps of the method in any of the above embodiments.
In a seventh aspect, the present application further provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the method in any of the above embodiments.
According to the above user data processing method, apparatus, computer device, and storage medium, user data to be processed is first acquired, the user data to be processed comprising initial user data at different time points, and graph construction is then performed on the initial user data of each time point to obtain a plurality of static knowledge graphs, so that a large amount of data can be stored and conveniently extracted. Because a static knowledge graph is built for the initial user data of each time point, a dynamic knowledge graph that incorporates time-series information is obtained, and the change and trend of the graph structure over time can be analyzed to capture key information. Feature extraction is then performed on each static knowledge graph to obtain the corresponding static features; even if part of a graph is missing, the extracted static features can be used to complete it. Finally, the static features corresponding to the time points are connected in series to obtain dynamic features, and prediction based on the dynamic features yields the target feature data, so that the target feature data for the next time point can be predicted accurately from the initial user data of each time point.
Drawings
FIG. 1 is a diagram of an application environment of a user data processing method in one embodiment;
FIG. 2 is a flow diagram illustrating a method for user data processing according to one embodiment;
FIG. 3 is a diagram of triple data in one embodiment;
FIG. 4 is a schematic representation of feature fusion by a first model in one embodiment;
FIG. 5 is a schematic diagram of the operation of a forget gate in another embodiment;
FIG. 6 is a schematic diagram of the operation of an update gate in one embodiment;
FIG. 7 is a schematic flow chart diagram illustrating a risk processing method according to one embodiment;
FIG. 8 is a schematic representation of risk ranking in one embodiment;
FIG. 9 is a schematic diagram of the training and prediction of a personal credit risk rating model in one embodiment;
FIG. 10 is a diagram showing an example of a configuration of a user data processing apparatus;
FIG. 11 is a schematic diagram of a risk management device in one embodiment;
FIG. 12 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The user data processing method provided by the embodiments of the present application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. A data storage system may store the data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or on another network server. The server 104 first acquires user data to be processed, which comprises initial user data at different time points, and then performs graph construction on the initial user data of each time point to obtain a plurality of static knowledge graphs, that is, one static knowledge graph per time point. The server 104 performs feature extraction on each static knowledge graph to obtain the corresponding static features, which are feature data representing the fusion of the node features in the static knowledge graph, connects the static features corresponding to the time points in series to obtain dynamic features, and finally obtains, according to the dynamic features, target feature data corresponding to the user data to be processed, thereby achieving accurate evaluation of the user data to be processed. The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, Internet of Things device, or portable wearable device. The Internet of Things device may be a smart speaker, smart television, smart air conditioner, smart in-vehicle device, or the like. The portable wearable device may be a smart watch, smart bracelet, head-mounted device, or the like. The server 104 may be implemented as a stand-alone server or as a server cluster composed of multiple servers.
In one embodiment, as shown in fig. 2, a user data processing method is provided, which is described by taking the method as an example applied to the server 104 in fig. 1, and includes the following steps:
S202, acquiring user data to be processed, wherein the user data to be processed comprises initial user data at different time points.
The user data to be processed refers to data that needs to be evaluated. It may be the data of a user who requires loan assessment, or data that requires trust assessment or risk assessment. For example, when the user data to be processed belongs to a user who requires loan assessment, it includes data such as the user's gender, age, property, and industry. The user data to be processed may include a plurality of initial user data records, all of which are arranged in chronological order.
The initial user data refers to raw data that has not undergone any processing; it may be data exported directly from a database.
Specifically, the server first obtains initial user data at different time points. One time point may correspond to one piece of initial user data or, in other embodiments, to multiple pieces of initial user data, which is not specifically limited herein. The server then aggregates the initial user data acquired at the different time points to obtain the user data to be processed. Optionally, the initial user data at the different time points may be structured to obtain the user data to be processed; in other embodiments, the initial user data may additionally be filtered to remove invalid information.
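The aggregation, structuring, and filtering described for step S202 might be sketched as follows. The record layout (`time`, `fields`), the validity rule, and the function name `prepare_user_data` are assumptions made for illustration; the application does not specify them.

```python
from collections import defaultdict


def prepare_user_data(raw_records):
    """Group raw records by time point and drop records with no usable content."""
    by_time = defaultdict(list)
    for record in raw_records:
        # Hypothetical validity rule: keep only non-empty field values.
        fields = {k: v for k, v in record.get("fields", {}).items()
                  if v not in (None, "")}
        # A record without a time point or without any valid field is
        # treated as invalid information and filtered out.
        if record.get("time") is None or not fields:
            continue
        by_time[record["time"]].append(fields)
    # Return (time point, records) pairs in chronological order.
    return sorted(by_time.items())


raw = [
    {"time": "2021-02", "fields": {"age": 30, "industry": "retail"}},
    {"time": "2021-01", "fields": {"age": 30, "property": ""}},
    {"time": "2021-01", "fields": {}},
    {"time": None, "fields": {"age": 30}},
]
prepared = prepare_user_data(raw)
```

Each time point may keep one or several records, matching the text's statement that one time point may correspond to one or multiple pieces of initial user data.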
S204, performing graph construction on the initial user data of each time point to obtain a plurality of static knowledge graphs.
A static knowledge graph is a knowledge graph constructed from the initial user data of one time point; for example, each time point corresponds to one knowledge graph. A knowledge graph here is a graph built from user data in which a node represents an entity and an edge represents a relationship between two entities. FIG. 3 is a diagram of triple data in one embodiment: "Ang Lee" and "Life of Pi" are entities, "director" represents the relationship between them, and a label represents an attribute of an entity. The labels of "Ang Lee" may be "person" and "director", and the label of "Life of Pi" may be "movie".
Specifically, the server processes the initial user data of each time point to construct a graph. Optionally, the entities, relationships, and so on are identified in the initial user data, and the graph is then constructed from the entities, relationships, and labels. The structural change of the graph over time can be obtained from the static knowledge graphs corresponding to different time points.
In one embodiment, when a node in the static knowledge graph is missing its label, the triple data in which the node is located may be read, and the label of the node may then be supplemented according to that triple data. Continuing with FIG. 3, if the label of "Life of Pi" is missing, it can be inferred from the label of "Ang Lee".
In one embodiment, the server groups the initial user data according to access frequency and/or degree of information importance. Initial user data whose access frequency is greater than or equal to a first threshold is placed in the static knowledge graph, and initial user data whose access frequency is less than the first threshold, or whose information importance does not meet the requirement, is placed in a traditional relational database. Specifically, commonly used information can be stored in the static knowledge graph, while information that is accessed infrequently and is not critical to relationship analysis is placed in the traditional relational database. In addition, strongly connected subgraphs can be found in the static knowledge graph and marked. A strongly connected subgraph is one in which every node can reach every other node along some path, which indicates that the nodes are strongly related, so further processing can be performed on the basis of the strongly connected subgraphs.
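The two ideas in this embodiment can be sketched under stated assumptions: records carry a hypothetical `access_frequency` field, and the "strong connection graph" is read as a strongly connected component of a directed graph. The application names no algorithm; Kosaraju's algorithm is used here purely as an illustration.

```python
def partition_by_access(records, first_threshold):
    """Split records between the knowledge graph and the relational store."""
    graph_records, relational_records = [], []
    for record in records:
        if record["access_frequency"] >= first_threshold:
            graph_records.append(record)
        else:
            relational_records.append(record)
    return graph_records, relational_records


def strongly_connected_components(adjacency):
    """Kosaraju's algorithm on a directed graph given as {node: [successors]}."""
    nodes = set(adjacency) | {v for vs in adjacency.values() for v in vs}
    reverse = {n: [] for n in nodes}
    for u, vs in adjacency.items():
        for v in vs:
            reverse[v].append(u)

    visited, order = set(), []

    def visit(n):
        # First pass: record nodes in order of DFS completion.
        visited.add(n)
        for m in adjacency.get(n, []):
            if m not in visited:
                visit(m)
        order.append(n)

    for n in sorted(nodes):
        if n not in visited:
            visit(n)

    assigned, components = set(), []

    def collect(n, component):
        # Second pass: walk the reversed graph to gather one component.
        assigned.add(n)
        component.add(n)
        for m in reverse[n]:
            if m not in assigned:
                collect(m, component)

    for n in reversed(order):
        if n not in assigned:
            component = set()
            collect(n, component)
            components.append(component)
    return components


records = [{"id": 1, "access_frequency": 10}, {"id": 2, "access_frequency": 2}]
in_graph, in_database = partition_by_access(records, first_threshold=5)

adjacency = {"A": ["B"], "B": ["C"], "C": ["A", "D"], "D": []}
strong = [c for c in strongly_connected_components(adjacency) if len(c) > 1]
```

In this toy graph, A, B, and C can all reach one another and would be marked together, while D is reachable from them but cannot reach back. The recursive traversal suits small examples; a very large graph would need an iterative version.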
In one embodiment, multiple static knowledge graphs may be established for one time point from different dimensions. In particular, to establish a finance-related static knowledge graph, the server needs to obtain the customer's initial user data, such as identity information, property status, card-holding information, transaction information, loan information, repayment information, loan application information, loan records, credit card records, quasi-credit card records, special transaction records, and query records. In addition to information from the financial field, information from the customer's social network also plays a key role. Because transactions are increasingly completed online rather than offline, understanding a customer's interests and sentiment requires analyzing the customer's behavior data. The semantics of the customer's text data, such as customer service feedback, customer comments on social media, and customer survey feedback, are analyzed, customer labels are assigned, and a corresponding customer relationship graph is established to obtain better customer insight.
S206, performing feature extraction on each of the plurality of static knowledge graphs to obtain the static features corresponding to the plurality of static knowledge graphs, wherein the static features are feature data representing the fusion of the node features in the static knowledge graphs.
The static features are the features obtained by performing feature extraction on the static knowledge graph corresponding to one time point; they are feature data that represent the fusion of the features of the nodes in that static knowledge graph.
Specifically, the server performs feature extraction on the plurality of static knowledge graphs. Optionally, a pre-trained machine learning model may be used to perform feature extraction on a static knowledge graph to obtain static features in which the features of the nodes in the static knowledge graph have been fused. In one embodiment, the customer features stored in the static knowledge graph are denoted X, and after feature extraction they become Z.
S208, connecting the static features corresponding to the time points in series to obtain the dynamic features.
The dynamic features are features that contain time-series information; they are obtained by connecting the static features in series in chronological order.
Specifically, the server obtains the static features corresponding to the plurality of static knowledge graphs and connects them in series in chronological order to obtain the dynamic features.
In one embodiment, the same time point includes multiple static knowledge graphs. For example, when the first time point includes a user-loan product graph and a user-user relationship graph, feature extraction is performed on the two graphs separately to obtain the static features corresponding to each, and the two static features are then spliced together. The static features of the next time point are spliced in the same way, and the spliced static features are then connected in series in order of time point to obtain the dynamic features.
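A minimal sketch of the splicing-then-series step in this embodiment, assuming each static feature is a plain list of numbers and that "connecting in series" means forming a time-ordered sequence with one spliced vector per time point. The function names and the sample values are hypothetical.

```python
def splice(features):
    """Concatenate the static feature vectors of one time point into one vector."""
    return [value for feature in features for value in feature]


def to_dynamic_features(static_by_time):
    """Build the time-ordered sequence of spliced static features.

    static_by_time maps a time point to the list of static feature vectors
    extracted from that time point's graphs (for example a user-loan product
    graph and a user-user relationship graph).
    """
    return [splice(static_by_time[t]) for t in sorted(static_by_time)]


static_by_time = {
    2: [[0.5, 0.6], [0.7]],   # time point 2: two graphs, two feature vectors
    1: [[0.1, 0.2], [0.3]],   # time point 1
}
dynamic = to_dynamic_features(static_by_time)
```

Keeping one vector per time point, rather than flattening everything into a single vector, preserves the time axis for a sequence model downstream; whether the application intends this layout is not stated.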
In an embodiment, if the same time point includes a plurality of static knowledge graphs, the static knowledge graphs may be merged in the same "entity-relationship-entity" manner used for graph construction. That is, after the plurality of static knowledge graphs are merged into one static knowledge graph, feature extraction is performed to obtain the static features, and the static features are then connected in series in chronological order to obtain the dynamic features.
In one embodiment, if the same time point includes a plurality of static knowledge graphs, feature extraction may be performed on each of the static knowledge graphs of that time point to obtain a plurality of static features for the time point, and the plurality of static features corresponding to each time point are then connected in series in chronological order to obtain the dynamic features.
S210, obtaining target feature data corresponding to the user data to be processed according to the dynamic features.
The target feature data is the prediction result that characterizes the user data to be processed after that data has been processed.
Specifically, the server performs prediction for the user data to be processed according to the dynamic features to obtain the corresponding target feature data. Optionally, the dynamic features may be input into a pre-trained machine learning model, which predicts the features of the next time point, that is, the target feature data, from the dynamic features.
In one embodiment, to predict a personal credit risk level, graph construction is performed on the user data to obtain a plurality of static knowledge graphs, feature extraction is performed on the static knowledge graphs to obtain static features, the static features are connected in series in chronological order to obtain dynamic features, and the target feature data, namely the user's predicted risk score, is finally obtained by prediction from the dynamic features. Combining the predicted risk score with the characteristics of the loan product, a corresponding loan interest rate can be set for the customer. Meanwhile, follow-up investigation feedback is collected from the customer, and the graph information is updated in real time to improve the prediction of the personal credit risk level.
In the above user data processing method, user data to be processed is first acquired, the user data to be processed comprising initial user data at different time points, and graph construction is then performed on the initial user data of each time point to obtain a plurality of static knowledge graphs, so that a large amount of data can be stored and conveniently extracted. Because a static knowledge graph is built for the initial user data of each time point, a dynamic knowledge graph that incorporates time-series information is obtained, and the change and trend of the graph structure over time can be analyzed to capture key information. Feature extraction is then performed on each static knowledge graph to obtain the corresponding static features; even if part of a graph is missing, the extracted static features can be used to complete it. Finally, the static features corresponding to the time points are connected in series to obtain dynamic features, and prediction based on the dynamic features yields the target feature data, so that the target feature data for the next time point can be predicted accurately from the initial user data of each time point.
In an embodiment, performing graph construction on the initial user data of each time point to obtain a plurality of static knowledge graphs includes: extracting triple data from the initial user data of each time point; and performing graph construction based on the triple data to obtain the plurality of static knowledge graphs.
The triple data is a data group used for constructing a static knowledge graph and includes at least two entities and the relationship information between them. Optionally, the entities can be obtained by performing word segmentation on the user data.
Specifically, the initial user data of each time point is extracted to obtain a plurality of pieces of triple data. Optionally, word segmentation may be performed on the initial user data to obtain segmented words; according to a preset entity information table, relationship information table, and label system information table, each segmented word is determined to be an entity, a relationship, or a label of an entity; and the words are connected as entity-relationship-entity in a given direction to obtain the triple data.
The obtained triple data are in turn connected according to the entity-relationship-entity structure to obtain a static knowledge graph.
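The table-driven extraction and graph construction described above might look like the following sketch. It assumes word segmentation has already produced a token list, and the table contents, entity names, and function names are hypothetical examples rather than anything specified in the application.

```python
# Hypothetical preset tables.
ENTITY_TABLE = {"Customer A", "Loan Product X"}
RELATION_TABLE = {"applied_for"}
LABEL_TABLE = {"Customer A": ["person"], "Loan Product X": ["loan product"]}


def extract_triples(tokens):
    """Return (head, relation, tail) triples found in consecutive tokens."""
    triples = []
    for i in range(len(tokens) - 2):
        head, relation, tail = tokens[i:i + 3]
        # A triple is an entity, a relationship, and another entity in order.
        if (head in ENTITY_TABLE and relation in RELATION_TABLE
                and tail in ENTITY_TABLE):
            triples.append((head, relation, tail))
    return triples


def build_static_graph(triples):
    """Connect triples into a graph: nodes carry labels, edges carry relations."""
    graph = {"nodes": {}, "edges": []}
    for head, relation, tail in triples:
        for entity in (head, tail):
            graph["nodes"].setdefault(entity, list(LABEL_TABLE.get(entity, [])))
        graph["edges"].append((head, relation, tail))
    return graph


tokens = ["Customer A", "applied_for", "Loan Product X", "in", "2021"]
triples = extract_triples(tokens)
graph = build_static_graph(triples)
```

Matching only adjacent tokens is a deliberate simplification; real text would need a segmenter and a looser matching rule, neither of which the application details.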
In one embodiment, the server may extract the initial user data of each time point in chronological order to obtain a plurality of pieces of triple data, and perform graph construction based on the triple data to obtain the static knowledge graph corresponding to each time point.
In one embodiment, the server has a plurality of threads, which extract the initial user data of different time points simultaneously to obtain a plurality of pieces of triple data; graph construction is performed based on the triple data so that the static knowledge graphs corresponding to the time points are obtained at the same time.
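The multi-threaded variant can be sketched with Python's standard `concurrent.futures.ThreadPoolExecutor`. To keep the example short, each time point's data is assumed to be already in triple form; the data, the worker count, and the function name `build_static_graph` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def build_static_graph(item):
    """Build one static knowledge graph from a (time point, triples) pair."""
    time_point, triples = item
    nodes = sorted({entity for head, _, tail in triples for entity in (head, tail)})
    return time_point, {"nodes": nodes, "edges": list(triples)}


data_by_time = {
    2: [("Customer A", "repaid", "Loan X")],
    1: [("Customer A", "applied_for", "Loan X"),
        ("Customer A", "knows", "Customer B")],
}

# Each time point is handled by a worker thread; pool.map returns results
# in the order of its input, so the graphs stay in chronological order.
with ThreadPoolExecutor(max_workers=4) as pool:
    graphs = dict(pool.map(build_static_graph, sorted(data_by_time.items())))
```

In CPython, threads mainly help when the extraction waits on I/O such as database reads; CPU-bound extraction would typically need processes instead.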
In the above embodiment, the server extracts triple data from the initial user data of each time point and performs graph construction based on the triple data to obtain a plurality of static knowledge graphs, so that the static knowledge graphs can be analyzed and the target feature data can be obtained by prediction for the user data to be processed.
In an embodiment, after performing graph construction based on the triple data to obtain the plurality of static knowledge graphs, the method includes: when a node in a static knowledge graph is missing its label, reading the triple data in which the node is located, and supplementing the label of the node according to the labels in that triple data.
A label is an attribute of a node in the static knowledge graph. For example, if an entity is "Ang Lee", its labels may be "person", "director", and the like. From the entity-relationship-entity structure, a label-relationship-label structure can therefore be obtained, which facilitates deeper analysis. Note that one entity may correspond to one or more labels.
Specifically, when any node in the static knowledge graph is missing its label, the triple data in which that node is located can be read, and the label of the node can then be supplemented according to that triple data. For example, if A and B belong to the same triple data and the label of A is missing, the label of B can be used as the label of A.
In an embodiment, when a plurality of pieces of triple data involve the entity whose label is missing, the label may be supplemented according to the labels in all of those pieces of triple data; for example, the labels in the pieces of triple data are processed by means such as feature fusion or key feature extraction to obtain the label of the node.
In the embodiment, the label of the label-missing node is deduced and supplemented by using the relation between the label-containing node and the label-missing node, so that the data can be more perfect, and the subsequent prediction of the user data to be processed is more accurate.
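A minimal sketch of the label completion described above follows. The majority vote over neighbouring labels is an assumed, deliberately simple stand-in for the "feature fusion or key feature extraction" mentioned in the text, and the function name `complete_labels` and the data layout are hypothetical.

```python
from collections import Counter


def complete_labels(triples, labels):
    """Fill missing node labels from the other entity of each shared triple.

    triples: list of (head, relation, tail)
    labels:  dict mapping node -> label, or None when the label is missing
    """
    completed = dict(labels)
    for node, label in labels.items():
        if label is not None:
            continue
        # Gather the labels of the entities that share a triple with `node`.
        neighbour_labels = []
        for head, _, tail in triples:
            if head == node and labels.get(tail) is not None:
                neighbour_labels.append(labels[tail])
            elif tail == node and labels.get(head) is not None:
                neighbour_labels.append(labels[head])
        if neighbour_labels:
            # Assumption: majority vote replaces the unspecified fusion step.
            completed[node] = Counter(neighbour_labels).most_common(1)[0][0]
    return completed


triples = [("A", "knows", "B"), ("A", "works_with", "C"), ("D", "knows", "A")]
labels = {"A": None, "B": "person", "C": "person", "D": "director"}
completed = complete_labels(triples, labels)
```

With a single related triple this reduces to the A and B example in the text: the label of B is copied to A. A node with no labeled neighbour keeps its missing label.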
In an embodiment, the extracting the features of the plurality of static knowledge maps to obtain the static features corresponding to the plurality of static knowledge maps respectively includes: obtaining a characteristic matrix corresponding to the plurality of static knowledge maps according to the plurality of static knowledge maps; and performing feature fusion on each feature matrix to obtain static features corresponding to the plurality of static knowledge maps.
The feature matrix is a matrix obtained from a static knowledge graph. For example, a static knowledge graph has N nodes, each node has its own features, and the features of the nodes form an N × D feature matrix X, where D denotes the dimension of the features.
Specifically, the server first performs extraction on the plurality of static knowledge maps to obtain the feature matrices corresponding to the plurality of static knowledge maps, and then performs feature fusion on each feature matrix to obtain the static features corresponding to the plurality of static knowledge maps. Optionally, the feature matrices may be fused through a pre-trained machine learning model to obtain the static features.
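As an illustration of how a feature matrix might be assembled, the sketch below builds an N × D matrix X from per-node feature lists and, alongside it, an N × N adjacency matrix A of the kind used by graph models. The node names, the two toy feature columns and the choice to treat edges as undirected are all assumptions made for this example.

```python
import numpy as np


def graph_to_matrices(nodes, edges, node_features):
    """Return the N x D feature matrix X and the N x N adjacency matrix A."""
    index = {name: i for i, name in enumerate(nodes)}
    X = np.array([node_features[name] for name in nodes], dtype=float)
    A = np.zeros((len(nodes), len(nodes)))
    for head, tail in edges:
        # Edges are treated as undirected here, which is an assumption.
        A[index[head], index[tail]] = 1.0
        A[index[tail], index[head]] = 1.0
    return X, A


nodes = ["Alice", "Bob", "LoanA"]
edges = [("Alice", "LoanA"), ("Bob", "Alice")]
node_features = {"Alice": [35, 1.0], "Bob": [42, 0.0], "LoanA": [0, 0.5]}
X, A = graph_to_matrices(nodes, edges, node_features)
```

Here N is 3 and D is 2, so X has shape (3, 2) and A has shape (3, 3).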
In one embodiment, a first model capable of performing feature fusion on the feature matrix may be obtained by training a graph convolutional network (a graph neural network model, GCN). Specifically, there are N nodes in a static knowledge graph; the features of the nodes form an N × D feature matrix X, and the relationships between the nodes form an N × N adjacency matrix A. The feature matrix X and the adjacency matrix A are input into the first model, where the propagation between network layers is as follows:
$$H^{(l+1)} = \sigma\left(\tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)}\right)$$
where $\tilde{A} = A + I_N$ and $I_N$ is an identity matrix; $\tilde{D}$ is the degree matrix of $\tilde{A}$, with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$; $H^{(l)}$ is the feature of the l-th layer, with $H^{(0)} = X$; $W^{(l)}$ is the parameter of each layer; and $\sigma$ is a nonlinear activation function.
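The layer-wise propagation can be sketched in NumPy as below, assuming the standard graph convolution form in which self-loops are added to the adjacency matrix, the result is symmetrically normalised by the degree matrix, and each layer multiplies by a weight matrix before the activation. The function names and the toy two-node graph are illustrative only.

```python
import numpy as np


def normalize_adjacency(A):
    """Add self-loops to A and apply symmetric degree normalisation."""
    A_tilde = A + np.eye(A.shape[0])      # A + I_N
    degree = A_tilde.sum(axis=1)          # diagonal of the degree matrix
    d_inv_sqrt = 1.0 / np.sqrt(degree)    # safe: self-loops keep degree >= 1
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def relu(x):
    return np.maximum(x, 0.0)


def gcn_layer(A_hat, H, W, activation=relu):
    """One propagation step: activation(A_hat @ H @ W)."""
    return activation(A_hat @ H @ W)


# Two connected nodes with identity features and identity weights.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
A_hat = normalize_adjacency(A)
H1 = gcn_layer(A_hat, np.eye(2), np.eye(2))
```

For this toy graph every entry of the normalised adjacency is 0.5, so each node's new feature is the average of its own and its neighbour's features.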
Referring to fig. 4, fig. 4 is a schematic diagram of feature fusion performed by the first model in an embodiment, where the stored client features are X, and the features of each client are transformed into Z by several layers of the graph neural network. In this embodiment, the first model adopts a two-layer graph convolutional neural network whose activation functions are the ReLU function and the Softmax function respectively, and the overall forward propagation formula is
$$Z = f(X, A) = \mathrm{softmax}\left(\hat{A}\,\mathrm{ReLU}\left(\hat{A} X W^{(0)}\right) W^{(1)}\right), \quad \hat{A} = \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}}$$
$$\mathrm{softmax}(x_i) = \frac{1}{z}\exp(x_i), \quad z = \sum_i \exp(x_i)$$
The cross-entropy loss function is calculated over all labeled nodes:
$$L = -\sum_{l \in y_L} \sum_{f=1}^{F} Y_{lf} \ln Z_{lf}$$
where $y_L$ is the set of labeled nodes, F is the number of output classes, and $Y_{lf}$ indicates whether labeled node l belongs to class f. The network parameters $W^{(0)}$ and $W^{(1)}$ are trained by gradient descent.
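A minimal sketch of the two-layer forward pass and the loss over labeled nodes is given below. It assumes the usual form of a two-layer graph convolutional network (ReLU in the hidden layer, row-wise softmax at the output) and a cross-entropy summed only over labeled nodes; the random weights, the three-node graph and all function names are illustrative, and no training loop is shown.

```python
import numpy as np


def normalize_adjacency(A):
    """Self-loops plus symmetric degree normalisation."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def softmax(x):
    """Row-wise softmax; subtracting the row maximum only improves stability."""
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def gcn_forward(A_hat, X, W0, W1):
    """Two-layer forward pass: ReLU hidden layer, softmax output layer."""
    hidden = np.maximum(A_hat @ X @ W0, 0.0)
    return softmax(A_hat @ hidden @ W1)


def masked_cross_entropy(Z, Y, labeled_idx):
    """Cross-entropy summed over labeled nodes only."""
    return -np.sum(Y[labeled_idx] * np.log(Z[labeled_idx] + 1e-12))


rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(3, 4))                    # N = 3 nodes, D = 4 features
W0 = rng.normal(scale=0.1, size=(4, 5))        # hidden width 5 (arbitrary)
W1 = rng.normal(scale=0.1, size=(5, 2))        # 2 output classes (arbitrary)
Y = np.array([[1, 0], [0, 1], [0, 0]], dtype=float)  # node 2 is unlabeled
labeled_idx = [0, 1]

Z = gcn_forward(normalize_adjacency(A), X, W0, W1)
loss = masked_cross_entropy(Z, Y, labeled_idx)
```

In practice the gradient of this loss with respect to the two weight matrices would be obtained from an automatic differentiation framework and used for gradient descent.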
In one embodiment, a graph neural network (GNN) may be trained to obtain a first model capable of performing feature fusion on the feature matrix.
In the above embodiment, the missing data in the static knowledge graph can be complemented by using the information on the nodes and edges through feature fusion.
In an embodiment, the obtaining target feature data corresponding to the user data to be processed according to the dynamic feature includes: carrying out feature extraction on the dynamic features to obtain retained features; updating the retained characteristic to obtain an updated characteristic; and calculating to obtain target characteristic data according to the updated characteristics.
The retained feature refers to a feature obtained by performing feature extraction on the dynamic feature. Which information is kept may be determined through a preset mask, or which information needs to be "forgotten" may be determined through the forget gate of a pre-trained prediction model; in other words, this step decides which information is kept and which is discarded. The updated feature refers to a feature obtained by updating based on the retained feature.
Specifically, after the server acquires the dynamic features, feature extraction is first performed on the dynamic features to obtain the retained features. The retained features are then updated to obtain the updated features; optionally, the update may be performed through the update gate of a pre-trained prediction model. Finally, the target feature data is calculated according to the updated features; optionally, the target feature data may be calculated from the state at the previous time and the input at the current time.
In one embodiment, the target feature data may be obtained from the dynamic features by a pre-trained prediction model. Specifically, the dynamic features, namely the node structures and internal features of the graph at different times, are used as the input of the pre-trained prediction model. The first step of the pre-trained prediction model is to determine the information to discard through the forget gate layer, which reads $h_{t-1}$ and $x_t$ and outputs a value between 0 and 1 for each number in the cell state $C_{t-1}$, where 1 means "complete retention" and 0 means "complete discard". Specifically, referring to fig. 5, fig. 5 is a schematic diagram illustrating the operation of the forget gate in one embodiment, and the operation is as follows:
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$
where $h_{t-1}$ represents the state at time t-1, $x_t$ represents the input at time t, $f_t$ represents the forget gate at time t, $b_f$ represents the constant term of the forget gate, and $W_f$ represents the weight parameter of the forget gate.
The next step in the pre-trained prediction model is to determine which new information is stored in the cell state. This involves two parts: a sigmoid layer, the "input gate layer", determines which values need to be updated, and a tanh layer creates a new candidate value vector $\tilde{C}_t$ to be added to the state.
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$
$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$
where $i_t$ represents the update gate at time t, $b_i$ represents the constant term of the update gate, $b_C$ represents the constant term of the candidate value vector $\tilde{C}_t$, $W_i$ represents the weight parameter of the update gate, and $W_C$ represents the weight parameter of the candidate value vector $\tilde{C}_t$.
At this time, $C_{t-1}$ is updated to $C_t$. The old state is multiplied by $f_t$ to discard the information that needs to be discarded, and $i_t * \tilde{C}_t$ is then added. Specifically, referring to fig. 6, fig. 6 is a schematic diagram illustrating the operation of the update gate in an embodiment, and the operation is as follows:
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
where $C_t$ represents the memory state at the current time, which is also the cell state after the update.
Finally, the output value $h_t$ is determined:
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$
$$h_t = o_t * \tanh(C_t)$$
where $o_t$ represents the output gate at time t, $b_o$ represents the constant term of the output gate, and $W_o$ represents the weight parameter of the output gate.
At this point, the prediction of the dynamic characteristics is completed.
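The gate computations above can be combined into a single time step. The sketch below is a plain NumPy rendering of one LSTM step using the symbols from the text; the parameter dictionary layout, the vector sizes and the random initialisation are assumptions for illustration, not the trained prediction model.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x_t, h_prev, C_prev, params):
    """One LSTM time step; returns the new hidden state and cell state."""
    concat = np.concatenate([h_prev, x_t])                     # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])      # forget gate
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])      # update (input) gate
    C_cand = np.tanh(params["W_C"] @ concat + params["b_C"])   # candidate values
    C_t = f_t * C_prev + i_t * C_cand                          # new cell state
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])      # output gate
    h_t = o_t * np.tanh(C_t)                                   # new hidden state
    return h_t, C_t


hidden_size, input_size = 2, 3
rng = np.random.default_rng(0)
params = {
    name: rng.normal(scale=0.1, size=(hidden_size, hidden_size + input_size))
    for name in ("W_f", "W_i", "W_C", "W_o")
}
params.update({name: np.zeros(hidden_size) for name in ("b_f", "b_i", "b_C", "b_o")})

# Run the cell over a short sequence of dynamic features (toy values).
h, C = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(4, input_size)):
    h, C = lstm_step(x_t, h, C, params)
```

Each weight matrix acts on the concatenation of the previous hidden state and the current input, which is why its second dimension is the sum of the hidden and input sizes.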
In the embodiment, the retained feature is obtained by performing feature extraction on the dynamic feature, the retained feature is updated to obtain the updated feature, and finally, the target feature data can be accurately obtained by calculating according to the updated feature.
In one embodiment, the above-mentioned respectively performing feature extraction on the plurality of static knowledge maps to obtain the static features corresponding to the plurality of static knowledge maps is implemented by a pre-trained first model; obtaining target characteristic data corresponding to the user data to be processed according to the dynamic characteristics is realized through a pre-trained second model; the training process of the first model and the second model comprises the following steps: acquiring sample data, wherein the sample data carries marked data and a characteristic label; inputting sample data into a first model, and extracting the sample data by the first model to obtain sample characteristics; and inputting the sample characteristics into a second model, and predicting the sample characteristics by the second model to obtain sample target characteristics of the sample characteristics.
The first model is a machine learning model for extracting features of the static knowledge graph to obtain static features, and can be obtained by training based on any machine learning model capable of processing graphs, such as GCN, GNN and the like. The second model is a machine learning model for performing prediction on the features obtained by the first model, and can be obtained by training based on a model such as RNN, LSTM, GRU and the like.
The sample data refers to training data used for training the first model and the second model. It may be fetched from a preset database table or crawled from data published on the internet, and a plurality of static knowledge maps are established from it. The feature labels refer to the labels of nodes in the sample data, namely the labels of nodes in the plurality of static knowledge maps, and can be used for calculating the loss function of the first model, namely the first loss function. The labeling data is the real label annotated on the sample data in advance, and can be used for calculating the loss function of the second model, namely the second loss function, so as to guide the optimization of the model. The sample target feature refers to the data predicted by the second model based on the sample data.
Specifically, sample data is obtained first. The sample data carries labeling data, which is annotated on the sample data in advance; for example, if the personal credit risk level of a user needs to be predicted, the labeling data is the score of the client, and in other embodiments the labeling data is annotated according to the actual use scenario. The sample data is then input into the first model. Optionally, a plurality of static knowledge maps are constructed from the sample data before it is input into the first model, because changes in the data can be observed more clearly on knowledge maps, which also benefits the analysis and prediction of the data. The plurality of static knowledge maps are then input into the first model. Preferably, the first model is a graph convolutional neural network, a neural network model dedicated to processing graph-structured data: different neurons in the deep neural network learn the topological information and the attribute feature information in the graph data respectively and integrate them, so that a more refined feature representation of nodes or substructures is obtained, namely the sample features after feature extraction.
Specifically, after the sample features are obtained, they are input into the second model. Optionally, before the sample features are input into the second model, the feature vectors corresponding to the plurality of static knowledge maps are connected in series in time order to obtain dynamic sample features, and the dynamic sample features are input into the second model. The second model performs prediction according to the sample features to obtain the sample target features. Illustratively, the sample features at time points t to t+3 are input, and the sample target feature at time point t+4 is predicted by the second model.
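The series connection and the windowing in the t to t+3 example can be sketched as follows. The functions `stack_static_features` and `sliding_windows` are hypothetical helpers; the targets here are simply the features of the next time point, which is an assumption made to keep the example self-contained, whereas in a credit-rating setting the target would come from the labeling data.

```python
import numpy as np


def stack_static_features(static_features):
    """Connect per-time-point feature vectors in time order into a (T, D) array."""
    times = sorted(static_features)
    return np.stack([static_features[t] for t in times])


def sliding_windows(dynamic, window=4):
    """Pair each run of `window` consecutive steps with the step that follows it."""
    if window >= len(dynamic):
        raise ValueError("need more time points than the window length")
    inputs = np.stack(
        [dynamic[i:i + window] for i in range(len(dynamic) - window)]
    )
    targets = dynamic[window:]
    return inputs, targets


# Static features for six time points, deliberately given out of order.
static_features = {t: np.array([t, 10.0 * t]) for t in (5, 0, 3, 1, 4, 2)}
dynamic = stack_static_features(static_features)
inputs, targets = sliding_windows(dynamic, window=4)
```

With six time points and a window of four, two training pairs result: steps 0 to 3 paired with step 4, and steps 1 to 4 paired with step 5.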
In one embodiment, the first model calculates a first target loss function according to the sample features and the feature labels in the sample data, and the first target loss function guides the optimization of the first model until the first model is trained. The second model calculates a second target loss function for the updated feature model branch according to the sample target features and the labeling data, and the second target loss function guides the optimization of the updated feature model branch until the updated feature model branch is trained.
In an embodiment, before the sample data is input into the first model, the feature tags missing from the nodes in the sample data may be supplemented, specifically, the triple data where the feature tag missing nodes are located may be read, and the nodes where the feature tags are missing may be supplemented according to the tags of the triple data.
In the above embodiment, by training the first model and the second model, the first model capable of performing feature extraction on the data features in the static knowledge graph and the second model capable of predicting the features after feature extraction of the first model can be obtained.
In an embodiment, the inputting the sample data into the first model, and after the first model extracts the sample data and obtains the sample feature, the method further includes: calculating a first target loss function according to the sample characteristics and the characteristic labels in the sample data, wherein the first target loss function is used for guiding and optimizing the first model until the first model is trained; inputting the sample characteristics into a second model, predicting the sample characteristics by the second model, and obtaining sample target characteristics of the sample characteristics, wherein the method further comprises the following steps: and calculating the updated characteristic model branch according to the sample target characteristic and the labeling data to obtain a second target loss function, wherein the second loss function is used for guiding optimization of the updated characteristic model branch until the updated characteristic model branch is trained.
The first loss function is a function for guiding the optimization of the parameters in the first model, and can be calculated from a true value and a predicted value, namely the feature labels in the sample data and the sample features. The second loss function is a function for guiding the optimization of the parameters in the second model, and can be calculated from the labeling data and the sample target features.
Specifically, the first model calculates the first target loss function according to the sample features after feature extraction and the feature labels in the sample data, that is, the labels carried by the nodes in the static knowledge graph. Optionally, the first target loss function may be calculated from the sample features and the feature labels in the sample data using the cross-entropy loss function. It should be noted that in other embodiments the first target loss function may be calculated from the sample features and the feature labels in the sample data using any other loss function.
Specifically, the second model calculates the second target loss function according to the labeling data and the sample target features. Optionally, the second target loss function may be calculated from the labeling data and the sample target features using the categorical_crossentropy function. It should be noted that in other embodiments the second target loss function may be calculated from the labeling data and the sample target features using any other loss function. In other embodiments, if a personal credit risk rating model needs to be trained, the labeling data of the sample data is the real credit score, so the second target loss function can be calculated from the labeling data of the sample data and the sample target features.
In the above embodiment, the first model and the second model can be optimized by calculating the first target loss function and the second target loss function respectively, so that the parameters of the trained first model and second model are more accurate.
In an embodiment, as shown in fig. 7, a risk processing method is provided, which includes the following steps:
S702, user data to be processed corresponding to the user to be predicted is obtained.
The user to be predicted refers to a client requiring credit evaluation. The user data to be processed refers to the basic information corresponding to the user to be predicted, such as identity information, property conditions, card holding information, transaction information, payment information, repayment information, loan application information, loan records, credit card records, quasi-credit card records, special transaction records, query records and other related data, as well as behavior data of the client, because information from the client's social network plays a key role in addition to information from the financial field.
Specifically, the server may fetch the to-be-processed user data corresponding to the to-be-predicted user from the client information database, and establish a plurality of static knowledge maps according to the to-be-processed user data corresponding to the to-be-predicted user.
S704, target characteristic data corresponding to the user data to be processed is obtained according to the method in any one of the embodiments.
The target characteristic data is credit scores of the clients in a credit scene, credit levels of the clients can be further obtained according to the credit scores, and optionally the credit levels of the users to be predicted can be determined through preset evaluation levels and the target characteristic data. Specifically, according to the method in any one of the above implementations, the target feature data corresponding to the to-be-processed user data is obtained based on a plurality of static knowledge maps corresponding to the to-be-predicted user.
In an embodiment, a plurality of static knowledge maps corresponding to a user to be predicted are input into a pre-trained personal credit risk rating model, wherein the pre-trained personal credit risk rating model includes a first model and a second model, a specific training process may refer to any one of the above embodiments, which is not repeated herein, and a credit score of the user to be predicted can be obtained through prediction of the personal credit risk rating model.
S706, determining the risk level of the user to be predicted according to the target characteristic data.
The risk level refers to a quantitative standard for judging the risk of the client, for example, the risk level may be a trust security user, a credit credible user, a credit suspicious user, a credit incredible user, and the like.
Specifically, after target characteristic data corresponding to the user data to be processed is obtained, the server determines the risk level of the user to be predicted according to the target characteristic data, wherein optionally, the credit level of the user to be predicted can be determined through a preset evaluation level and the target characteristic data.
In an embodiment, referring to fig. 8, fig. 8 is a schematic diagram of risk rating in an embodiment. Customer information is collected first, and a plurality of static knowledge maps, which together form a dynamic knowledge map, are then established from the customer information. The static knowledge maps include two types of maps, namely a user-loan relationship map and a user-user relationship map. Because modern transactions are increasingly completed online rather than offline, grasping customer interests and customer sentiment relies more and more on analyzing customer behavior data; a user-user relationship map is therefore established alongside the user-loan relationship map to obtain better customer insight. The dynamic knowledge map is then passed through the first model to obtain static features, namely the node embedding that represents the structure and internal features of the nodes in the map at each time t. The static features are connected in series to obtain dynamic features, the dynamic features are input into the second model, and the second model outputs the target feature data, which completes the prediction of the personal credit risk level model. The process of model training is similar to the process of model use, except that the labeling data of the sample data is known and the model branches are optimized with target loss functions: the first target loss function optimizes the first model and the second target loss function optimizes the second model until both models complete training, at which point training of the personal credit risk level model is complete.
In one embodiment, the predicted target feature data of the user to be predicted can be written into a preset data table, so that the data is updated.
In the embodiment, the prediction can be carried out according to the data of the user to be processed corresponding to the user to be predicted, the risk level can be accurately obtained, so that the loan interest rate can be formulated according to the risk level, and for banks, the risk caused by the credit default of customers can be greatly reduced.
In one embodiment, the determining the risk level of the user to be predicted according to the target feature data includes: acquiring a preset evaluation grade; and obtaining the risk level of the user to be predicted according to the target characteristic data and the evaluation level.
The evaluation level refers to a quantitative standard for measuring the credit degree of the user to be predicted, and can be set according to an actual use scene.
Specifically, the server first obtains a preset evaluation level, which specifies the credit level corresponding to each credit score range, and then obtains the risk level of the user to be predicted according to the target feature data and the evaluation level. In one embodiment, the score range preset for a trust security user is 85-100; if the target feature data is 92, the risk level of the user is trust security user.
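The lookup from a credit score to a risk level can be sketched as a threshold table. Only the 85-100 range and the level names come from the text; the lower bounds 70, 50 and 0 for the other three levels are hypothetical values chosen purely to make the example runnable.

```python
# (lower bound, risk level), ordered from the highest bound to the lowest.
# Only the 85 bound is taken from the text; 70, 50 and 0 are assumptions.
EVALUATION_LEVELS = [
    (85, "trust security user"),
    (70, "credit credible user"),
    (50, "credit suspicious user"),
    (0, "credit incredible user"),
]


def risk_level(score, levels=EVALUATION_LEVELS):
    """Return the first level whose lower bound the score reaches."""
    for lower_bound, name in levels:
        if score >= lower_bound:
            return name
    raise ValueError("score is below every preset evaluation level")


level = risk_level(92)
```

Keeping the thresholds in a table rather than in code makes it straightforward to set the evaluation level per usage scenario, as the text suggests.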
In the embodiment, the risk level of the user to be predicted is further obtained through the preset evaluation level, so that the corresponding loan interest rate can be formulated for the client according to the risk level and the characteristics of the loan product.
In one embodiment, in conjunction with fig. 9, fig. 9 is a schematic diagram of training and predicting a personal credit risk rating model in one embodiment, including the following steps:
Step one: construct a customer information base based on the dynamic knowledge graph. Basic information of a client is acquired, such as identity information, property condition, card holding information, transaction information, loan information, repayment information, loan application information, loan records, credit card records, quasi-credit card records, special transaction records, query records and other related data. In addition to information in the financial field, information from the customer's social network plays a key role. Modern transactions are increasingly completed online rather than offline, so grasping customer interests and customer sentiment relies more and more on analyzing customer behavior data. The semantics of the customer's text data, such as customer service feedback, customer comments on social media and customer survey feedback, are analyzed, customer labels are assigned, and a user profile is built. Meanwhile, better customer insight is obtained by establishing a customer relationship graph with knowledge graph technology. This includes customer interest insight, i.e., interest in loan products, for personalized product recommendation, precision marketing and the like, and customer attitude insight, i.e., satisfaction with the company and its services, suggestions for improvement and the like, so as to respond quickly to customer problems, improve customer experience, strengthen customer connections and improve customer loyalty. The above analysis is based on static relationship graphs and does not consider that the structure of a graph changes over time, although such changes are associated with risk. Therefore, time sequence information needs to be added when the knowledge graph is constructed, so that a dynamic knowledge graph is obtained. The main role of the knowledge graph is likewise to analyze relationships, especially deep relationships.
Step two: feature extraction based on the graph convolutional neural network and the recurrent neural network. The dynamic knowledge graph obtained in step one is decomposed into two parts for processing. First, feature extraction is performed on each static knowledge graph. Then, the node structures and internal features obtained at different times are connected in series to learn the dynamic features.
As a graph structure, the static knowledge graph may be subjected to feature extraction by a first model, which is a GCN in this embodiment. First, the dynamic knowledge graph obtained in step one is used as the input of the graph convolutional neural network. The graph structure comprises N nodes, each node has its own features, the features of the nodes form an N × D matrix X, and the relationships among the nodes form an N × N matrix A, also called the adjacency matrix. The matrix X and the matrix A are the inputs to the model. The static features corresponding to the static knowledge graphs are obtained through the first model, and the static features are connected in series to obtain the dynamic features.
The structure and internal features of the graph nodes at each time t are obtained through the graph convolutional neural network, and the resulting dynamic features are input into the second model. LSTM is well suited to processing time sequence information, so the second model is an LSTM in this embodiment. The input of the LSTM contains not only the current information but also previous information, i.e., it can connect previous information to the current task, for example using past information to help understand the current information. In addition, LSTM addresses the long-term dependence problem of the basic RNN: as the interval between relevant information and the point where it is needed grows, the RNN loses the ability to learn to connect information that far apart. Target feature data can be predicted through the LSTM model.
Step three: training of the personal credit risk rating model. The client features obtained in step two are taken as the input of the convolutional neural network, the personal credit risk level of the client is taken as the labeling data, and the model is trained to obtain the personal credit risk level evaluation model.
Step four: prediction of a personal credit risk rating model. When handling the personal loan service, firstly, the personal credit risk level of the client needs to be known, and at this time, the personal credit risk level of the client can be predicted only by inputting the dynamic knowledge map corresponding to the client into the model.
Step five: intelligent interest rate pricing. In the personal loan service, when a client selects a loan product, the personal credit risk level of the client obtained in step four is combined with the characteristics of the loan product to set the corresponding loan interest rate for the client. Meanwhile, follow-up survey feedback is collected from the client, the map information is updated in real time, and the model is further optimized.
In the above embodiment, time sequence information is merged into the client information when the key features are extracted, and the personal credit risk level of the client is assessed using the information in the knowledge graph at different times. The dynamic knowledge graph is processed mainly with graph convolutional neural network technology and recurrent neural network technology. With time sequence information fused in, the features produced by the graph convolutional neural network and the LSTM can greatly improve the accuracy of personal credit risk level assessment and further improve the precision of intelligent loan interest rate pricing, which for banks can greatly reduce the risk caused by customer credit default. In addition, differentiated pricing for different customer groups can increase the willingness of new customers to apply, wake up and activate dormant customers, promote the retention of existing customers, enhance the refined customer management capability of products, and improve the overall business contribution of customers. Furthermore, to address the problem that the storage and processing of financial data cannot meet current requirements, static knowledge graphs corresponding to a plurality of time points are constructed; that is, the dynamic knowledge graph stores the information of customers at different times, and graphs that incorporate time sequence information help improve the accuracy of subsequent personal credit risk level assessment and intelligent loan interest rate pricing.
In addition, after the corresponding loan interest rate is set for the client, the client's data is updated to the graph in real time through subsequent information tracking, which further optimizes and improves the model. Finally, the GCN network and the LSTM network are combined as a feature extractor, so that the features of the graph structure can be extracted accurately and quickly, and time sequence information can be integrated into the graph structure to facilitate analysis.
It should be understood that, although the steps in the flowcharts related to the embodiments as described above are sequentially displayed as indicated by arrows, the steps are not necessarily performed sequentially as indicated by the arrows. The steps are not performed in the exact order shown and described, and may be performed in other orders, unless explicitly stated otherwise. Moreover, at least a part of the steps in the flowcharts related to the embodiments described above may include multiple steps or multiple stages, which are not necessarily performed at the same time, but may be performed at different times, and the execution order of the steps or stages is not necessarily sequential, but may be rotated or alternated with other steps or at least a part of the steps or stages in other steps.
Based on the same inventive concept, an embodiment of the present application further provides a user data processing apparatus for implementing the above user data processing method. The solution provided by the apparatus is similar to that described for the method, so for specific limitations in the one or more embodiments of the user data processing apparatus below, reference may be made to the limitations of the user data processing method above; details are not repeated here.
In one embodiment, as shown in fig. 10, a user data processing apparatus is provided, including: a data acquisition module 100, a graph construction module 200, a feature extraction module 300, a feature processing module 400, and a target feature calculation module 500, wherein:
The data acquisition module 100 is configured to acquire user data to be processed, where the user data to be processed includes initial user data at different time points.
The graph construction module 200 is configured to perform graph construction on the initial user data at each time point to obtain a plurality of static knowledge graphs.
The feature extraction module 300 is configured to perform feature extraction on each of the static knowledge graphs to obtain the corresponding static features, where a static feature represents feature data obtained by fusing the features of the nodes in the static knowledge graph.
The feature processing module 400 is configured to connect the static features corresponding to the time points in series to obtain a dynamic feature.
The target feature calculation module 500 is configured to obtain target feature data corresponding to the user data to be processed according to the dynamic feature.
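The series connection performed by the feature processing module 400 can be illustrated with a minimal sketch. It assumes, as one possible reading, that "connecting in series" means ordering the per-time-point static feature vectors chronologically and stacking them into a sequence; the function name, the ISO-style time keys, and the sample values are hypothetical and do not come from the application.

```python
from typing import Dict, List, Sequence


def concatenate_static_features(
    static_by_time: Dict[str, Sequence[float]],
) -> List[List[float]]:
    """Stack static features in chronological order to form a dynamic feature.

    Assumption: time points are ISO-style strings, so lexical order is
    chronological order.
    """
    ordered_times = sorted(static_by_time)
    dims = {len(static_by_time[t]) for t in ordered_times}
    if len(dims) > 1:
        raise ValueError("all static features must have the same dimension")
    return [list(static_by_time[t]) for t in ordered_times]


# Hypothetical static features for three time points (deliberately unordered).
static_features = {
    "2022-02": [0.4, 0.9],
    "2021-12": [0.1, 0.7],
    "2022-01": [0.3, 0.8],
}
dynamic_feature = concatenate_static_features(static_features)
```

The resulting list keeps one row per time point, which is the shape a sequence model such as an LSTM would typically consume.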
In one embodiment, the graph construction module 200 includes:
a triple extraction unit, configured to extract triple data from the initial user data at each time point; and
a static graph construction unit, configured to perform graph construction based on the triple data to obtain a plurality of static knowledge graphs.
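As a rough illustration of these two units, the sketch below turns a per-time-point user record into (head, relation, tail) triples and builds one adjacency-list graph per time point. The record fields, the user identifier, and the adjacency-list representation are assumptions made for the example; the application does not prescribe a data format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def extract_triples(user_id: str, record: Dict[str, str]) -> List[Triple]:
    """Treat every field of a user record as one (user, field, value) triple."""
    return [(user_id, field, str(value)) for field, value in record.items()]


def build_static_graph(triples: List[Triple]) -> Dict[str, List[Tuple[str, str]]]:
    """Build an adjacency list: node -> list of (relation, neighbour)."""
    graph: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
        graph.setdefault(tail, [])  # make sure tail nodes exist as well
    return dict(graph)


# Hypothetical initial user data at two time points.
snapshots = {
    "2021-12": {"employer": "AcmeCo", "loan_status": "current"},
    "2022-01": {"employer": "AcmeCo", "loan_status": "overdue"},
}
static_graphs = {
    time_point: build_static_graph(extract_triples("user_1", record))
    for time_point, record in snapshots.items()
}
```

Each entry of `static_graphs` corresponds to one static knowledge graph; together they play the role of the dynamic knowledge graph described above.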
In one embodiment, the graph construction module 200 further includes:
a label supplementing unit, configured to, when a node in a static knowledge graph is missing a label, read the triple data in which the label-missing node is located and supplement the label of that node according to the label of the triple data.
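One way such label supplementation might work is sketched below. It assumes that the relation of a triple implies the labels of its head and tail nodes through a small schema; this schema (`RELATION_LABELS`), the relation names, and the labels are all hypothetical, since the application does not say how the triple's label is defined.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical schema: relation -> (label of head node, label of tail node).
RELATION_LABELS: Dict[str, Tuple[str, str]] = {
    "works_at": ("Person", "Company"),
    "holds": ("Person", "Account"),
}


def supplement_labels(
    node_labels: Dict[str, Optional[str]], triples: List[Triple]
) -> Dict[str, Optional[str]]:
    """Fill in missing node labels from the triples the nodes appear in."""
    filled = dict(node_labels)
    for head, relation, tail in triples:
        if relation not in RELATION_LABELS:
            continue  # no label information available for this relation
        head_label, tail_label = RELATION_LABELS[relation]
        if filled.get(head) is None:
            filled[head] = head_label
        if filled.get(tail) is None:
            filled[tail] = tail_label
    return filled


labels = {"user_1": "Person", "AcmeCo": None}
triples = [("user_1", "works_at", "AcmeCo")]
completed = supplement_labels(labels, triples)
```

Existing labels are never overwritten in this sketch; only nodes whose label is missing are updated.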
In one embodiment, the feature extraction module 300 includes:
a feature matrix extraction unit, configured to obtain, from the plurality of static knowledge graphs, the feature matrix corresponding to each static knowledge graph; and
a static feature extraction unit, configured to perform feature fusion on each feature matrix to obtain the static features corresponding to the plurality of static knowledge graphs.
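Since the description attributes graph feature extraction to a graph convolutional network, the fusion step can be sketched as a single GCN-style propagation followed by pooling. This is a simplified illustration under stated assumptions: the learned weight matrix and non-linearity of a real GCN layer are omitted, and mean pooling is an assumed choice of fusion that the application does not specify.

```python
import math
from typing import List

Matrix = List[List[float]]


def gcn_propagate(adj: Matrix, feats: Matrix) -> Matrix:
    """One propagation step D^-1/2 (A + I) D^-1/2 X without learned weights."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    dim = len(feats[0])
    return [
        [sum(norm[i][k] * feats[k][d] for k in range(n)) for d in range(dim)]
        for i in range(n)
    ]


def fuse(node_feats: Matrix) -> List[float]:
    """Mean-pool the node features into a single static feature vector."""
    n = len(node_feats)
    return [sum(row[d] for row in node_feats) / n for d in range(len(node_feats[0]))]


# Two connected nodes with one-dimensional features (hypothetical values).
adjacency = [[0.0, 1.0], [1.0, 0.0]]
feature_matrix = [[1.0], [3.0]]
static_feature = fuse(gcn_propagate(adjacency, feature_matrix))
```

After propagation each node carries a blend of its own and its neighbour's features, so the pooled vector reflects the graph structure rather than the raw node attributes alone.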
In one embodiment, the feature processing module 400 includes:
a retention feature extraction unit, configured to perform feature extraction on the dynamic feature to obtain a retention feature;
an update feature extraction unit, configured to update the retention feature to obtain an updated feature; and
a feature calculation unit, configured to calculate the target feature data according to the updated feature.
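Given that the description names an LSTM for processing the time sequence, these three units can plausibly be read as the forget, input, and output stages of an LSTM cell. The scalar sketch below follows that reading; the mapping of units to gates, the scalar simplification, and the fixed weights are assumptions for illustration, not the trained second model.

```python
import math
from typing import Dict, Sequence, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(
    x: float, h_prev: float, c_prev: float, w: Dict[str, float]
) -> Tuple[float, float]:
    """One scalar LSTM step returning (target feature, updated cell state)."""
    # Retention: the forget gate decides how much of the old state is kept.
    forget = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])
    retained = forget * c_prev
    # Update: the input gate adds new candidate information to what was kept.
    gate_in = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])
    candidate = math.tanh(w["wc"] * x + w["uc"] * h_prev + w["bc"])
    updated = retained + gate_in * candidate
    # Target: the output gate computes the feature exposed at this step.
    gate_out = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])
    target = gate_out * math.tanh(updated)
    return target, updated


def run_lstm(sequence: Sequence[float], w: Dict[str, float]) -> float:
    """Feed a dynamic feature (one value per time point) through the cell."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, w)
    return h


# Hypothetical fixed weights; in practice they would be learned.
WEIGHT_NAMES = ["wf", "uf", "bf", "wi", "ui", "bi", "wc", "uc", "bc", "wo", "uo", "bo"]
WEIGHTS = {name: 0.5 for name in WEIGHT_NAMES}
target_feature = run_lstm([0.2, 0.5, 0.9], WEIGHTS)
```

A real implementation would operate on vectors and use a library layer; the scalar form is only meant to make the retain, update, and compute sequence visible.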
In one embodiment, the apparatus further includes:
a sample acquisition module, configured to acquire sample data, where the sample data carries annotation data and feature labels;
a sample feature acquisition module, configured to input the sample data into a first model, which performs extraction on the sample data to obtain sample features;
a first model optimization module, configured to calculate a first target loss function from the sample features and the feature labels, where the first target loss function is used to optimize the first model until training of the first model is complete;
a sample target feature calculation module, configured to input the sample features into a second model, which performs prediction on the sample features to obtain sample target features; and
a second model optimization module, configured to calculate a second target loss function from the sample target features and the annotation data, where the second target loss function is used to optimize the second model until training of the second model is complete.
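The two-stage training flow can be sketched with toy stand-ins for the two models. Here each model is a single scalar weight trained by gradient descent on a mean squared error; the loss choice, the learning rate, and the sample values are assumptions, and the real first and second models would be the graph and sequence networks described above.

```python
from typing import List


def train_scalar_model(
    inputs: List[float], targets: List[float], lr: float = 0.01, epochs: int = 500
) -> float:
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(inputs, targets)) / len(inputs)
        w -= lr * grad
    return w


# Hypothetical sample data carrying feature labels and annotation data.
sample_data = [1.0, 2.0, 3.0]
feature_labels = [2.0, 4.0, 6.0]   # supervise the first model
annotation_data = [1.0, 2.0, 3.0]  # supervise the second model

# Stage 1: optimise the first model against the feature labels.
w_first = train_scalar_model(sample_data, feature_labels)
sample_features = [w_first * x for x in sample_data]

# Stage 2: optimise the second model on the first model's output.
w_second = train_scalar_model(sample_features, annotation_data)
sample_target_features = [w_second * f for f in sample_features]
```

The point of the sketch is the ordering: the second model is trained on features produced by the already trained first model, each stage with its own loss.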
In one embodiment, as shown in fig. 11, a risk processing apparatus is provided, including: a to-be-processed user data acquisition module 600, a risk prediction module 700, and a risk level determination module 800, wherein:
The to-be-processed user data acquisition module 600 is configured to acquire the user data to be processed corresponding to a user to be predicted.
The risk prediction module 700 is configured to obtain target feature data corresponding to the user data to be processed by means of the apparatus in any of the above embodiments.
The risk level determination module 800 is configured to determine the risk level of the user to be predicted according to the target feature data.
In one embodiment, the risk level determination module 800 includes:
an evaluation level acquisition unit, configured to acquire preset evaluation levels; and
a level determination unit, configured to obtain the risk level of the user to be predicted according to the target feature data and the evaluation levels.
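A minimal sketch of how target feature data might be compared against preset evaluation levels follows. It assumes the target feature data reduces to a single risk score in [0, 1] and that the evaluation levels are given as ascending score thresholds; the thresholds and level names are hypothetical.

```python
from bisect import bisect_right
from typing import List

# Hypothetical preset evaluation levels: ascending upper bounds and names.
THRESHOLDS: List[float] = [0.3, 0.6, 0.8]
LEVEL_NAMES: List[str] = ["low", "medium", "high", "very high"]


def risk_level(score: float) -> str:
    """Map a risk score to the preset level whose range contains it."""
    return LEVEL_NAMES[bisect_right(THRESHOLDS, score)]


level = risk_level(0.72)
```

With these thresholds a score equal to a boundary value falls into the higher level, which is one possible convention among several.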
The modules in the user data processing apparatus and the risk processing apparatus described above may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or be independent of, a processor of the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server and whose internal structure may be as shown in fig. 12. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device stores the user data to be processed, including the user data to be processed that corresponds to a user to be predicted. The network interface of the computer device is used to communicate with an external terminal through a network connection. When executed by the processor, the computer program implements a user data processing method and a risk processing method.
Those skilled in the art will appreciate that the structure shown in fig. 12 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the above-described method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
Those skilled in the art will understand that all or part of the processes of the above method embodiments may be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases. The processors referred to in the embodiments provided herein may be, without limitation, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, and the like.
For brevity, not all possible combinations of the technical features in the above embodiments are described; however, any combination of these technical features should be considered within the scope of this disclosure as long as it contains no contradiction.
The above embodiments express only several implementations of the present application. Although their description is relatively specific and detailed, it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the appended claims.

Claims (14)

1. A method of processing user data, the method comprising:
acquiring user data to be processed, wherein the user data to be processed comprises initial user data at different time points;
performing map construction on the initial user data of each time point to obtain a plurality of static knowledge maps;
respectively extracting the features of the static knowledge graphs to obtain static features corresponding to the static knowledge graphs, wherein the static features are used for representing feature data that fuses the features of each node in the static knowledge graphs;
connecting the static characteristics corresponding to each time point in series to obtain dynamic characteristics;
and obtaining target characteristic data corresponding to the user data to be processed according to the dynamic characteristics.
2. The method of claim 1, wherein performing graph construction on the initial user data of each of the time points to obtain a plurality of static knowledge graphs comprises:
extracting the initial user data of each time point to obtain triple data;
and carrying out map construction based on the triple data to obtain a plurality of static knowledge maps.
3. The method of claim 2, wherein after performing graph construction based on the triple data to obtain a plurality of static knowledge graphs, the method further comprises:
when a node in a static knowledge graph is missing a label, reading the triple data in which the label-missing node is located, and supplementing the label of the label-missing node according to the label of the triple data.
4. The method according to claim 1, wherein the extracting features of the plurality of static knowledge maps to obtain the static features corresponding to the plurality of static knowledge maps comprises:
obtaining a plurality of characteristic matrixes corresponding to the static knowledge maps according to the plurality of static knowledge maps;
and performing feature fusion on each feature matrix to obtain static features corresponding to the plurality of static knowledge maps.
5. The method according to claim 1, wherein the obtaining target feature data corresponding to the user data to be processed according to the dynamic feature comprises:
performing feature extraction on the dynamic features to obtain retention features;
updating the retention feature to obtain an updated feature;
and calculating to obtain the target characteristic data according to the updated characteristic.
6. The method according to claim 1, wherein the feature extraction of the plurality of static knowledge maps to obtain the static features corresponding to the plurality of static knowledge maps is implemented by a pre-trained first model;
and obtaining target characteristic data corresponding to the user data to be processed according to the dynamic characteristics is realized through a pre-trained second model.
7. The method of claim 6, wherein the training process of the first model and the second model comprises:
acquiring sample data, wherein the sample data carries marked data and a characteristic label;
inputting the sample data into a first model, and extracting the sample data through the first model to obtain sample characteristics;
calculating a first target loss function according to the sample characteristics and the characteristic labels, wherein the first target loss function is used for optimizing the first model until the first model is trained;
inputting the sample characteristics into a second model, and predicting the sample characteristics through the second model to obtain sample target characteristics of the sample characteristics;
and calculating a second target loss function according to the sample target characteristics and the labeling data, wherein the second target loss function is used for optimizing the second model until the second model is trained.
8. A method of risk management, the method comprising:
acquiring user data to be processed corresponding to a user to be predicted;
obtaining target characteristic data corresponding to the user data to be processed according to the method of any one of claims 1 to 7;
and determining the risk level of the user to be predicted according to the target characteristic data.
9. The method of claim 8, wherein determining the risk level of the user to be predicted according to the target feature data comprises:
acquiring a preset evaluation grade;
and obtaining the risk level of the user to be predicted according to the target characteristic data and the evaluation level.
10. A user data processing apparatus, characterized in that said apparatus comprises:
the data acquisition module is used for acquiring user data to be processed, wherein the user data to be processed comprises initial user data at different time points;
the map construction module is used for carrying out map construction on the initial user data of each time point to obtain a plurality of static knowledge maps;
the characteristic extraction module is used for respectively extracting the characteristics of the static knowledge maps to obtain the static characteristics corresponding to the static knowledge maps, and the static characteristics are used for representing feature data that fuses the features of each node in the static knowledge maps;
the characteristic processing module is used for connecting the static characteristics corresponding to the time points in series to obtain dynamic characteristics;
and the target characteristic calculation module is used for obtaining target characteristic data corresponding to the user data to be processed according to the dynamic characteristics.
11. A risk processing apparatus, characterized in that the apparatus comprises:
a to-be-processed user data acquisition module, configured to acquire to-be-processed user data corresponding to a to-be-predicted user;
a risk prediction module, configured to obtain target feature data corresponding to the to-be-processed user data according to the apparatus of claim 10;
and the risk grade judging module is used for determining the risk grade of the user to be predicted according to the target characteristic data.
12. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 9 when executing the computer program.
13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 9.
14. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 9 when executed by a processor.
CN202210152732.7A 2022-02-18 2022-02-18 User data processing method, device, computer equipment and storage medium Pending CN114529399A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210152732.7A CN114529399A (en) 2022-02-18 2022-02-18 User data processing method, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210152732.7A CN114529399A (en) 2022-02-18 2022-02-18 User data processing method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114529399A true CN114529399A (en) 2022-05-24

Family

ID=81622211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210152732.7A Pending CN114529399A (en) 2022-02-18 2022-02-18 User data processing method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114529399A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115545300A (en) * 2022-09-30 2022-12-30 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Method and device for predicting user behavior based on graph neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination