CN116883157A - Small sample credit assessment method and system based on metric learning - Google Patents

Small sample credit assessment method and system based on metric learning

Info

Publication number
CN116883157A
Authority
CN
China
Prior art keywords
network
class
ternary
user
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311148690.0A
Other languages
Chinese (zh)
Inventor
许扬汶
韩冬
刘天鹏
李楠
孟祥宇
顾阜城
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Big Data Group Co ltd
Original Assignee
Nanjing Big Data Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Big Data Group Co ltd
Priority to CN202311148690.0A
Publication of CN116883157A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Business, Economics & Management (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Finance (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Probability & Statistics with Applications (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a small sample credit evaluation method and system based on metric learning. The method comprises the following steps: analyzing and preprocessing data; performing a small sample classification task in each episode using a task-based episode training strategy, calculating class-related prototypes with a preheating network, and training the preheating network with a preheating loss function until convergence; freezing the parameters of the preheating network and using them as the initial values of a ternary network, calculating task-related prototypes with the ternary network and an attention mechanism, training the ternary network with a ternary loss function, and thereby optimizing the embedded network; and calculating the similarity between the user index vector of an unknown sample and each task-related prototype with the trained embedded network, the class of the prototype with the highest similarity being the credit class of the user. The method combines neural networks with metric learning and can improve the accuracy and robustness of model prediction under small-sample conditions.

Description

Small sample credit assessment method and system based on metric learning
Technical Field
The invention relates to the field of user credit evaluation, in particular to a small sample credit evaluation method and system based on metric learning.
Background
With the rapid development of internet finance, online lending platforms have gradually become a focus of public attention, attracting investors from many industries and offering very broad industry prospects. To ensure financial security, user credit assessment is a vital link: it helps assess the credit risk of the applicant and provides a basis for the decisions of the issuing bank. Traditional machine learning models such as logistic regression and decision trees can no longer handle increasingly complex user scenarios well.
The neural network-based method can accurately analyze various credit evaluation indexes, so that the neural network-based method is widely applied to the field of internet finance. The neural network-based method utilizes a deep learning algorithm, and simulates the working principle of the human brain through a multi-level neural network, so that accurate modeling and prediction of complex data are realized. This approach can handle large-scale non-linear data and automatically learn feature representations to better capture the correlation between user behavior and credit status. However, in a real environment, due to the restriction of manpower, material resources and objective factors, data acquisition is often very difficult, so that the sample size is sparse, i.e. a problem of small samples is faced, which seriously affects the performance of the traditional neural network method.
Metric learning is also an important learning method in machine learning, and mainly focuses on defining and learning similarity metrics between samples, and by effectively measuring the distance or similarity between samples, samples of different classes can be better distinguished. The similarity measure can capture the internal structure and characteristics in the data, and help to improve the performance of tasks such as classification, clustering and the like. However, conventional metric learning methods often rely on a large number of training samples, which typically require a large amount of training data to learn the similarity metric. In small sample learning, because of the limited number of samples per class, sufficient data may not be provided to support accurate learning of similarity metrics, which may result in a metric learning model that performs poorly in small sample scenarios.
Disclosure of Invention
Purpose of the invention: the invention aims to provide a small sample credit assessment method and system based on metric learning which combine a neural network with metric learning and can improve small-sample learning performance and model prediction accuracy.
The technical scheme is as follows: the invention discloses a small sample credit evaluation method based on metric learning, which comprises the following steps:
(1) Analyzing and preprocessing data;
(2) Performing a small sample classification task in each episode by using a task-based episode training strategy, calculating a class-related prototype by using a neural network as the preheating network, and training the preheating network based on a preheating loss function until convergence;
(3) Freezing parameters of a preheating network, taking the parameters as initial values of a ternary network, calculating a task related prototype by using the ternary network and an attention mechanism, training the ternary network based on a ternary loss function, and optimizing an embedded network;
(4) Calculating the similarity between the user index vector of the unknown sample and each task related prototype by using the trained embedded network, wherein the class of the task related prototype with the highest similarity is the credit class of the user.
Preferably, the data analysis includes analyzing the user information into a plurality of credit indexes and performing quantization processing to form a user index vector, and splicing the user index vector with the user credit rating label to form a user behavior vector.
Preferably, in step (2), a small sample task $T_i = \{S_i, Q_i\}$ is performed for each episode, where $T_i$ is the $i$-th subtask of the preheating stage, $S_i$ is the support set of the $i$-th subtask, and $Q_i$ is the query set of the $i$-th subtask. The class-related prototype of class $c$ in the small sample classification task is calculated as follows:
$$p_c = \frac{1}{|S_i^c|} \sum_{(x_t, y_t) \in S_i^c} f_\theta(x_t)$$
where $p_c$ denotes the class-related prototype of sample class $c$ in the feature space, $S_i^c$ denotes the subset of the support set $S_i$ whose class is $c$, $|S_i^c|$ denotes the size of $S_i^c$, $x_t$ is the user index vector of sample $t$ in $S_i^c$, $y_t$ is the user credit rating label of that sample, and $f_\theta(\cdot)$ is the preheating network.
Preferably, in step (2), a ResNet12 neural network is used as the preheating network $f_\theta(\cdot)$.
Preferably, in step (2), a preheating loss function $L_H$ is constructed to iteratively train the preheating network. In the preheating loss function, $x_q$ is the user index vector of sample $q$ in the query set $Q_i$ and $y_q$ is the user grade label of that sample; each class contributes a class weight and a normalized classification function.
For class $c$, the class weight is obtained from $d_c$ through the softmax function $s(\cdot)$, where $d_c$ is the sum of the distances between the class-related prototype of class $c$ and the other samples of the same class.
Preferably, the calculating task related prototypes in step (3) using the ternary network and the attention mechanism includes: calculating a category-related prototype using a ternary network; the category-related prototypes are converted into task-related prototypes using an attention mechanism.
Preferably, the class-related prototype $p'_c$ is calculated using the ternary network as follows:
$$p'_c = \frac{1}{|S_i^c|} \sum_{(x_t, y_t) \in S_i^c} g_\phi(x_t)$$
where $g_\phi(\cdot)$ is the ternary network.
Preferably, the class-related prototypes are converted into task-related prototypes by the attention mechanism. In this conversion, the input is the set of class-related prototypes of all classes and the output is the set of task-related prototypes of all classes, with one task-related prototype for each class $c$; $A$ is the weight matrix of the attention mechanism; the value space, the query space and the key space of the attention mechanism each have a learnable mapping parameter; $m$ is the spatial dimension of the mapping parameters; $s(\cdot)$ is the softmax function; and the result passes through a linear layer normalization function.
Preferably, in step (3), a ternary loss function $L_R$ is constructed to iteratively train the ternary network. The ternary loss function involves a scaling coefficient; the task-related prototype of each class $c$; a positive sample, whose user credit rating class is the same as $c$; a negative sample, whose user credit rating class is different from $c$; the ternary network $g_\phi(\cdot)$; and the separation distance between the positive and negative samples.
Preferably, in step (4), a similarity calculation formula gives the similarity between the user index vector of the unknown sample and the task related prototype of each class.
The invention relates to a small sample credit evaluation system based on metric learning, which comprises:
the data analysis and preprocessing module is used for analyzing the credit index of the user, carrying out quantitative assignment and preprocessing;
the preheating network training module is used for performing a small sample classification task in each episode by using a task-based episode training strategy, calculating a class-related prototype by using the preheating network, and training the preheating network based on a preheating loss function until convergence;
the embedded network optimization module is used for freezing parameters of the preheating network, taking the parameters as initial values of the ternary network, calculating a task related prototype by utilizing the ternary network and an attention mechanism, and training the ternary network based on a ternary loss function;
and the credit evaluation module is used for calculating the similarity between the user index vector of the unknown sample and each task related prototype by using the trained embedded network, wherein the class of the task related prototype with the highest similarity is the credit class of the user.
Beneficial effects: compared with the prior art, the invention has the following notable advantages. The preheating network module lets the network adapt to small-sample task data from the start, improving its convergence speed and performance. The ternary network module, based on metric learning, uses the separation between positive and negative samples to widen the gaps between different classes, so that the sample distribution space is more discriminative, prediction accuracy improves, and confusion between classes is reduced. The attention mechanism captures correlations between tasks: by calculating task-related prototypes, the network can weight the features of different tasks and fuse their information better in the credit rating task, which improves the accuracy and robustness of model prediction under small-sample conditions.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a method framework of the present invention;
FIG. 3 is a graph showing experimental results of different methods according to the embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings.
For the task of discriminating the credit rating of the user given the user information, it can be regarded as training an efficient classification model by using a training set, and then using the model to perform classification detection on the user information. As shown in FIG. 1, the small sample credit evaluation method based on metric learning comprises a data preprocessing stage, a model iterative training stage and a prediction evaluation stage. The method specifically comprises the following steps:
(1) A data preprocessing stage comprising:
(1.1) collecting a data set and analyzing the data set.
The dataset is made up of several records, each record including user information. And analyzing the user information into six major categories including personal basic information, personal work information, personal economic information, personal borrowing information, personal crime records and personal third party credit records, wherein each category is subdivided into a plurality of credit indexes and is subjected to quantization processing. The specific user credit index division is shown in table 1.
TABLE 1 user Credit index dividing Table
The quantitative processing was performed on the non-quantitative credit index in table 1, and the quantitative value of each credit index is shown in table 2.
Table 2 quantized values of credit indices
In this embodiment, for a piece of user information record, the values in table 3 below can be obtained.
Table 3 user information records and quantized values thereof in the embodiment
Specific credit index | Index value | Quantized value
Age | 26 | 26
Marital status | Married | 5
Education level | Postgraduate and above | 5
Household registration | Eastern region | 5
Housing condition | Loan without house | 5
Working properties | External enterprise and marketing unit | 5
Job position | Deputy manager/technical expert | 5
Years of service | 3 | 3
Total annual personal income | 18w | 18
Total annual income of family | 30w | 30
Loan amount | 0 | 0
Loan period | 0 | 0
Criminal record | No criminal record | 5
Whether a bank loan has been applied for | No bank loan application record | 5
Whether a credit card has been taken out | No credit card record | 5
Credit card borrowing times | 0 | 0
Credit card repayment times | 0 | 0
Number of overdue credit card payments | 0 | 0
Credit rating | a | 6
(1.2) Dirty-data cleaning, missing-value completion and normalization are performed on the quantized values of the user indexes. The 19 processed credit indexes are stored in order in a user index vector $x$ of size 1 × 19, and each element of the vector is the attribute value of one credit index.
(1.3) The user index vector $x$ is spliced with the credit rating label $y$ to form a user behavior vector of size 1 × 20, in which the first 19 elements are the 19 credit index attribute values and the last element is the user credit rating label. In this embodiment, the user credit rating label is 0, which indicates that the user's credit is excellent.
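Steps (1.2) and (1.3) can be sketched as follows. This is a minimal illustration, not the patented implementation: the text does not state which normalization is used, so min-max scaling is an assumption, and the function name `build_behavior_vectors` and the random input data are hypothetical.

```python
import numpy as np

def build_behavior_vectors(index_matrix, labels):
    """Normalize each credit index column and append the credit rating label.

    Assumption: min-max scaling to [0, 1]; the source only says "normalization".
    """
    x = np.asarray(index_matrix, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # constant columns map to 0 instead of dividing by zero
    x_norm = (x - col_min) / col_range
    y = np.asarray(labels, dtype=float).reshape(-1, 1)
    # Splice the 1 x 19 index vector with the label to get a 1 x 20 behavior vector per user.
    return np.hstack([x_norm, y])

rng = np.random.default_rng(0)
raw = rng.integers(0, 31, size=(4, 19))   # 4 hypothetical users, 19 quantized indexes each
behavior = build_behavior_vectors(raw, [0, 1, 2, 0])
print(behavior.shape)  # (4, 20)
```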
(2) And in the model iterative training stage, the embedded network is optimized, and the process is shown in fig. 2. The method specifically comprises the following steps:
(2.1) calculating a class-related prototype using a preheat network, training the preheat network based on a preheat loss function until convergence.
(2.1.1) The preprocessed data is used as the training set and fed to the model. ResNet12 is first randomly initialized as the preheating network $f_\theta(\cdot)$, and class-related prototypes are calculated with the preheating network. The specific steps are as follows:
For each episode, a task-based episode training strategy randomly draws N classes from the training set and K samples from each class to form a support set S, and then draws part of the remaining samples of those N classes as a query set Q. The resulting classification problem is called an N-way K-shot small sample task, and the whole training task consists of many such small sample tasks. A small sample task $T_i = \{S_i, Q_i\}$ is performed for each episode, where $T_i$ is the $i$-th subtask of the preheating stage, $S_i$ is the support set of the $i$-th subtask, and $Q_i$ is the query set of the $i$-th subtask. The class-related prototype of class $c$ in the support set $S_i$ of the small sample classification task is calculated as:
$$p_c = \frac{1}{|S_i^c|} \sum_{(x_t, y_t) \in S_i^c} f_\theta(x_t)$$
where $p_c$ denotes the class-related prototype of sample class $c$ in the feature space, $S_i^c$ denotes the subset of the support set $S_i$ whose class is $c$, $|S_i^c|$ denotes the size of $S_i^c$, $x_t$ is the user index vector of sample $t$ in $S_i^c$, and $y_t$ is the user credit rating label of that sample.
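The N-way K-shot episode construction and the prototype computation described above can be sketched as follows. The sketch is illustrative only: a fixed random linear map with a tanh stands in for the ResNet12 preheating network, and the names `sample_episode`, `class_prototypes` and `embed` are hypothetical.

```python
import numpy as np

def sample_episode(y, n_way, k_shot, n_query, rng):
    """Draw one N-way K-shot task: indices of the support set and the query set."""
    classes = rng.choice(np.unique(y), size=n_way, replace=False)
    support_idx, query_idx = [], []
    for c in classes:
        idx = rng.permutation(np.flatnonzero(y == c))
        support_idx.extend(idx[:k_shot])                   # K support samples per class
        query_idx.extend(idx[k_shot:k_shot + n_query])     # queries come from the remainder
    return classes, np.array(support_idx), np.array(query_idx)

def class_prototypes(embeddings, labels, classes):
    """Class-related prototype: mean embedding of the support samples of each class."""
    return np.stack([embeddings[labels == c].mean(axis=0) for c in classes])

rng = np.random.default_rng(0)
x = rng.normal(size=(60, 19))            # 60 hypothetical user index vectors
y = np.repeat(np.arange(6), 10)          # 6 credit classes, 10 users each
w = rng.normal(size=(19, 8))             # stand-in for the preheating network weights
embed = lambda v: np.tanh(v @ w)         # hypothetical embedding, NOT ResNet12

classes, s_idx, q_idx = sample_episode(y, n_way=5, k_shot=5, n_query=3, rng=rng)
protos = class_prototypes(embed(x[s_idx]), y[s_idx], classes)
print(protos.shape)  # (5, 8)
```

Sampling the query set only from samples left over after the support draw keeps the two sets disjoint, which matches the description of drawing queries from "the rest samples" of the N classes.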
(2.1.2) Class weights are calculated from the distances between the prototype and the same-class samples in each class.
For each class $c$, $d_c$ is the sum of the distances between the class-related prototype and the other samples of the same class, and the weight of class $c$ is obtained from $d_c$ through the softmax function $s(\cdot)$.
(2.1.3) For a new sample $x_q$ from the query set $Q_i$, a normalized classification function is obtained for each class $c$ by distance discrimination, where $x_q$ is the user index vector of sample $q$ in the query set $Q_i$.
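One common way to realize such a distance-based normalized classification function is a softmax over negative distances to the prototypes, as in prototypical networks. The sketch below assumes squared Euclidean distance; the text does not specify the distance or the exact normalization, so both are assumptions, and `classify_by_distance` is a hypothetical name.

```python
import numpy as np

def classify_by_distance(query_emb, prototypes):
    """Softmax over negative squared Euclidean distances (assumed form)."""
    # d[q, c] = squared distance between query q and the prototype of class c
    d = ((query_emb[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    logits = -d
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

prototypes = np.array([[0.0, 0.0], [3.0, 3.0]])
queries = np.array([[0.1, -0.2], [2.8, 3.1]])
probs = classify_by_distance(queries, prototypes)
print(probs.argmax(axis=1))  # [0 1]
```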
(2.1.4) iteratively training the preheat network using a preheat loss function of the preheat network until the model converges.
The preheating loss function $L_H$ of the preheating network is defined over the query set, where $x_q$ is the user index vector of sample $q$ in the query set $Q_i$, $y_q$ is the user grade label of that sample, and each class carries its class weight.
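A class-weighted loss of this kind can be sketched as a weighted cross-entropy. The exact form of $L_H$ is not recoverable from the text, so everything here is an assumption: the weights are taken as the softmax of the intra-class distance sums $d_c$ (Euclidean distance), labels are assumed to be class indexes 0..N-1, and the names `class_weights` and `weighted_loss` are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def class_weights(support_emb, support_labels, prototypes):
    """Assumed form: softmax over d_c, the summed distance of class-c samples to their prototype."""
    d = np.array([
        np.linalg.norm(support_emb[support_labels == c] - prototypes[c], axis=1).sum()
        for c in range(len(prototypes))
    ])
    return softmax(d)

def weighted_loss(probs, query_labels, weights):
    """Cross-entropy on the query set, each term scaled by the weight of its true class."""
    picked = probs[np.arange(len(query_labels)), query_labels]
    return float(-(weights[query_labels] * np.log(picked)).mean())

support_emb = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [4.0, 4.0]])
support_labels = np.array([0, 0, 1, 1])
prototypes = np.array([[0.0, 1.0], [4.0, 4.0]])
w = class_weights(support_emb, support_labels, prototypes)

probs = np.array([[0.9, 0.1], [0.2, 0.8]])   # hypothetical classifier outputs for two queries
loss = weighted_loss(probs, np.array([0, 1]), w)
print(w[0] > w[1], loss > 0)  # True True
```

Under this assumed form, a class whose samples are scattered far from their prototype (large $d_c$) receives a larger weight, so the loss focuses on the classes that are harder to represent.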
And (2.2) freezing parameters of the preheating network, taking the parameters as initial values of a ternary network, calculating a task related prototype by utilizing the ternary network and an attention mechanism, training the ternary network based on a ternary loss function until convergence, and optimizing an embedded network. The method specifically comprises the following steps:
(2.2.1) The parameters of the preheating network are frozen and used as the initial values of the ternary network, and the class-related prototype $p'_c$ is calculated using the ternary network as follows:
$$p'_c = \frac{1}{|S_i^c|} \sum_{(x_t, y_t) \in S_i^c} g_\phi(x_t)$$
where $g_\phi(\cdot)$ is the ternary network. Based on metric learning, the ternary network module uses the separation between positive and negative samples to widen the gaps between different classes, so that the sample distribution space is more discriminative, prediction accuracy improves, and confusion between classes is reduced.
(2.2.2) converting the category related prototypes into task related prototypes using an attention mechanism:
In this conversion, the input is the set of class-related prototypes of all classes and the output is the set of task-related prototypes of all classes, with one task-related prototype for each class $c$; $A$ is the weight matrix of the attention mechanism; the value space, the query space and the key space of the attention mechanism each have a learnable mapping parameter; $m$ is the spatial dimension of the mapping parameters; $s(\cdot)$ is the softmax function; and the result passes through a linear layer normalization function.
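The ingredients listed here (query, key and value mappings, a softmax weight matrix scaled by the dimension $m$, and a layer normalization) match standard scaled dot-product self-attention over the set of prototypes. The sketch below assumes that standard form with a residual connection; the actual formula is not recoverable from the text, and the names `task_related_prototypes`, `w_q`, `w_k` and `w_v` are hypothetical.

```python
import numpy as np

def softmax_rows(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def layer_norm(h, eps=1e-5):
    mu = h.mean(axis=1, keepdims=True)
    var = h.var(axis=1, keepdims=True)
    return (h - mu) / np.sqrt(var + eps)

def task_related_prototypes(p, w_q, w_k, w_v):
    """Assumed form: A = softmax(Q K^T / sqrt(m)); output = LayerNorm(P + A V)."""
    m = w_k.shape[1]                                   # dimension of the query/key space
    a = softmax_rows((p @ w_q) @ (p @ w_k).T / np.sqrt(m))
    return layer_norm(p + a @ (p @ w_v)), a

rng = np.random.default_rng(0)
p = rng.normal(size=(5, 8))        # 5 class-related prototypes of dimension 8
w_q = rng.normal(size=(8, 4))      # hypothetical learnable mappings, here random
w_k = rng.normal(size=(8, 4))
w_v = rng.normal(size=(8, 8))      # value map keeps the prototype dimension for the residual
task_p, a = task_related_prototypes(p, w_q, w_k, w_v)
print(task_p.shape, a.shape)  # (5, 8) (5, 5)
```

Each row of `a` weights how much every other class prototype in the current task contributes to one class's task-related prototype, which is what makes the result task-dependent.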
(2.2.3) constructing a ternary loss function, and performing iterative training on the ternary network until the model converges.
The ternary loss function $L_R$ involves a scaling coefficient; the task-related prototype of each class $c$; a positive sample, whose user credit rating class is the same as $c$; a negative sample, whose user credit rating class is different from $c$; the ternary network $g_\phi(\cdot)$; and the separation distance between the positive and negative samples.
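These ingredients correspond to a margin-based triplet loss with the task-related prototype as the anchor. A minimal sketch under that assumption follows; the exact form of $L_R$ is not recoverable from the text, Euclidean distance is assumed, and the inputs are taken to be embeddings already produced by the ternary network. The name `triplet_loss` and its parameters are hypothetical.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0, scale=1.0):
    """Assumed hinge form: scale * mean(max(0, d(a, p) - d(a, n) + margin))."""
    d_pos = np.linalg.norm(anchor - positive, axis=1)   # prototype to same-class sample
    d_neg = np.linalg.norm(anchor - negative, axis=1)   # prototype to different-class sample
    return float(scale * np.maximum(0.0, d_pos - d_neg + margin).mean())

anchor = np.array([[0.0, 0.0], [0.0, 0.0]])      # task-related prototypes (embedded)
positive = np.array([[0.0, 1.0], [0.0, 1.0]])    # embedded positive samples
negative = np.array([[0.0, 5.0], [0.0, 1.5]])    # embedded negative samples
print(triplet_loss(anchor, positive, negative))  # 0.25
```

The first triplet already satisfies the margin and contributes zero; the second has its negative only 0.5 farther than its positive, so it is penalized by 0.5, giving a mean of 0.25.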
(3) Predictive evaluation stage: the similarity between the user index vector of the unknown query sample and the task related prototype of each class is calculated with a similarity calculation formula, and the class with the highest similarity is taken as the credit class of the user.
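The prediction step can be sketched as a nearest-prototype lookup. The similarity measure is not recoverable from the text, so cosine similarity is used here purely as an assumption, and the query is assumed to be already embedded by the trained network. The name `predict_credit_class` is hypothetical.

```python
import numpy as np

def predict_credit_class(query_emb, prototypes, class_ids):
    """Return the class whose task-related prototype is most similar to the query.

    Assumption: cosine similarity between the embedded query and each prototype.
    """
    q = query_emb / np.linalg.norm(query_emb)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = p @ q
    return class_ids[int(np.argmax(sims))], sims

prototypes = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
class_ids = np.array([0, 1, 2])
label, sims = predict_credit_class(np.array([0.2, 0.9]), prototypes, class_ids)
print(label)  # 1
```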
In order to further verify the method, the small sample credit evaluation method based on metric learning is validated by a simulation experiment. The model training and testing procedures are implemented in Python and compared with the CART decision tree algorithm, the deep belief network (DBN), and the HAM-UCE user credit evaluation method. The experiment uses the lending platform user information index dataset published on the Alibaba Tianchi platform, under a 5-way 5-shot task. All programs were run on a standard server equipped with an Intel Core i7-8700 CPU at 3.20 GHz, 32 GB of RAM, and an NVIDIA TITAN RTX. A ResNet12 neural network with the ReLU activation function is used, and the optimizer is set to Adam. In the model iterative training stage, the initial learning rate is 0.1 and is gradually reduced to one tenth of its original value during training. The classification accuracy Acc, the proportion of all samples that the classifier classifies correctly, is used as the evaluation index. It is calculated as follows:
$$\mathrm{Acc} = \frac{\text{number of correctly classified samples}}{\text{total number of samples}}$$
The classification accuracy of the different methods is compared in FIG. 3, where CART denotes the CART decision tree algorithm, DBN the deep belief network algorithm, HAM-UCE the HAM-UCE user credit evaluation method, and Ours the method of the present invention. As FIG. 3 shows, the small sample credit evaluation method based on metric learning reaches a classification accuracy of 80.34%, higher than the other methods, which indicates that it is better suited to the small-sample learning task and improves model performance.
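For reference, the classification accuracy used as the evaluation index is simply the fraction of samples whose predicted class equals the true class. A minimal sketch, with a hypothetical function name and made-up labels:

```python
import numpy as np

def classification_accuracy(y_true, y_pred):
    """Proportion of samples whose predicted credit class matches the true class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float((y_true == y_pred).mean())

acc = classification_accuracy([0, 1, 2, 2, 1], [0, 1, 2, 1, 1])
print(acc)  # 0.8
```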
The invention relates to a small sample credit evaluation system based on metric learning, which comprises:
the data analysis and preprocessing module is used for analyzing the credit index of the user, carrying out quantitative assignment and preprocessing;
the preheating network training module is used for performing a small sample classification task in each episode by using a task-based episode training strategy, calculating a class-related prototype by using the preheating network, and training the preheating network based on a preheating loss function until convergence;
the embedded network optimization module is used for freezing parameters of the preheating network, taking the parameters as initial values of the ternary network, calculating a task related prototype by utilizing the ternary network and an attention mechanism, and training the ternary network based on a ternary loss function;
and the credit evaluation module is used for calculating the similarity between the user index vector of the unknown sample and each task related prototype by using the trained embedded network, wherein the class of the task related prototype with the highest similarity is the credit class of the user.

Claims (10)

1. The small sample credit evaluation method based on metric learning is characterized by comprising the following steps of:
(1) Analyzing and preprocessing data;
(2) Performing a small sample classification task in each episode by using a task-based episode training strategy, calculating a class-related prototype by using a neural network as a preheating network, and training the preheating network based on a preheating loss function until convergence;
(3) Freezing parameters of the preheating network, taking the parameters as initial values of a ternary network, calculating a task related prototype by using the ternary network and an attention mechanism, training the ternary network based on a ternary loss function, and optimizing an embedded network;
(4) Calculating the similarity between the user index vector of the unknown sample and each task related prototype by using the trained embedded network, wherein the class of the task related prototype with the highest similarity is the credit class of the user.
2. The method for evaluating credit of small sample based on metric learning according to claim 1, wherein in step (1), the data analysis includes analyzing the user information into a plurality of credit indexes and performing quantization processing to form a user index vector, and splicing the user index vector with a user credit rating label to form a user behavior vector.
3. The method of claim 2, wherein in step (2), a small sample task $T_i = \{S_i, Q_i\}$ is performed for each episode, where $T_i$ is the $i$-th subtask of the preheating stage, $S_i$ is the support set of the $i$-th subtask, and $Q_i$ is the query set of the $i$-th subtask, and the class-related prototype of class $c$ in the small sample classification task is calculated as follows:
$$p_c = \frac{1}{|S_i^c|} \sum_{(x_t, y_t) \in S_i^c} f_\theta(x_t)$$
where $p_c$ denotes the class-related prototype of sample class $c$ in the feature space, $S_i^c$ denotes the subset of the support set $S_i$ whose class is $c$, $|S_i^c|$ denotes the size of $S_i^c$, $x_t$ is the user index vector of sample $t$ in $S_i^c$, $y_t$ is the user credit rating label of that sample, and $f_\theta(\cdot)$ is the preheating network.
4. The method for evaluating small sample credit based on metric learning according to claim 3, wherein in step (2), a preheating loss function $L_H$ is constructed to iteratively train the preheating network; in the preheating loss function, $x_q$ is the user index vector of sample $q$ in the query set $Q_i$ and $y_q$ is the user grade label of that sample, and each class contributes a class weight and a normalized classification function;
for class $c$, the class weight is obtained from $d_c$ through the softmax function $s(\cdot)$, where $d_c$ is the sum of the distances between the class-related prototype of class $c$ and the other samples of the same class.
5. The method of claim 4, wherein the calculating task related prototypes in step (3) using a ternary network and an attention mechanism comprises: calculating a category-related prototype using a ternary network; the category-related prototypes are converted into task-related prototypes using an attention mechanism.
6. The method of claim 5, wherein the class-related prototype $p'_c$ is calculated using the ternary network as follows:
$$p'_c = \frac{1}{|S_i^c|} \sum_{(x_t, y_t) \in S_i^c} g_\phi(x_t)$$
where $g_\phi(\cdot)$ is the ternary network.
7. The method for evaluating credit of a small sample based on metric learning according to claim 6, wherein the class-related prototypes are converted into task-related prototypes by the attention mechanism; in this conversion, the input is the set of class-related prototypes of all classes and the output is the set of task-related prototypes of all classes, with one task-related prototype for each class $c$; $A$ is the weight matrix of the attention mechanism; the value space, the query space and the key space of the attention mechanism each have a learnable mapping parameter; $m$ is the spatial dimension of the mapping parameters; $s(\cdot)$ is the softmax function; and the result passes through a linear layer normalization function.
8. The method for evaluating small sample credit based on metric learning according to claim 7, wherein in step (3), a ternary loss function is constructed to iteratively train the ternary network, the ternary loss function $L_R$ being:
wherein the loss includes a scaling coefficient; for each category $c$, the loss involves the task-related prototype of category $c$, a positive sample whose credit grade category is the same as that of the user, and a negative sample whose credit grade category differs from that of the user; $g_\phi(\cdot)$ is the ternary network; and the margin is the separation distance between the positive samples and the negative samples.
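The exact expression for $L_R$ is not reproduced here. A minimal sketch of a margin-based triplet (ternary) loss that is consistent with the listed terms is given below. It assumes Euclidean distance, a hinge on the margin, and averaging over triplets; the parameter names `margin` and `scale` and the toy data are illustrative assumptions.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(triplets, embed_fn, margin=1.0, scale=1.0):
    """Average hinge loss over (prototype, positive, negative) triplets.

    Assumed form: scale * mean(max(0, d(p, g(x+)) - d(p, g(x-)) + margin)).
    """
    total = 0.0
    for proto, pos, neg in triplets:
        d_pos = euclidean(proto, embed_fn(pos))  # distance to same-grade sample
        d_neg = euclidean(proto, embed_fn(neg))  # distance to other-grade sample
        total += max(0.0, d_pos - d_neg + margin)
    return scale * total / len(triplets)

def identity_embed(x):
    # Hypothetical stand-in for the ternary network g_phi.
    return list(x)

triplets = [
    ([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]),  # violated: positive is farther away
    ([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]),  # satisfied: contributes zero
]
loss = triplet_loss(triplets, identity_embed, margin=1.0, scale=0.5)
```

Only the first triplet violates the margin, so it alone contributes to the loss; the second is already separated by more than the margin.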
9. The method of claim 8, wherein in step (4), the similarity between the user index vector of the unknown sample and each task-related prototype is calculated as follows:
wherein the vector being compared is the user index vector of the unknown sample.
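The similarity formula is missing from this text. As one possible reading, the sketch below assumes the similarity is the negative squared Euclidean distance between the embedded sample and each task-related prototype, normalized with a softmax, and that the category with the highest similarity is returned. The distance choice, the grade labels, and the prototype values are assumptions for illustration only.

```python
import math

def classify(query, prototypes):
    """Return (best_label, probabilities) for an embedded query vector.

    Assumption: similarity = -squared Euclidean distance, softmax-normalized.
    """
    sims = {
        label: -sum((q - p) ** 2 for q, p in zip(query, proto))
        for label, proto in prototypes.items()
    }
    peak = max(sims.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - peak) for label, s in sims.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    best = max(probs, key=probs.get)
    return best, probs

# Hypothetical task-related prototypes for three credit grades.
prototypes = {"good": [1.0, 1.0], "fair": [3.0, 3.0], "poor": [6.0, 6.0]}
label, probs = classify([2.8, 3.1], prototypes)
```

The query lies closest to the "fair" prototype, so that grade receives the highest probability.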
10. A metric learning-based small sample credit assessment system, comprising:
a data analysis and preprocessing module, configured to analyze the credit indexes of the user, assign quantitative values to them, and preprocess them;
a preheating network training module, configured to use a task-based episodic training strategy, execute a small sample classification task in each episode, calculate category-related prototypes using the preheating network, and train the preheating network based on the preheating loss function until convergence;
an embedding network optimization module, configured to freeze the parameters of the preheating network, use them as the initial values of the ternary network, calculate task-related prototypes using the ternary network and the attention mechanism, and train the ternary network based on the ternary loss function; and
a credit evaluation module, configured to calculate, using the trained embedding network, the similarity between the user index vector of an unknown sample and each task-related prototype, the category of the prototype with the highest similarity being the credit category of the user.
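The task-based training strategy used by the training module can be illustrated with a short sketch of how one small sample classification task might be drawn. It assumes the usual N-way K-shot convention with disjoint support and query sets; the function name `sample_episode`, its parameters, and the toy data are hypothetical and not taken from the claims.

```python
import random

def sample_episode(data, n_way, k_shot, n_query, rng):
    """Draw one task: n_way categories, k_shot support and n_query query
    samples per category. data maps label -> list of index vectors."""
    labels = rng.sample(sorted(data), n_way)
    support, query = [], []
    for label in labels:
        picked = rng.sample(data[label], k_shot + n_query)  # without replacement
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label) for x in picked[k_shot:]]
    return support, query

# Toy data: 4 credit grades with 6 two-dimensional index vectors each.
data = {grade: [[float(grade), float(i)] for i in range(6)] for grade in range(4)}
rng = random.Random(0)
support, query = sample_episode(data, n_way=3, k_shot=2, n_query=3, rng=rng)
```

In each training iteration, prototypes would be computed from `support` and the loss evaluated on `query`, with a fresh task drawn every time.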
CN202311148690.0A 2023-09-07 2023-09-07 Small sample credit assessment method and system based on metric learning Pending CN116883157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311148690.0A CN116883157A (en) 2023-09-07 2023-09-07 Small sample credit assessment method and system based on metric learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311148690.0A CN116883157A (en) 2023-09-07 2023-09-07 Small sample credit assessment method and system based on metric learning

Publications (1)

Publication Number Publication Date
CN116883157A true CN116883157A (en) 2023-10-13

Family

ID=88272138

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311148690.0A Pending CN116883157A (en) 2023-09-07 2023-09-07 Small sample credit assessment method and system based on metric learning

Country Status (1)

Country Link
CN (1) CN116883157A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117557840A (en) * 2023-11-10 2024-02-13 中国矿业大学 Fundus lesion grading method based on small sample learning
CN117557840B (en) * 2023-11-10 2024-05-24 中国矿业大学 Fundus lesion grading method based on small sample learning
CN117975171A (en) * 2024-03-29 2024-05-03 南京大数据集团有限公司 Multi-label learning method and system for incomplete and unbalanced labels

Similar Documents

Publication Publication Date Title
CN109255506B (en) Internet financial user loan overdue prediction method based on big data
CN112070125A (en) Prediction method of unbalanced data set based on isolated forest learning
Babenko et al. Classical machine learning methods in economics research: Macro and micro level example
CN106469560B (en) Voice emotion recognition method based on unsupervised domain adaptation
CN112015863B (en) Multi-feature fusion Chinese text classification method based on graphic neural network
CN116883157A (en) Small sample credit assessment method and system based on metric learning
CN111444342B (en) Short text classification method based on multiple weak supervision integration
CN117151870B (en) Portrait behavior analysis method and system based on guest group
CN117236647B (en) Post recruitment analysis method and system based on artificial intelligence
CN111681022A (en) Network platform data resource value evaluation method
Fan et al. Improved ML‐based technique for credit card scoring in Internet financial risk control
CN109033087B (en) Method for calculating text semantic distance, deduplication method, clustering method and device
CN112070543A (en) Method for detecting comment quality in E-commerce website
CN112836750A (en) System resource allocation method, device and equipment
CN113554310A (en) Enterprise credit dynamic evaluation model based on intelligent contract
CN117217807B (en) Bad asset estimation method based on multi-mode high-dimensional characteristics
Rofik et al. The Optimization of Credit Scoring Model Using Stacking Ensemble Learning and Oversampling Techniques
CN112508684B (en) Collecting-accelerating risk rating method and system based on joint convolutional neural network
Wang et al. Joint loan risk prediction based on deep learning‐optimized stacking model
Wu et al. Customer churn prediction for commercial banks using customer-value-weighted machine learning models
CN116128339A (en) Client credit evaluation method and device, storage medium and electronic equipment
CN114266394A (en) Enterprise portrait and scientific service personalized demand prediction method oriented to scientific service platform
CN113821571A (en) Food safety relation extraction method based on BERT and improved PCNN
CN116304058B (en) Method and device for identifying negative information of enterprise, electronic equipment and storage medium
Shen et al. Investment time series prediction using a hybrid model based on RBMs and pattern clustering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination