CN112819024A - Model processing method, user data processing method and device and computer equipment - Google Patents


Info

Publication number
CN112819024A
Authority
CN
China
Prior art keywords
index information
training
target model
index
loss function
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010662448.5A
Other languages
Chinese (zh)
Other versions
CN112819024B (en)
Inventor
王波
冯远豪
黄文�
董井然
郑雪豪
屈璐
苏函晶
陈守志
王文瀚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202010662448.5A
Publication of CN112819024A
Application granted
Publication of CN112819024B
Legal status: Active


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 - Machine learning
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T - CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 - Road transport of goods or passengers
    • Y02T10/10 - Internal combustion engine [ICE] based vehicles
    • Y02T10/40 - Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application relates to a model processing method, a user data processing method and apparatus, and a computer device. The model processing method includes: predicting each user data sample through a target model to obtain a prediction result, where each user data sample has a corresponding first training label and a corresponding second training label, the first training label corresponding to a first index to be optimized and the second training label corresponding to a second index to be suppressed; constructing a loss function according to each prediction result and the corresponding first and second training labels; determining first index information and second index information based on the prediction results; and iteratively training the target model according to the loss function so as to iteratively maximize the difference between the first index information and the second index information during training, until an iteration stop condition is reached. The target model is used for determining a response result corresponding to user data. With this method, the performance of the machine learning model can be improved.

Description

Model processing method, user data processing method and device and computer equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a model processing method, a user data processing method and apparatus, and a computer device.
Background
With the development of artificial intelligence, machine learning models are being used ever more widely. For example, a machine learning model may predict whether a user will respond to pushed information, and that prediction determines whether the information is pushed to the user.
However, when a machine learning model is used, it is often difficult to balance more than one index. For example, a user that the model predicts will respond may also be a user with poor credit, which results in poor effective performance of the model.
Disclosure of Invention
In view of the above, it is necessary to provide a model processing method, a user data processing method, an apparatus, and a computer device capable of improving model performance.
A method of model processing, the method comprising:
acquiring a user data sample set and a target model; each user data sample in the user data sample set has a corresponding first training label and a corresponding second training label; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed;
predicting each user data sample through the target model to obtain a prediction result;
constructing a loss function according to each prediction result and the corresponding first training label and second training label;
determining first index information and second index information based on the prediction results; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the first index; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index;
iteratively training the target model according to the loss function so as to iteratively maximize the difference between the first index information and the second index information during training, until an iteration stop condition is reached; the target model is used for determining a response result corresponding to user data.
A model processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring a user data sample set and a target model; each user data sample in the user data sample set has a corresponding first training label and a corresponding second training label; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed;
the prediction module is used for predicting each user data sample through the target model to obtain a prediction result;
the construction module is used for constructing a loss function according to each prediction result and the corresponding first training label and second training label;
the determination module is used for determining first index information and second index information based on the prediction results; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the first index; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index;
the training module is used for iteratively training the target model according to the loss function so as to iteratively maximize the difference between the first index information and the second index information during training, until an iteration stop condition is reached; the target model is used for determining a response result corresponding to user data.
In one embodiment, the construction module is further configured to: construct a first loss function based on the difference between each prediction result and the corresponding first training label, and construct a second loss function based on the difference between each prediction result and the corresponding second training label; and construct the loss function according to the first loss function, the second loss function, and a penalty coefficient corresponding to the second loss function.
The training module is further configured to: iteratively train the target model according to the loss function, updating the model parameters of the target model; iteratively update the penalty coefficient according to the first index information and the second index information so as to iteratively maximize the difference between the first index information and the second index information during the updating; and iteratively train the target model according to the updated loss function, updating the model parameters of the target model a second time.
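As a concrete illustration of the loss construction above, a minimal sketch follows. It is an assumed reading of this embodiment, not the patented formulation: binary cross-entropy is assumed for both losses, and the penalized second loss is subtracted so that minimizing the total loss fits the index to be optimized while discouraging fit to the index to be suppressed; the function names and the sign convention are illustrative assumptions.

```python
import numpy as np

def bce(pred, label, eps=1e-7):
    # Binary cross-entropy between predicted probabilities and 0/1 labels.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(label * np.log(pred) + (1 - label) * np.log(1 - pred)))

def combined_loss(pred, y_optimize, y_suppress, penalty):
    # First loss fits the index to be optimized; the penalized second loss is
    # subtracted so that minimizing the total also suppresses the second index.
    return bce(pred, y_optimize) - penalty * bce(pred, y_suppress)
```

In the alternating scheme described above, the penalty coefficient would be held fixed while the model parameters are updated against this loss, and then itself updated from the two index values.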
In one embodiment, the training module is further configured to: fix the penalty coefficient and update the model parameters of the target model in the direction that minimizes the loss function; and continue cyclically updating the model parameters until a training stop condition is met, at which point training ends.
In one embodiment, the training module is further configured to: take the first index information and the second index information as a baseline, and represent the difference between the first index information and the second index information by the rise in the first index information and the fall in the second index information; and iteratively update the penalty coefficient using the first index information and the second index information so as to iteratively maximize the rise and the fall during the updating, until a training stop condition is met, at which point training ends.
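One simple way to realize the penalty-coefficient update described in this embodiment is a search over candidate coefficients, scoring each by the rise of the first index information plus the fall of the second index information relative to a baseline. The candidate-search strategy and all names here are assumptions for illustration; the text does not specify a concrete update rule.

```python
def best_penalty(candidates, evaluate, base_first, base_second):
    # evaluate(penalty) -> (first_index_info, second_index_info) for a model
    # trained with that penalty coefficient. Pick the candidate that maximizes
    # the rise of the first index info plus the fall of the second.
    def gap(penalty):
        first, second = evaluate(penalty)
        return (first - base_first) + (base_second - second)
    return max(candidates, key=gap)
```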
In one embodiment, the training module is further configured to: take the first index information and the second index information as a baseline, and represent the difference between the first index information and the second index information by the ratio between the fall in the first index information and the fall in the second index information; and iteratively update the penalty coefficient using the first index information and the second index information so as to iteratively minimize this ratio during the updating, until a training stop condition is met, at which point training ends.
In one embodiment, the training module is further configured to: fix the updated penalty coefficient and update the model parameters of the target model a second time in the direction that minimizes the updated loss function; and continue cyclically performing this second round of parameter updates until a training stop condition is met, at which point training ends.
In one embodiment, the construction module is further configured to: acquire a regularization penalty term; and construct the loss function according to the first loss function, the second loss function, the penalty coefficient corresponding to the second loss function, and the regularization penalty term.
In one embodiment, the prediction module is further configured to: take each user data sample in turn as the currently processed sample; extract more than one piece of feature data from the currently processed sample; determine a coded value corresponding to each piece of feature data; generate a characteristic value from the coded values according to their respective weights; and output, through the target model, a prediction result corresponding to the characteristic value.
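The encode-then-weight step can be sketched as follows. The dictionary-based coding maps and the weighted sum are illustrative assumptions, since the text does not fix a coding scheme (it could equally be one-hot codes or WOE values), and all names are hypothetical.

```python
def encode_and_combine(sample, code_maps, weights):
    # sample: feature name -> raw value.
    # code_maps: feature name -> {raw value -> coded value}.
    # weights: feature name -> weight.
    # Returns the combined characteristic value fed to the target model.
    return sum(weights[f] * code_maps[f][sample[f]] for f in sample)
```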
In one embodiment, the determination module is further configured to: take each prediction result in turn as the current processing object; acquire the number of positive samples and the number of negative samples of the current processing object under the first index; determine a first proportion of that number of positive samples in the total number of positive samples and a second proportion of that number of negative samples in the total number of negative samples; determine the difference between the first proportion and the second proportion to obtain the difference value corresponding to the current processing object; and take the maximum of the difference values corresponding to the prediction results as the first index information.
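The computation in this embodiment (per-threshold positive-sample proportion minus negative-sample proportion, keeping the maximum difference) matches the standard Kolmogorov-Smirnov (KS) statistic. A minimal sketch, assuming each prediction result serves as a threshold in descending score order:

```python
def index_info(predictions, labels):
    # labels: 1 marks a positive sample under the index, 0 a negative one.
    order = sorted(range(len(predictions)), key=lambda i: -predictions[i])
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    cum_pos = cum_neg = 0
    best = 0.0
    for i in order:
        if labels[i] == 1:
            cum_pos += 1
        else:
            cum_neg += 1
        # Difference between the captured share of positives and of negatives.
        best = max(best, cum_pos / total_pos - cum_neg / total_neg)
    return best
```

Applying the same routine to the second index's labels would yield the second index information.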
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring a user data sample set and a target model; each user data sample in the user data sample set has a corresponding first training label and a corresponding second training label; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed;
predicting each user data sample through the target model to obtain a prediction result;
constructing a loss function according to each prediction result and the corresponding first training label and second training label;
determining first index information and second index information based on the prediction results; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the first index; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index;
iteratively training the target model according to the loss function so as to iteratively maximize the difference between the first index information and the second index information during training, until an iteration stop condition is reached; the target model is used for determining a response result corresponding to user data.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring a user data sample set and a target model; each user data sample in the user data sample set has a corresponding first training label and a corresponding second training label; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed;
predicting each user data sample through the target model to obtain a prediction result;
constructing a loss function according to each prediction result and the corresponding first training label and second training label;
determining first index information and second index information based on the prediction results; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the first index; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index;
iteratively training the target model according to the loss function so as to iteratively maximize the difference between the first index information and the second index information during training, until an iteration stop condition is reached; the target model is used for determining a response result corresponding to user data.
According to the model processing method, the model processing apparatus, the computer device and the storage medium, each user data sample is predicted through the target model to obtain a prediction result; a loss function is constructed according to each prediction result and the corresponding first and second training labels; first index information and second index information are determined based on the prediction results; and the target model is iteratively trained according to the loss function so that the difference between the first index information and the second index information is iteratively maximized during training. The first training label corresponds to the first index to be optimized, and the second training label corresponds to the second index to be suppressed. Therefore, within a single modeling framework and based on the idea of adversarial training, the index to be optimized is fitted while the index to be suppressed is suppressed, thereby improving the performance of the model. Moreover, when the response result corresponding to user data is determined through the target model, the influence of other indexes among the top-ranked responding users can be reduced.
A method of user data processing, the method comprising:
acquiring user data to be processed;
extracting more than one piece of feature data from the user data to be processed;
obtaining, through the target model, a response result corresponding to the user data to be processed according to the more than one piece of feature data;
the target model is obtained through iterative training according to a loss function, with the difference between first index information and second index information iteratively maximized during training; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a first index to be optimized; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a second index to be suppressed.
A user data processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring user data to be processed;
the extraction module is used for extracting more than one piece of feature data from the user data to be processed;
the processing module is used for obtaining, through the target model, a response result corresponding to the user data to be processed according to the more than one piece of feature data;
the target model is obtained through iterative training according to a loss function, with the difference between first index information and second index information iteratively maximized during training; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a first index to be optimized; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a second index to be suppressed.
In one embodiment, the response result is a score. The user data processing apparatus further comprises a pushing module, which is used for: acquiring more than one piece of user data to be processed; obtaining, through the target model, the score corresponding to each piece of user data to be processed; sorting the user identifications corresponding to the user data to be processed by score from high to low; and selecting a preset number of user identifications from the sorted result for information pushing.
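The pushing module's rank-and-select step reduces to sorting scored user identifications and keeping the top of the list. A minimal sketch, with all names illustrative:

```python
def select_push_targets(scores, preset_number):
    # scores: user identification -> model score. Sort high to low and keep
    # the preset number of top-ranked users for information pushing.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:preset_number]
```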
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
acquiring user data to be processed;
extracting more than one piece of feature data from the user data to be processed;
obtaining, through the target model, a response result corresponding to the user data to be processed according to the more than one piece of feature data;
the target model is obtained through iterative training according to a loss function, with the difference between first index information and second index information iteratively maximized during training; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a first index to be optimized; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a second index to be suppressed.
A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:
acquiring user data to be processed;
extracting more than one piece of feature data from the user data to be processed;
obtaining, through the target model, a response result corresponding to the user data to be processed according to the more than one piece of feature data;
the target model is obtained through iterative training according to a loss function, with the difference between first index information and second index information iteratively maximized during training; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a first index to be optimized; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of a second index to be suppressed.
According to the user data processing method, the user data processing apparatus, the computer device and the storage medium, user data to be processed is acquired; more than one piece of feature data is extracted from the user data to be processed; and a response result corresponding to the user data to be processed is obtained through the target model according to the more than one piece of feature data. The target model is obtained through iterative training according to a loss function, with the difference between the first index information and the second index information iteratively maximized during training, where the first index is an index to be optimized and the second index is an index to be suppressed. Therefore, within a single modeling framework and based on the idea of adversarial training, the index to be optimized is fitted while the index to be suppressed is suppressed, thereby improving the performance of the model. Moreover, when the response result corresponding to user data is determined through the target model, the influence of other indexes among the top-ranked responding users can be reduced.
Drawings
FIG. 1 is a diagram of an application environment of a model processing method in one embodiment;
FIG. 2 is a schematic flow chart diagram illustrating a method for model processing in one embodiment;
FIG. 3 is a graph illustrating the relationship between the response rate and the black and gray rate in one embodiment;
FIG. 4 is a schematic flow chart of feature data transformation in one embodiment;
FIG. 5 is a diagram illustrating updating penalty coefficients in one embodiment;
FIG. 6 is a graph of the relationship between the response rate and the black and gray rate in one embodiment;
FIG. 7 is a schematic flow chart diagram of a model processing method in another embodiment;
FIG. 8 is a flow diagram of training a target model in one embodiment;
FIG. 9 is a flowchart illustrating a method of processing user data according to one embodiment;
FIG. 10 is a block diagram showing the structure of a model processing apparatus according to an embodiment;
FIG. 11 is a block diagram showing an example of a configuration of a user data processing apparatus;
FIG. 12 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a broad range of fields, involving both hardware-level and software-level technology. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It specializes in studying how a computer can simulate or realize human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied across all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from instruction.
With the research and progress of artificial intelligence technology, the artificial intelligence technology is developed and applied in a plurality of fields, such as common smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical care, smart customer service, and the like.
Cloud technology refers to a hosting technology for unifying serial resources such as hardware, software, network and the like in a wide area network or a local area network to realize calculation, storage, processing and sharing of data.
Cloud technology (Cloud technology) is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like applied in the cloud computing business model. It can form a resource pool that is used on demand, flexibly and conveniently, and cloud computing technology will become an important support for it. Background services of technical network systems, such as video websites, picture websites and other web portals, require large amounts of computing and storage resources. With the rapid development and application of the internet industry, each item may come to have its own identification mark that needs to be transmitted to a background system for logical processing; data at different levels are processed separately, and all kinds of industry data require strong system background support, which can only be realized through cloud computing.
The user data processing method provided by the embodiment of the application will be described below based on machine learning and cloud technology of an artificial intelligence technology.
The model processing method provided by the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network. The terminal 102 acquires a user data sample set and a target model, and uploads the acquired user data sample set and target model to the server 104, wherein the target model is used for determining a response result corresponding to user data, each user data sample in the user data sample set respectively and correspondingly has a first training label and a second training label, the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed; the server 104 predicts each user data sample through the target model to obtain a prediction result; the server 104 constructs a loss function according to each prediction result and the corresponding first training label and second training label; the server 104 determines first index information and second index information based on the prediction result, wherein the first index information is used for measuring the distinguishing capability of the target model on positive and negative samples of the first index, and the second index information is used for measuring the distinguishing capability of the target model on positive and negative samples of the second index; the server 104 iteratively trains the target model according to the loss function to iteratively maximize a difference between the first index information and the second index information during the training process until an iteration stop condition is reached.
The terminal 102 may be, but not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, and the server 104 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud storage, network services, cloud communication, big data, and an artificial intelligence platform. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.
The user data processing method provided by the application can also be applied to the application environment shown in fig. 1. The terminal 102 acquires user data to be processed and uploads the user data to be processed to the server 104; the server 104 extracts more than one feature data from the user data to be processed; the server 104 obtains a response result corresponding to the user data to be processed according to the more than one feature data through the target model; the target model is obtained through iterative training according to a loss function, and the difference between first index information and second index information is maximized in an iterative mode in the training process, wherein the first index information is used for measuring the distinguishing capacity of the target model for the positive and negative samples of the first index to be optimized, and the second index information is used for measuring the distinguishing capacity of the target model for the positive and negative samples of the second index to be suppressed.
In one embodiment, as shown in fig. 2, a model processing method is provided, which is described by taking the method as an example applied to the server 104 in fig. 1, and includes the following steps:
step 202, obtaining a user data sample set and a target model; each user data sample in the user data sample set has a corresponding first training label and second training label; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed; the target model is used for determining a response result corresponding to the user data.
Wherein the user data sample set is a data set used for training the target model. The user data is data reflecting the characteristics of the user.
In a particular embodiment, the user data may include user base data, user behavior data, and the like. User base data is data that reflects basic user attributes, such as identity data, resource data, credit data, and the like. Identity data such as gender, age, education background, occupation, position, income, etc. Resource data such as assets, deposits, etc. Credit data such as loans, and the like. User behavior data is data reflecting the behavior characteristics of the user, such as payment behavior data, social behavior data, browsing behavior data, and the like. Payment behavior data such as repayment behavior data, loan behavior data, investment behavior data, consumption behavior data, transfer behavior data, and the like. Social behavior data such as social session data, social message posting data, social message comment data, and the like. Browsing behavior data such as news browsing data, audio-video browsing data, commodity browsing data, and the like.
The target model predicts, based on the user data, whether the user will respond to the target information, and obtains a response result. The target information may be information to be pushed. For example, the target model predicts whether the user will click to view the information to be pushed, whether the user will claim the card to be pushed, whether the user will purchase the goods to be pushed, whether the user will purchase the financial product to be pushed, whether the user will use the business service to be pushed, and the like. Target users with a high response rate to the information to be pushed are screened out by the target model and then pushed to in a targeted manner, thereby improving the information conversion rate.
In a particular embodiment, the target model may be a machine learning model established by linear regression, logistic regression, decision trees, and the like.
In a particular embodiment, the target model may be a scorecard model. The scorecard model is a generalized linear model for binary target variables, commonly used in fields such as credit risk assessment and financial risk control.
When the target model is trained, feature data for training the model needs to be screened from the user data samples, and this feature data enables the model to learn to predict whether the user responds to the target information. However, some feature data is important not only for the response index but also for other indexes. As a result, the target users screened out by the model not only have a high response rate to the target information but may also stand out on other indexes. This affects model performance, especially when those other indexes are ones that need to be suppressed.
Based on this, the influence of other indexes on the model prediction result is generally reduced by deleting the feature data important to the other indexes while retaining the feature data important to the response index. However, the feature data important to the other indexes overlaps with the feature data important to the response index, so deleting it also degrades model performance.
In the present application, the index to be optimized is fitted and the index to be suppressed is suppressed under the same modeling framework, thereby improving model performance.
Wherein, the first index and the second index are correlated. Optionally, the first index and the second index are positively correlated, that is, when the first index increases, the second index also increases; however, the first index is the index to be optimized, while the second index is the index to be suppressed.
In a specific embodiment, the first indicator is a response rate and the second indicator is a black and gray rate. The response rate is the proportion of users in the user set that respond to the target information. The black and gray rate is the proportion of users in the user set with poor credit. The user with poor credit may be a user with a credit value lower than the credit threshold value, or a user with bad records such as overdue record of repayment and bad account record of loan.
Specifically, each user data sample respectively corresponds to a first training label and a second training label, and the first training label corresponds to a first index to be optimized, such as a label of response or non-response; the second training label corresponds to a second index to be suppressed, such as a label as black gray or non-black gray.
Wherein, the response result may be a score; the higher the score, the more likely the user is to respond to the target information.
Specifically, the score of each user is determined through the target model, and the users are sorted from high to low by score so as to select the top-ranked users as target users. Referring to fig. 3, fig. 3 is a graph showing the relationship between the response rate and the black and gray rate in one embodiment. As can be seen from fig. 3, among the screened target users (for example, the top one million users), not only is the response rate high, but the black and gray rate is also high.
In the application, in order to reduce the black and grey rate of a head response user, the indexes to be optimized and the indexes to be suppressed are modeled uniformly, the indexes to be optimized are trained as main targets, the indexes to be suppressed are added into a loss function as penalty items, and a target model is trained iteratively according to the loss function.
And step 204, predicting each user data sample through the target model to obtain a prediction result.
Specifically, the server inputs each user data sample into the target model, and predicts each user data sample through the target model to obtain a prediction result.
In one embodiment, step 204 includes: sequentially taking each user data sample as a current processing sample; extracting more than one piece of feature data from the current processing sample; determining a coding value corresponding to each piece of feature data; generating a characteristic value from the coding values according to their respective weights; and outputting, through the target model, a prediction result corresponding to the characteristic value.
Wherein the encoded value may be a WOE (Weight of Evidence) value. By applying WOE conversion to the feature data, the prediction result of the logistic regression model can be converted into a standard scorecard format, that is, the logistic regression score is converted into a concrete score, so that the prediction result more intuitively reflects how likely the user is to respond to the target information.
Specifically, referring to fig. 4, fig. 4 is a schematic flow chart of feature data conversion in one embodiment. More than one feature data is extracted from a user data sample, WOE conversion is carried out on the more than one feature data respectively to obtain WOE values corresponding to the more than one feature data respectively, the WOE values generate feature values according to weights corresponding to the WOE values respectively, and a target model outputs a prediction result according to the feature values.
In a specific embodiment, taking a logistic regression model as an example, the target model can be represented by the following formula:
p = 1 / (1 + e^(-(ωx + a)))
wherein p is a prediction result, x is a characteristic value, and both omega and a are model parameters.
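As a minimal illustrative sketch (not part of the patent text), the logistic form above can be evaluated directly; here x is assumed to be the single aggregated characteristic value described earlier:

```python
import math

def predict(x: float, omega: float, a: float) -> float:
    """Logistic-regression prediction p = 1 / (1 + e^(-(omega * x + a)))."""
    return 1.0 / (1.0 + math.exp(-(omega * x + a)))

# With a positive weight omega, a larger characteristic value x
# yields a larger predicted response probability p.
p_low = predict(x=0.2, omega=1.5, a=-0.5)
p_high = predict(x=1.0, omega=1.5, a=-0.5)
```

Note that when ωx + a = 0 the prediction is exactly 0.5, the decision boundary of the model.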
Specifically, the feature data for training the model needs to be determined in advance so as to select strongly correlated feature data that distinguishes positive and negative samples of the index to be optimized. Feature selection may be performed by calculating the Gini coefficient, the IV (Information Value) value, and the like, or by using models such as LASSO (Least Absolute Shrinkage and Selection Operator), LR (Logistic Regression), and RF (Random Forest).
Take feature selection by IV value as an example. First, a feature is selected and the user data samples are grouped by its values. Taking age as the feature, the user data samples are grouped by age range, such as 0-10 years old, 10-18 years old, 18-35 years old, and so on. The WOE value of each group is calculated; the WOE value reflects the difference between the ratio of negative to positive samples within the group and the ratio of negative to positive samples overall, so the WOE value can be considered to reflect the influence of the feature's value on the training target. The IV value is then calculated from the WOE values; it is in effect a weighted sum of the WOE values, which eliminates errors caused by differences in group sizes. The feature data are sorted by their IV values, and features are selected from high to low.
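The grouping-based WOE and IV computation described above can be sketched as follows; the sign convention used for WOE (log of positive share over negative share) is an assumption, since conventions differ between references:

```python
import math

def woe_iv(groups):
    """Compute per-group WOE values and the feature's IV value.

    groups: list of (n_positive, n_negative) counts, one pair per bin.
    Convention assumed here: WOE_i = ln(pos_share_i / neg_share_i),
    IV = sum_i (pos_share_i - neg_share_i) * WOE_i.
    """
    total_pos = sum(p for p, _ in groups)
    total_neg = sum(n for _, n in groups)
    woes, iv = [], 0.0
    for p, n in groups:
        pos_share = p / total_pos
        neg_share = n / total_neg
        w = math.log(pos_share / neg_share)
        woes.append(w)
        # Each IV term is non-negative, so IV sums the groups' evidence.
        iv += (pos_share - neg_share) * w
    return woes, iv

# Hypothetical age bins (0-10, 10-18, 18-35) with (positive, negative) counts.
woes, iv = woe_iv([(10, 40), (30, 30), (60, 30)])
```

A larger IV value indicates that grouping by this feature separates positive from negative samples more strongly, which is why features are ranked by IV from high to low.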
In the embodiment, when the target model is trained, strong correlation characteristic data for distinguishing positive and negative samples of indexes to be optimized is selected, so that the model training effect is improved; and the WOE conversion is carried out on the characteristic data, so that the prediction result of the regression model is converted into a standard scoring card format, and the prediction result is more visual.
And step 206, constructing a loss function according to each prediction result and the corresponding first training label and second training label.
Specifically, the server constructs a loss function according to each prediction result and the corresponding first training label and second training label.
In one embodiment, step 206 includes: constructing a first loss function based on the difference between each prediction result and the corresponding first training label, and constructing a second loss function based on the difference between each prediction result and the corresponding second training label; and constructing a loss function according to the first loss function, the second loss function and the penalty coefficient corresponding to the second loss function.
Wherein the penalty factor acts on the second loss function.
Specifically, the loss function can be represented by the following formula:
δ(ω, a) = ρ(ω, a) + r · κ(ω, a)
wherein δ(ω, a) is the loss function, ρ(ω, a) is the first loss function, κ(ω, a) is the second loss function, r is the penalty coefficient, ω and a are both model parameters, and t is the number of user data samples in the set over which the losses are computed.
Specifically, the loss function may employ a standard cross-entropy loss function, a square loss function, a focal loss function, or the like.
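A hedged sketch of the combined loss, assuming both ρ and κ are mean binary cross-entropy losses over the t samples (the text also permits square loss or focal loss); all function names are illustrative:

```python
import math

def cross_entropy(preds, labels):
    """Mean binary cross-entropy over the t samples."""
    t = len(preds)
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for p, y in zip(preds, labels)) / t

def combined_loss(preds, first_labels, second_labels, r):
    """delta = rho + r * kappa: fit the first (to-be-optimized) index,
    with the loss against the second (to-be-suppressed) index added as
    a penalty term weighted by the penalty coefficient r."""
    rho = cross_entropy(preds, first_labels)
    kappa = cross_entropy(preds, second_labels)
    return rho + r * kappa

preds = [0.9, 0.2, 0.7]
loss = combined_loss(preds, first_labels=[1, 0, 1],
                     second_labels=[1, 0, 0], r=0.5)
```

Setting r = 0 recovers the plain first loss, which makes the role of the penalty coefficient explicit.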
In the embodiment, the index to be suppressed is used as a penalty term to be added into the loss function, so that the index to be optimized and the index to be suppressed are trained simultaneously under the same modeling frame.
In step 208, first index information and second index information are determined based on the prediction results.
Specifically, the server determines first index information and second index information based on the prediction result.
The first index information is used for measuring the distinguishing capacity of the target model for positive and negative samples of the first index; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index.
In a specific embodiment, the first index information and the second index information may be KS (Kolmogorov-Smirnov) values. The KS value is an index for evaluating a model's ability to discriminate between positive and negative samples; the larger the KS value, the stronger the model's ability to distinguish positive samples from negative samples.
In one embodiment, the first index information is obtained by: sequentially taking the prediction results as current processing objects; acquiring the number of positive samples and the number of negative samples of a current processing object under a first index; determining a first proportion of the number of positive samples in the total number of positive samples and a second proportion of the number of negative samples in the total number of negative samples; determining a difference value between the first proportion and the second proportion to obtain a corresponding difference value of the current processing object; and taking the maximum difference value in the difference values corresponding to the prediction results as first index information.
It will be appreciated that when comparing the differences to determine the maximum value, the absolute values of the differences are compared.
Specifically, the first index information may be calculated by the following formula:
KS = max_k | Good_k / Good_total - Bad_k / Bad_total |
wherein Good refers to positive samples and Bad refers to negative samples; Good_k is the number of positive samples when the prediction result is k, Bad_k is the number of negative samples when the prediction result is k, Good_total is the total number of positive samples, and Bad_total is the total number of negative samples.
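A common way to compute a KS value ranks the samples by model score and tracks the gap between the cumulative positive and negative shares; a sketch under that assumption:

```python
def ks_value(scores, labels):
    """KS statistic: maximum gap between cumulative positive and
    negative sample shares when samples are ranked by model score."""
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    cum_pos = cum_neg = 0
    ks = 0.0
    for _, y in sorted(zip(scores, labels), reverse=True):
        if y == 1:
            cum_pos += 1
        else:
            cum_neg += 1
        ks = max(ks, abs(cum_pos / total_pos - cum_neg / total_neg))
    return ks

# A model that ranks all positives above all negatives separates perfectly.
perfect = ks_value([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
```

The same routine yields the first index information when given the first-index labels and the second index information when given the second-index labels.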
In one embodiment, the second index information is obtained by: sequentially taking the prediction results as current processing objects; acquiring the number of positive samples and the number of negative samples of the current processing object under a second index; determining a third proportion of the number of positive samples in the total number of positive samples and a fourth proportion of the number of negative samples in the total number of negative samples; determining a difference value between the third proportion and the fourth proportion to obtain a corresponding difference value of the current processing object; and taking the maximum difference value in the difference values corresponding to the prediction results as second index information.
Based on this, the first index information can evaluate how well the model separates responding users from non-responding users, and the second index information can evaluate how well the model separates black-and-gray users from non-black-and-gray users.
In this embodiment, the first index information and the second index information are constructed by using the KS value, and the penalty coefficient is updated by using the first index information and the second index information.
Step 210, iteratively training the target model according to the loss function to iteratively maximize a difference between the first index information and the second index information in the training process until an iteration stop condition is reached.
Specifically, the server iteratively trains the target model according to the loss function, and iteratively maximizes a difference between the first index information and the second index information in the training process until an iteration stop condition is reached.
In one embodiment, step 210 includes: iteratively training the target model according to the loss function, and updating model parameters of the target model; iteratively updating the penalty coefficient according to the first index information and the second index information to iteratively maximize a difference between the first index information and the second index information in an updating process; and iteratively training the target model according to the updated loss function, and performing secondary model parameter updating on the target model.
Specifically, the server updates model parameters of the target model by using a loss function; then, the server updates the penalty coefficient by using the first index information and the second index information, and iteratively maximizes the difference between the first index information and the second index information in the updating process; and secondly, the server updates the model parameters of the target model secondarily by using the updated loss function.
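The first stage of this alternation (penalty coefficient fixed, model parameters updated by gradient descent on the combined loss) can be sketched on a toy one-feature logistic model; the data and hyperparameters here are illustrative assumptions, not the patented procedure itself:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_with_penalty(xs, y1, y2, r, lr=0.1, steps=200):
    """With the penalty coefficient r fixed, minimize
    delta = rho + r * kappa by gradient descent on p = sigmoid(omega*x + a).

    xs: feature values; y1: first-index labels (to optimize);
    y2: second-index labels (penalized, weighted by r).
    """
    omega, a = 0.0, 0.0
    t = len(xs)
    for _ in range(steps):
        g_w = g_a = 0.0
        for x, l1, l2 in zip(xs, y1, y2):
            p = sigmoid(omega * x + a)
            # d(cross-entropy)/dz = p - label, combined for both labels.
            err = (p - l1) + r * (p - l2)
            g_w += err * x / t
            g_a += err / t
        omega -= lr * g_w
        a -= lr * g_a
    return omega, a

xs = [0.0, 0.2, 0.8, 1.0]
y1 = [0, 0, 1, 1]  # response labels rise with x
omega, a = train_with_penalty(xs, y1, y2=[0, 0, 0, 1], r=0.5)
```

Stage (2) would then adjust r using the first and second index information, and stage (3) would rerun this routine with the updated r.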
In a specific embodiment, referring to fig. 5, fig. 5 is a schematic diagram of updating the penalty coefficient in one embodiment. The difference between the first index information and the second index information may be maximized by increasing the first index information while decreasing the second index information; by keeping the first index information unchanged while decreasing the second index information; or by decreasing both, with the first index information decreasing by less than the second index information.
In the application, the black and gray rate of the head response user needs to be reduced, so that the first index information is increased, the second index information is reduced, or the first index information is kept unchanged, the second index information is reduced, or the first index information and the second index information are simultaneously reduced, but the descending amplitude of the first index information is lower than that of the second index information, so that the recognition capability of the target model for the response user is improved, the recognition capability of the target model for the black and gray user is weakened, and the purpose of reducing the black and gray rate of the head response user is achieved.
Referring to fig. 6, fig. 6 is a graph of the relationship between the response rate and the black and gray rate in one embodiment. As can be seen from fig. 6, among the screened target users (such as the top one million users), the response rate is high but the black and gray rate is low.
In this embodiment, under the same modeling framework, the index to be optimized is fitted and the index to be suppressed is suppressed based on the idea of adversarial training, thereby improving model performance.
It can be understood that the method provided by the present embodiment is not limited to the case where the first index is the response rate and the second index is the black and gray rate. Any model with the index to be optimized and the index to be suppressed existing at the same time can be trained by the method provided by the embodiment.
According to the model processing method, each user data sample is predicted through the target model to obtain a prediction result; a loss function is constructed according to each prediction result and the corresponding first and second training labels; first index information and second index information are determined based on the prediction results; and the target model is iteratively trained according to the loss function so that the difference between the first index information and the second index information is iteratively maximized during training. The first training label corresponds to the first index to be optimized and the second training label corresponds to the second index to be suppressed, so that, under the same modeling framework and based on the idea of adversarial training, the index to be optimized is fitted while the index to be suppressed is suppressed, thereby improving model performance. Moreover, when the response result corresponding to user data is determined through the target model, the influence of other indexes among the head responding users can be reduced.
In one embodiment, iteratively training the target model according to a loss function, performing model parameter updates on the target model, comprises: fixing a penalty coefficient, and updating model parameters of the target model according to the direction of the minimized loss function; and continuously and circularly updating the model parameters until the training stopping condition is met, and finishing the training.
Specifically, model parameters and penalty coefficients of the target model are initialized. And then, acquiring a group of user data samples, and respectively inputting each user data sample into the target model to obtain a loss function. And then, fixing the penalty coefficient, and reversely propagating and updating the model parameters of the target model according to a gradient descent algorithm.
In this embodiment, after the target model is trained, the model parameters can subsequently be fixed and the penalty coefficient updated, which improves the accuracy of the penalty coefficient.
In one embodiment, iteratively updating the penalty factor based on the first index information and the second index information to iteratively maximize a difference between the first index information and the second index information in an update process comprises: the first index information and the second index information are used as a reference, and the difference between the first index information and the second index information is represented through the increase amplitude of the first index information and the decrease amplitude of the second index information; and iteratively updating the penalty coefficient by adopting the first index information and the second index information so as to iteratively maximize the increase amplitude and the decrease amplitude in the updating process until the training stopping condition is met, and ending the training.
In a specific embodiment, when the first index information is increased and the second index information is decreased, the sum of the increase amplitude of the first index information and the decrease amplitude of the second index information, or the product of the increase amplitude and the decrease amplitude, may be obtained, and whether the difference between the first index information and the second index information is increased may be determined by whether the sum of the increase amplitude and the decrease amplitude, or the product of the increase amplitude and the decrease amplitude is increased.
In a specific embodiment, when the first index information is kept unchanged and the second index information is decreased, whether the difference between the first index information and the second index information is increased can be determined by whether the decrease range is increased.
In this embodiment, the increase range of the first index information and the decrease range of the second index information are maximized, so that the effects of optimizing the first index and suppressing the second index are achieved.
In one embodiment, iteratively updating the penalty factor based on the first index information and the second index information to iteratively maximize a difference between the first index information and the second index information in an update process comprises: the first index information and the second index information are used as a reference, and the difference between the first index information and the second index information is represented by the ratio of the descending amplitude of the first index information to the descending amplitude of the second index information; and iteratively updating the penalty coefficient by adopting the first index information and the second index information so as to iteratively minimize the ratio in the updating process until the training stop condition is met, and ending the training.
In a specific embodiment, when the first index information is decreased and the second index information is decreased, a ratio between a decrease range of the first index information and a decrease range of the second index information may be obtained, and whether a difference between the first index information and the second index information is increased is determined by whether the ratio is decreased.
In a specific embodiment, when the first index information is decreased and the second index information is decreased at the same time, a ratio between a decrease range of the second index information and a decrease range of the first index information may be obtained, and whether a difference between the first index information and the second index information is increased is determined by whether the ratio is increased.
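The acceptance criteria described in these embodiments (the gap widens if the first index information rises or holds while the second falls, or if both fall but the first falls by less than the second) can be collected into one hypothetical helper:

```python
def gap_improved(prev_first, prev_second, new_first, new_second):
    """Check whether the gap between the first and second index
    information widened after a penalty-coefficient update
    (simplified sketch of the cases described in the text)."""
    d_first = new_first - prev_first      # > 0: increase amplitude
    d_second = prev_second - new_second   # > 0: decrease amplitude
    if d_first >= 0 and d_second > 0:
        # First rises (or holds) while second falls: gap widens.
        return True
    if d_first < 0 and d_second > 0:
        # Both fall: gap widens only if first falls by less than second.
        return -d_first < d_second
    return False

# First index information rises from 0.40 to 0.45 while the second
# falls from 0.30 to 0.25: the difference has widened.
widened = gap_improved(0.40, 0.30, 0.45, 0.25)
```

An update to the penalty coefficient would be kept only when such a check succeeds, driving the iterative maximization of the difference.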
In this embodiment, the increase range of the first index information and the decrease range of the second index information are maximized, so that the effects of optimizing the first index and suppressing the second index are achieved.
In one embodiment, iteratively training the target model according to the updated loss function, and performing a second model parameter update on the target model includes: fixing the updated punishment coefficient, and performing secondary updating on the model parameters of the target model according to the direction of the loss function after the minimum updating; and continuously and circularly updating the model parameters for the second time until the training stopping condition is met, and finishing the training.
Specifically, a group of user data samples is obtained, each user data sample is input into the target model with updated model parameters, and an updated loss function is obtained using the updated penalty coefficient. Then, with the updated penalty coefficient fixed, the model parameters of the target model are updated a second time by back propagation according to the gradient descent algorithm.
In the embodiment, after the penalty coefficient is updated, the model parameter of the target model is updated secondarily by using the updated penalty coefficient and the user data sample, so that the accuracy of the model parameter is improved.
In one embodiment, constructing the loss function according to the first loss function, the second loss function and the penalty coefficients corresponding to the second loss function includes: acquiring a regular penalty term; and constructing a loss function according to the first loss function, the second loss function, the penalty coefficients corresponding to the second loss function and the regular penalty items.
Wherein a regular penalty term is used to prevent overfitting. The regularization penalty term may be an L1 regularization term, an L2 regularization term, or the like.
Specifically, the loss function can be represented by the following formula:
δ(ω, a) = ρ(ω, a) + r · κ(ω, a) + c
wherein δ(ω, a) is the loss function, ρ(ω, a) is the first loss function, κ(ω, a) is the second loss function, r is the penalty coefficient, ω and a are both model parameters, t is the number of user data samples in the set, and c is the regular penalty term.
In the embodiment, the regular punishment item is added in the loss function, so that overfitting is prevented, and the model training effect is improved.
In one embodiment, as shown in fig. 7, there is provided a model processing method including the steps of:
step 702, obtaining a user data sample set and a target model; a first training label and a second training label are respectively and correspondingly arranged on each user data sample in the user data sample set; the first training label corresponds to a first index to be optimized and the second training label corresponds to a second index to be suppressed.
In a specific embodiment, the first indicator is a response rate and the second indicator is a black and gray rate. The response rate is the proportion of users in the user set that respond to the target information. The black and gray rate is the proportion of users in the user set with poor credit. The user with poor credit may be a user with a credit value lower than the credit threshold value, or a user with bad records such as overdue record of repayment and bad account record of loan.
In a particular embodiment, step 704 includes: sequentially taking each user data sample as a current processing sample; extracting more than one piece of feature data from the current processing sample; determining a coding value corresponding to each piece of feature data; generating a characteristic value from the coding values according to their respective weights; and outputting, through the target model, a prediction result corresponding to the characteristic value.
In a specific embodiment, the first index information is obtained by: sequentially taking the prediction results as current processing objects; acquiring the number of positive samples and the number of negative samples of a current processing object under a first index; determining a first proportion of the number of positive samples in the total number of positive samples and a second proportion of the number of negative samples in the total number of negative samples; determining a difference value between the first proportion and the second proportion to obtain a corresponding difference value of the current processing object; and taking the maximum difference value in the difference values corresponding to the prediction results as first index information.
In a specific embodiment, the second index information is obtained by: sequentially taking the prediction results as current processing objects; acquiring the number of positive samples and the number of negative samples of the current processing object under a second index; determining a third proportion of the number of positive samples in the total number of positive samples and a fourth proportion of the number of negative samples in the total number of negative samples; determining a difference value between the third proportion and the fourth proportion to obtain a corresponding difference value of the current processing object; and taking the maximum difference value in the difference values corresponding to the prediction results as second index information.
And step 704, predicting each user data sample through the target model to obtain a prediction result.
Step 706, construct a first loss function based on the difference between each prediction result and the corresponding first training label, and construct a second loss function based on the difference between each prediction result and the corresponding second training label.
Step 708, obtaining a regular penalty term, and constructing a loss function according to the first loss function, the second loss function, and a penalty coefficient and the regular penalty term corresponding to the second loss function.
Step 710, determining first index information and second index information based on the prediction result.
The first index information is used for measuring the distinguishing capacity of the target model for positive and negative samples of the first index; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index.
Step 712, fixing the penalty coefficient, and updating model parameters of the target model according to the direction of the minimized loss function; and continuously and circularly updating the model parameters until the training stopping condition is met, and finishing the training.
And 714, iteratively updating the penalty coefficient by adopting the first index information and the second index information so as to iteratively maximize the difference between the first index information and the second index information in the updating process, and ending the training until the training stopping condition is met.
In a specific embodiment, when the first index information rises and the second index information falls, the sum, or the product, of the rise amplitude of the first index information and the fall amplitude of the second index information may be obtained, and whether the difference between the first index information and the second index information has increased is judged by whether that sum or product has increased. When the first index information remains unchanged and the second index information falls, whether the difference has increased is judged by whether the fall amplitude has increased. When the first index information and the second index information both fall, the ratio of the fall amplitude of the first index information to the fall amplitude of the second index information may be obtained, and whether the difference has increased is judged by whether that ratio has decreased.
Step 716, fixing the updated punishment coefficient, and performing secondary model parameter updating on the target model according to the direction of the loss function after the minimum updating; and continuously and circularly updating the model parameters for the second time until the training stopping condition is met, and finishing the training.
In one particular embodiment, the model processing method may be applied to the credit field. Referring to FIG. 8, FIG. 8 is a flow diagram illustrating training of a target model in one embodiment. Feature data is extracted from the user data samples; the extracted feature data is spliced, derived, screened and converted; the target model is trained and evaluated; and after the evaluation passes, the response result corresponding to user data is determined through the target model. Feature splicing concatenates existing feature data to generate new, meaningful feature data. Feature derivation combines existing feature data to generate new, meaningful feature data. Feature screening selects feature data strongly correlated with distinguishing positive and negative samples of the index to be optimized. Feature conversion may be WOE conversion, which converts the prediction result of the logistic regression model into a standard scorecard format, making the prediction result more intuitive.
In the model processing method, under the same modeling framework, the index to be optimized is fitted based on the idea of adversarial training while the index to be suppressed is suppressed, so that model performance is improved; and when the response result corresponding to user data is determined through the target model, the influence of other indexes among head responding users can be reduced.
In one embodiment, as shown in fig. 9, a user data processing method is provided, which is described by taking the method as an example applied to the server 104 in fig. 1, and includes the following steps:
step 902, obtain the user data to be processed.
Wherein the user data is data reflecting characteristics of the user.
In a particular embodiment, the user data may include user base data, user behavior data, and the like. User base data reflects the user's basic attributes, such as identity data, resource data, and credit data. Identity data includes, for example, gender, age, educational background, occupation, position, and income. Resource data includes, for example, assets and deposits. Credit data includes, for example, loans. User behavior data reflects the user's behavioral characteristics, such as payment behavior data, social behavior data, and browsing behavior data. Payment behavior data includes, for example, repayment behavior data, loan behavior data, investment behavior data, consumption behavior data, and transfer behavior data. Social behavior data includes, for example, social session data, social message posting data, and social message comment data. Browsing behavior data includes, for example, news browsing data, audio-video browsing data, and commodity browsing data.
At step 904, more than one feature data is extracted from the user data to be processed.
Specifically, the feature data for training the model needs to be determined in advance, so as to select feature data strongly correlated with distinguishing positive and negative samples of the index to be optimized. Feature selection may be performed by calculating the Gini coefficient, the IV (Information Value), and the like, or by using models such as LASSO (Least Absolute Shrinkage and Selection Operator), LR (Logistic Regression), and RF (Random Forest).
Take feature selection by IV value as an example. First, select a feature and group the user data samples by its values. Taking age as the feature, the user data samples are grouped by age into a 0-10 group, a 10-18 group, an 18-35 group, and so on. The WOE value of each group is then calculated; the WOE value reflects the difference between each group's ratio of negative to positive samples and the overall ratio of negative to positive samples, and can therefore be considered to reflect the influence of the feature value on the training target. The IV value is calculated from the WOE values; it is in effect a weighted sum of the WOE values and eliminates errors caused by differences in group sizes. Finally, the feature data are sorted by IV value, and features are selected from high to low.
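The WOE and IV calculation described above can be sketched as follows (a minimal illustration; the function and variable names are not from the patent, and refinements such as smoothing for empty groups are omitted):

```python
import math

def woe_iv(groups):
    """Compute the WOE value of each group and the IV value of the feature.

    `groups` maps a group label (e.g. an age bin) to a tuple
    (positive_count, negative_count) of samples in that group.
    """
    total_pos = sum(p for p, n in groups.values())
    total_neg = sum(n for p, n in groups.values())
    woe, iv = {}, 0.0
    for label, (p, n) in groups.items():
        pos_rate = p / total_pos          # share of all positives in this group
        neg_rate = n / total_neg          # share of all negatives in this group
        w = math.log(pos_rate / neg_rate)
        woe[label] = w
        iv += (pos_rate - neg_rate) * w   # IV is a weighted sum of WOE values
    return woe, iv
```

Features would then be sorted by their IV values and selected from high to low.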
And 906, obtaining a response result corresponding to the user data to be processed according to the more than one characteristic data through the target model.
And the target model predicts whether the user responds to the target information or not based on the user data to obtain a response result. The target information may be information to be pushed. For example, it is predicted through the target model whether the user will click to see the information to be pushed, whether the user will receive the card to be pushed, whether the user will purchase the goods to be pushed, whether the user will purchase the financial product to be pushed, whether the user will use the business service to be pushed, and the like. The target users with high information response rate to be pushed are obtained through the screening of the target model, and then the target users are pushed directionally, so that the information conversion rate is improved.
In a particular embodiment, the target model may be a machine learning model established by linear regression, logistic regression, decision trees, and the like.
In a particular embodiment, the target model may be a scorecard model. The scoring card model is a generalized linear model of two classification variables commonly used in the fields of credit risk assessment, financial risk control and the like.
Specifically, more than one feature data is extracted from the user data to be processed through the target model, the encoded value corresponding to each of the more than one feature data is determined, a feature value is generated from the encoded values according to their corresponding weights, and the response result corresponding to the feature value is output through the target model.
Wherein the encoded value may be a WOE (Weight of Evidence) value. By performing WOE conversion on the feature data, the prediction result of the logistic regression model can be converted into a standard scorecard format, that is, the logistic regression score is converted into a specific score, so that the prediction result reflects more intuitively how likely the user is to respond to the target information.
In a specific embodiment, taking a logistic regression model as an example, the target model can be represented by the following formula:
p = 1 / (1 + e^(-(ωx + a)))
wherein p is the response result, x is the characteristic value, and ω and a are both model parameters.
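The formula above is the standard logistic (sigmoid) form and can be evaluated directly (a sketch; the parameter names simply mirror the formula):

```python
import math

def predict_response(x, omega, a):
    """Response score p = 1 / (1 + e^{-(omega * x + a)}) for feature value x."""
    return 1.0 / (1.0 + math.exp(-(omega * x + a)))
```

A feature value of 0 with a = 0 yields the neutral score 0.5; larger ωx + a pushes the score toward 1.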
The target model is obtained through iterative training according to a loss function, and the difference between the first index information and the second index information is iteratively maximized in the training process; the first index information is used for measuring the distinguishing capacity of the target model for the positive and negative samples of the first index to be optimized; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index to be suppressed.
In one embodiment, a sample set of user data and a target model are obtained; a first training label and a second training label are respectively and correspondingly arranged on each user data sample in the user data sample set; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed; the target model is used for determining a corresponding response result of the user data.
Wherein the user data sample set is a data set used for training the target model. The user data is data reflecting the characteristics of the user.
When the target model is trained, feature data for training the model needs to be screened from the user data samples; this feature data lets the model learn to predict whether a user will respond to the target information. However, some feature data is important not only for the response index but also for other indexes. As a result, the target users obtained by model screening not only respond to the target information at a high rate but may also stand out on other indexes. This affects model performance, especially when those other indexes are ones that need to be suppressed.
Based on this, the influence of other indexes on the model prediction result is usually reduced by deleting the feature data important to other indexes and retaining the feature data important to the response index. However, since the feature data important to other indexes overlaps with the feature data important to the response index, model performance is also degraded.
In the method and the device, the indexes to be optimized are fitted and the indexes to be inhibited are inhibited under the same modeling framework, so that the performance of the model is improved.
Wherein, the first index and the second index have a correlation relationship. Optionally, the first index and the second index are in positive correlation, that is, the first index is increased, and the second index is increased, but the first index is an index to be optimized, and the second index is an index to be inhibited.
In a specific embodiment, the first indicator is a response rate and the second indicator is a black and gray rate. The response rate is the proportion of users in the user set who respond to the target information. The black and gray rate is the proportion of users in the user set with poor credit. A user with poor credit may be a user whose credit value is below a credit threshold, or a user with bad records such as overdue repayment records or loan bad-debt records.
Specifically, each user data sample respectively corresponds to a first training label and a second training label, and the first training label corresponds to a first index to be optimized, such as a label of response or non-response; the second training label corresponds to a second index to be suppressed, such as a label as black gray or non-black gray.
Wherein, the response result may be a score, and the higher the score is, the more likely the user responds to the target information is.
Specifically, the scores of the users are determined through a target model, and the users are sorted from high to low according to the scores so as to select the top-ranked users as target users. Referring to fig. 3, fig. 3 is a graph showing the relationship between the response rate and the black and gray rate in one embodiment. As can be seen from fig. 3, among the filtered target users (for example, the top one million users), not only is the response rate high, but the black and gray rate is also high.
In the application, in order to reduce the black and gray rate among head responding users, the index to be optimized and the index to be suppressed are modeled uniformly: the index to be optimized is trained as the main target, the index to be suppressed is added to the loss function as a penalty term, and the target model is trained iteratively according to the loss function.
In one embodiment, each user data sample is predicted through the target model to obtain a prediction result.
Specifically, the server inputs each user data sample into the target model, and predicts each user data sample through the target model to obtain a prediction result.
In one embodiment, each user data sample is sequentially taken as a current processing sample; extracting more than one feature data from a currently processed sample; determining a coding value corresponding to each of more than one feature data; generating characteristic values of the coding values according to the respective corresponding weights; and outputting a corresponding prediction result of the characteristic value through the target model.
Wherein the encoded value may be a WOE (Weight of Evidence) value. By performing WOE conversion on the feature data, the prediction result of the logistic regression model can be converted into a standard scorecard format, that is, the logistic regression score is converted into a specific score, so that the prediction result reflects more intuitively how likely the user is to respond to the target information.
Specifically, referring to fig. 4, fig. 4 is a schematic flow chart of feature data conversion in one embodiment. More than one feature data is extracted from a user data sample, WOE conversion is performed on each to obtain the corresponding WOE values, a feature value is generated from the WOE values according to their corresponding weights, and the target model outputs a prediction result from the feature value.
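The combination step in fig. 4, in which the per-feature WOE encodings are merged into a single feature value by a weighted sum, can be sketched as follows (names are illustrative, not from the patent):

```python
def feature_value(woe_values, weights):
    """Merge WOE-encoded features into one feature value by a weighted sum.

    `woe_values` holds one WOE encoding per extracted feature;
    `weights` holds the weight corresponding to each encoding.
    """
    return sum(w * v for w, v in zip(weights, woe_values))
```

The resulting feature value is what the target model's logistic formula below takes as its input x.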
In a specific embodiment, taking a logistic regression model as an example, the target model can be represented by the following formula:
p = 1 / (1 + e^(-(ωx + a)))
wherein p is a prediction result, x is a characteristic value, and both omega and a are model parameters.
In this embodiment, when the target model is trained, feature data strongly correlated with distinguishing positive and negative samples of the index to be optimized is selected, which improves the training effect; and WOE conversion of the feature data converts the prediction result of the regression model into a standard scorecard format, making the prediction result more intuitive.
In one embodiment, a loss function is constructed from each prediction and the corresponding first and second training labels.
Specifically, the server constructs a loss function according to each prediction result and the corresponding first training label and second training label.
In one embodiment, a first loss function is constructed based on differences between each prediction result and a corresponding first training label, and a second loss function is constructed based on differences between each prediction result and a corresponding second training label; and constructing a loss function according to the first loss function, the second loss function and the penalty coefficient corresponding to the second loss function.
Wherein the penalty factor acts on the second loss function.
Specifically, the loss function can be represented by the following formula:
δ(ω, a) = Σ_{i∈t} [ρ(ω, a) + r · κ(ω, a)]
wherein δ(ω, a) is the loss function, ρ(ω, a) is the first loss function, κ(ω, a) is the second loss function, r is the penalty coefficient, ω and a are both model parameters, and t is the index set of the user data samples.
Specifically, the loss function may employ a standard cross-entropy loss function, a squared loss function, a focal loss function, or the like.
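Taking the cross-entropy option, the combined loss with the penalty term can be sketched as follows (an illustration only; the helper names, the per-sample summation, and the optional regular penalty argument are assumptions consistent with the formula above):

```python
import math

def cross_entropy(p, y):
    """Standard cross-entropy between a prediction p in (0, 1) and label y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def combined_loss(preds, labels1, labels2, r, reg=0.0):
    """delta = sum over samples of [rho + r * kappa] (+ regular penalty term).

    rho is the loss against the first training labels (index to be optimized);
    kappa is the loss against the second training labels (index to be
    suppressed); r is the penalty coefficient acting on kappa.
    """
    rho = sum(cross_entropy(p, y) for p, y in zip(preds, labels1))
    kappa = sum(cross_entropy(p, y) for p, y in zip(preds, labels2))
    return rho + r * kappa + reg
```

Setting r = 0 recovers training on the first index alone; increasing r strengthens the suppression of the second index.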
In this embodiment, the index to be suppressed is added to the loss function as a penalty term, so that the index to be optimized and the index to be suppressed are trained simultaneously under the same modeling framework.
In one embodiment, the first index information and the second index information are determined based on the prediction results.
Specifically, the server determines first index information and second index information based on the prediction result.
The first index information is used for measuring the distinguishing capacity of the target model for positive and negative samples of the first index; the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index.
In a specific embodiment, the first index information and the second index information may be KS values (Kolmogorov-Smirnov, discriminative power index). The KS value is an indicator of the ability of the evaluation model to discriminate between positive and negative samples. The larger the KS value, the more discriminating the model is from positive and negative samples.
In one embodiment, the first index information is obtained by: sequentially taking the prediction results as current processing objects; acquiring the number of positive samples and the number of negative samples of a current processing object under a first index; determining a first proportion of the number of positive samples in the total number of positive samples and a second proportion of the number of negative samples in the total number of negative samples; determining a difference value between the first proportion and the second proportion to obtain a corresponding difference value of the current processing object; and taking the maximum difference value in the difference values corresponding to the prediction results as first index information.
It will be appreciated that when comparing the differences to determine the maximum value, the absolute values of the differences are compared.
Specifically, the first index information may be calculated by the following formula:
KS = max_k | Good_k / Good_total − Bad_k / Bad_total |
wherein Good denotes positive samples, Bad denotes negative samples, Good_k is the number of positive samples when the prediction result is k, Bad_k is the number of negative samples when the prediction result is k, Good_total is the total number of positive samples, and Bad_total is the total number of negative samples.
In one embodiment, the second index information is obtained by: sequentially taking the prediction results as current processing objects; acquiring the number of positive samples and the number of negative samples of the current processing object under a second index; determining a third proportion of the number of positive samples in the total number of positive samples and a fourth proportion of the number of negative samples in the total number of negative samples; determining a difference value between the third proportion and the fourth proportion to obtain a corresponding difference value of the current processing object; and taking the maximum difference value in the difference values corresponding to the prediction results as second index information.
Based on this, the first index information evaluates how well the model separates responding users from non-responding users, and the second index information evaluates how well it separates black-and-gray users from non-black-and-gray users.
In this embodiment, the first index information and the second index information are constructed by using the KS value, and the penalty coefficient is updated by using the first index information and the second index information.
In one embodiment, the target model is iteratively trained according to a loss function to iteratively maximize a difference between the first metric information and the second metric information during the training process until an iteration stop condition is reached.
Specifically, the server iteratively trains the target model according to the loss function, and iteratively maximizes a difference between the first index information and the second index information in the training process until an iteration stop condition is reached.
In one embodiment, the target model is iteratively trained according to a loss function, and model parameters of the target model are updated; iteratively updating the penalty coefficient according to the first index information and the second index information to iteratively maximize a difference between the first index information and the second index information in an updating process; and iteratively training the target model according to the updated loss function, and performing secondary model parameter updating on the target model.
Specifically, the server updates model parameters of the target model by using a loss function; then, the server updates the penalty coefficient by using the first index information and the second index information, and iteratively maximizes the difference between the first index information and the second index information in the updating process; and secondly, the server updates the model parameters of the target model secondarily by using the updated loss function.
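The alternation just described can be sketched as three phases (a structural sketch only; the callables and the fixed round count stand in for the real gradient updates and training-stop conditions, which the text leaves abstract):

```python
def train_target_model(update_params, update_penalty, stop_after=3):
    """Three-phase alternating training.

    update_params: one model-parameter update minimising the loss.
    update_penalty: one penalty-coefficient update widening the gap
        between the first and second index information (KS values).
    stop_after: placeholder for the training-stop condition.
    """
    # Phase 1: fix the penalty coefficient, minimise the loss repeatedly.
    for _ in range(stop_after):
        update_params()
    # Phase 2: fix the model, update the penalty coefficient.
    update_penalty()
    # Phase 3: fix the updated penalty coefficient, minimise again.
    for _ in range(stop_after):
        update_params()
```

In a real implementation each callable would close over the model, the loss function, and the index information computed on the current predictions.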
In a specific embodiment, referring to fig. 5, fig. 5 is a schematic diagram of updating the penalty coefficient in one embodiment. The difference between the first index information and the second index information may be maximized by increasing the first index information while decreasing the second; by keeping the first index information unchanged while decreasing the second; or by decreasing both, with the first index information falling by less than the second.
In the application, the black and gray rate of head responding users needs to be reduced. Therefore the first index information is increased while the second is decreased; or the first index information is kept unchanged while the second is decreased; or both are decreased, with the first falling by less than the second. This strengthens the target model's ability to recognize responding users while weakening its ability to recognize black-and-gray users, achieving the goal of reducing the black and gray rate of head responding users.
In a specific embodiment, when the first index information rises and the second index information falls, the sum, or the product, of the rise amplitude of the first index information and the fall amplitude of the second index information may be obtained, and whether the difference between the first index information and the second index information has increased is judged by whether that sum or product has increased. When the first index information remains unchanged and the second index information falls, whether the difference has increased is judged by whether the fall amplitude has increased. When the first index information and the second index information both fall, the ratio of the fall amplitude of the first index information to the fall amplitude of the second index information may be obtained, and whether the difference has increased is judged by whether that ratio has decreased.
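The three cases above can be folded into a single score whose growth signals a widening gap (a sketch; the function and variable names are illustrative, not from the patent):

```python
def gap_score(prev_ks1, prev_ks2, ks1, ks2):
    """Score the change between two training iterations.

    ks1/ks2 are the current first/second index information (KS values),
    prev_ks1/prev_ks2 the values from the previous iteration.
    Returns None when none of the three described cases applies.
    """
    rise1 = ks1 - prev_ks1          # change of the first index information
    fall2 = prev_ks2 - ks2          # fall amplitude of the second
    if rise1 > 0 and fall2 > 0:     # case 1: sum of the two amplitudes
        return rise1 + fall2
    if rise1 == 0 and fall2 > 0:    # case 2: the fall amplitude alone
        return fall2
    if rise1 < 0 and fall2 > 0:     # case 3: both fall; a smaller ratio of
        return rise1 / fall2        # first's fall to second's scores higher
    return None
```

Comparing the score across iterations then tells the penalty-coefficient update whether the gap is still widening.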
Referring to fig. 6, fig. 6 is a graph of the relationship between the response rate and the black and gray rate in one embodiment. As can be seen from fig. 6, among the filtered target users (such as the top one million users), the response rate is high but the black and gray rate is low.
In this embodiment, under the same modeling framework, the index to be optimized is fitted based on the idea of adversarial training while the index to be suppressed is suppressed, so that model performance is improved.
In the user data processing method, user data to be processed is obtained, more than one feature data is extracted from it, and the response result corresponding to the user data is obtained through the target model according to the feature data. The target model is obtained through iterative training according to a loss function, with the difference between the first index information and the second index information iteratively maximized during training; the first index is the index to be optimized and the second index is the index to be suppressed. Thus, under the same modeling framework, the index to be optimized is fitted based on the idea of adversarial training while the index to be suppressed is suppressed, improving model performance; and when the response result corresponding to user data is determined through the target model, the influence of other indexes among head responding users can be reduced.
In one embodiment, the response result is a score; the method further comprises the following steps: acquiring more than one user data to be processed; obtaining corresponding scores of user data to be processed through the target model; sorting user identifications corresponding to the user data to be processed according to the scores from high to low; and selecting a preset number of user identifications from the sorting result to carry out information push.
Wherein the higher the score, the more likely the user is to respond to the pushed information, while the black and gray rate remains low.
Specifically, the scores of the users are determined through the target model, and the users are sorted from high to low according to the scores so as to select the users with the top rank for information push.
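The ranking and selection step can be sketched as follows (names are illustrative only):

```python
def select_push_targets(scores, top_n):
    """Rank user identifications by model score, high to low, and keep the
    preset number `top_n` of users for targeted information push.

    `scores` maps a user identification to its score from the target model.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]
```

In production this top-N cut would typically be taken over the whole candidate user set scored by the trained target model.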
In one embodiment, the push information may be information, coupons, merchandise, financial products, business services, and the like.
In the embodiment, the target user with high response rate of the information to be pushed is obtained by screening the target model, and then the target user is pushed directionally, so that the information conversion rate is improved.
It should be understood that, although the steps in the flowcharts of fig. 2, 7 and 9 are shown in sequence as indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2, 7 and 9 may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and not necessarily in sequence; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 10, there is provided a model processing apparatus, which may be a part of a computer device using a software module or a hardware module, or a combination of the two, and specifically includes: an obtaining module 1002, a predicting module 1004, a constructing module 1006, a determining module 1008, and a training module 1010, wherein:
an obtaining module 1002, configured to obtain a user data sample set and a target model; a first training label and a second training label are respectively and correspondingly arranged on each user data sample in the user data sample set; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed;
the prediction module 1004 is used for predicting each user data sample through the target model to obtain a prediction result;
a constructing module 1006, configured to construct a loss function according to each prediction result and the corresponding first training label and second training label;
a determination module 1008 for determining first index information and second index information based on the prediction result; the first index information is used for measuring the distinguishing capability of the target model on positive and negative samples of the first index; the second index information is used for measuring the distinguishing capacity of the target model for positive and negative samples of the second index;
a training module 1010, configured to iteratively train the target model according to a loss function, so as to iteratively maximize a difference between the first index information and the second index information in a training process until an iteration stop condition is reached; the target model is used for determining a corresponding response result of the user data.
In one embodiment, the building module 1006 is further configured to: constructing a first loss function based on the difference between each prediction result and the corresponding first training label, and constructing a second loss function based on the difference between each prediction result and the corresponding second training label; constructing a loss function according to the first loss function, the second loss function and the penalty coefficient corresponding to the second loss function;
a training module 1010 further configured to: iteratively training the target model according to the loss function, and updating model parameters of the target model; iteratively updating the penalty coefficient according to the first index information and the second index information to iteratively maximize a difference between the first index information and the second index information in an updating process; and iteratively training the target model according to the updated loss function, and performing secondary model parameter updating on the target model.
In one embodiment, the training module 1010 is further configured to: fixing a penalty coefficient, and updating model parameters of the target model according to the direction of the minimized loss function; and continuously and circularly updating the model parameters until the training stopping condition is met, and finishing the training.
In one embodiment, the training module 1010 is further configured to: the first index information and the second index information are used as a reference, and the difference between the first index information and the second index information is represented through the increase amplitude of the first index information and the decrease amplitude of the second index information; and iteratively updating the penalty coefficient by adopting the first index information and the second index information so as to iteratively maximize the increase amplitude and the decrease amplitude in the updating process until the training stopping condition is met, and ending the training.
In one embodiment, the training module 1010 is further configured to: taking the first index information and the second index information as references, and representing the difference between the first index information and the second index information by the ratio between the decrease amplitude of the first index information and the decrease amplitude of the second index information; and iteratively updating the penalty coefficient using the first index information and the second index information, so as to iteratively minimize the ratio during the updating, until the training stop condition is met and the training ends.
In one embodiment, the training module 1010 is further configured to: fixing the updated penalty coefficient, and updating the model parameters of the target model a second time in the direction of minimizing the updated loss function; and repeating the second model parameter update in a loop until the training stop condition is met, and then ending the training.
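The alternating scheme of these embodiments, fixing the penalty coefficient while updating the model parameters and then adjusting the penalty coefficient from the two pieces of index information, might be sketched as follows. The multiplicative update rule, the step size, and all names are hypothetical, not taken from the patent:

```python
def update_penalty(penalty, first_index, second_index, step=0.1):
    # If the model still separates the second (suppressed) index better
    # than the first (optimized) index, raise the penalty coefficient to
    # suppress it harder; otherwise relax the penalty slightly.
    gap = first_index - second_index
    if gap < 0:
        return penalty * (1 + step)
    return max(penalty * (1 - step), 0.0)

# One outer-loop step: the second index information (0.60) still exceeds
# the first (0.45), so the penalty coefficient is increased.
penalty = 1.0
penalty = update_penalty(penalty, first_index=0.45, second_index=0.60)
```

In a full training loop this update would alternate with the model parameter updates of the previous embodiment until the training stop condition is met.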
In one embodiment, the building module 1006 is further configured to: acquiring a regularization penalty term; and constructing the loss function according to the first loss function, the second loss function, the penalty coefficient corresponding to the second loss function, and the regularization penalty term.
In one embodiment, the prediction module 1004 is further configured to: sequentially taking each user data sample as the current processing sample; extracting more than one feature data from the current processing sample; determining a coding value corresponding to each of the more than one feature data; generating a characteristic value from the coding values according to their respective weights; and outputting a prediction result corresponding to the characteristic value through the target model.
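A minimal sketch of the per-sample prediction step described in this embodiment, assuming a lookup-table encoding, a weighted sum as the characteristic value, and a sigmoid output layer; the table entries, weights, and feature names are all illustrative assumptions:

```python
import math

def encode(feature_data, coding_table):
    # Map each raw feature to its coding value via a lookup table.
    return [coding_table[f] for f in feature_data]

def characteristic_value(coding_values, weights):
    # Combine the coding values by their respective weights.
    return sum(c * w for c, w in zip(coding_values, weights))

def predict(feature_data, coding_table, weights):
    # Output a prediction result for the characteristic value
    # (sigmoid output is an assumption, not specified by the patent).
    z = characteristic_value(encode(feature_data, coding_table), weights)
    return 1.0 / (1.0 + math.exp(-z))

coding_table = {"age_30_40": 0.8, "city_tier_1": 1.2, "active_daily": 1.5}
weights = [0.4, 0.3, 0.6]
score = predict(["age_30_40", "city_tier_1", "active_daily"], coding_table, weights)
```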
In one embodiment, the determining module 1008 is further configured to: sequentially taking each prediction result as the current processing object; acquiring the number of positive samples and the number of negative samples of the current processing object under the first index; determining a first proportion of the number of positive samples in the total number of positive samples, and a second proportion of the number of negative samples in the total number of negative samples; determining the difference between the first proportion and the second proportion to obtain the difference value corresponding to the current processing object; and taking the maximum difference value among the difference values corresponding to the prediction results as the first index information.
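The procedure in this embodiment, accumulating positive-sample and negative-sample proportions at each prediction result and taking the maximum gap as the first index information, matches the Kolmogorov-Smirnov (KS) statistic commonly used to measure how well a model separates positive from negative samples. A sketch, with illustrative scores and labels:

```python
import numpy as np

def ks_statistic(scores, labels):
    # Sort by score; at each prediction result accumulate the proportion
    # of all positives and of all negatives scored at or below it, and
    # return the largest gap between the two proportions.
    order = np.argsort(scores)
    labels = np.asarray(labels)[order]
    pos_total = labels.sum()
    neg_total = len(labels) - pos_total
    cum_pos = np.cumsum(labels) / pos_total      # first proportion per threshold
    cum_neg = np.cumsum(1 - labels) / neg_total  # second proportion per threshold
    return float(np.max(np.abs(cum_pos - cum_neg)))

scores = [0.1, 0.4, 0.35, 0.8]  # predicted results for four samples
labels = [0, 0, 1, 1]           # positive/negative labels under the first index
ks = ks_statistic(scores, labels)
```

A KS value of 1.0 would indicate perfect separation of positives and negatives, and 0.0 no separation at all; the second index information can be computed the same way using the labels of the second index.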
The model processing device predicts each user data sample through the target model to obtain a prediction result, constructs a loss function according to each prediction result and the corresponding first and second training labels, determines first index information and second index information based on the prediction results, and iteratively trains the target model according to the loss function while iteratively maximizing the difference between the first index information and the second index information during training. Because the first training label corresponds to the first index to be optimized and the second training label corresponds to the second index to be suppressed, the index to be optimized is fitted and the index to be suppressed is suppressed at the same time within the same modeling framework, based on the idea of adversarial training, which improves the performance of the model. In addition, when a response result corresponding to user data is determined through the target model, the influence of the other indexes among the top-responding users can be reduced.
In one embodiment, as shown in fig. 11, a user data processing apparatus is provided. The apparatus may be implemented as all or part of a computer device by software modules, hardware modules, or a combination of the two, and specifically includes an obtaining module 1102, an extraction module 1104, and a processing module 1106, wherein:
an obtaining module 1102, configured to obtain user data to be processed;
an extraction module 1104, configured to extract more than one feature data from the user data to be processed;
a processing module 1106, configured to obtain, according to the more than one feature data, a response result corresponding to the to-be-processed user data through the target model;
the target model is obtained through iterative training according to a loss function, and the difference between the first index information and the second index information is iteratively maximized in the training process; the first index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the first index to be optimized; and the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index to be suppressed.
In one embodiment, the response result is a score, and the user data processing apparatus further includes a pushing module configured to: acquire more than one piece of user data to be processed; obtain the score corresponding to each piece of user data to be processed through the target model; sort the user identifications corresponding to the user data to be processed in descending order of score; and select a preset number of user identifications from the sorting result for information push.
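The push flow of this embodiment can be sketched as follows, with a stand-in scoring function in place of the trained target model; the user identifiers, data, and function names are all hypothetical:

```python
def select_users_for_push(user_data, score_fn, top_n):
    # Score each user's data, sort user identifiers by score in
    # descending order, and take a preset number for information push.
    scored = [(uid, score_fn(data)) for uid, data in user_data.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [uid for uid, _ in scored[:top_n]]

# Stand-in: each user's "data" is a single precomputed score.
user_data = {"u1": [0.2], "u2": [0.9], "u3": [0.5]}
picked = select_users_for_push(user_data, score_fn=lambda d: d[0], top_n=2)
```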
The user data processing device acquires user data to be processed, extracts more than one feature data from the user data to be processed, and obtains, through the target model and according to the more than one feature data, a response result corresponding to the user data to be processed. The target model is obtained through iterative training according to a loss function, and the difference between the first index information and the second index information is iteratively maximized during training, where the first index is the index to be optimized and the second index is the index to be suppressed. Therefore, within the same modeling framework, the index to be optimized is fitted and the index to be suppressed is suppressed at the same time, based on the idea of adversarial training, which improves the performance of the model. In addition, when a response result corresponding to user data is determined through the target model, the influence of the other indexes among the top-responding users can be reduced.
For specific limitations of the model processing apparatus and the user data processing apparatus, reference may be made to the limitations of the model processing method and the user data processing method above, which are not repeated here. Each module in the model processing apparatus and the user data processing apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or independent of, a processor in the computer device, or stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 12. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used to store model process data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a model processing method.
Those skilled in the art will appreciate that the architecture shown in fig. 12 is merely a block diagram of a part of the structure related to the solution of the present application and does not constitute a limitation on the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, or optical storage. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments express only several implementations of the present application, and their description is specific and detailed, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (15)

1. A method of model processing, the method comprising:
acquiring a user data sample set and a target model; a first training label and a second training label are respectively and correspondingly arranged on each user data sample in the user data sample set; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed;
predicting each user data sample through the target model to obtain a prediction result;
constructing a loss function according to each prediction result and the corresponding first training label and second training label;
determining first index information and second index information based on the prediction result; the first index information is used for measuring the distinguishing capability of the target model on positive and negative samples of the first index; the second index information is used for measuring the distinguishing capacity of the target model for positive and negative samples of the second index;
iteratively training the target model according to the loss function to iteratively maximize a difference between the first index information and the second index information in a training process until an iteration stop condition is reached; the target model is used for determining a response result corresponding to the user data.
2. The method of claim 1, wherein constructing a loss function based on each of the predictors and the corresponding first and second training labels comprises:
constructing a first loss function based on a difference between each of the prediction results and a corresponding first training label, and constructing a second loss function based on a difference between each of the prediction results and a corresponding second training label;
constructing the loss function according to the first loss function, the second loss function, and a penalty coefficient corresponding to the second loss function;
iteratively training the target model according to the loss function to iteratively maximize a difference between the first index information and the second index information in a training process until an iteration stop condition is reached, including:
iteratively training the target model according to the loss function, and updating model parameters of the target model;
iteratively updating the penalty coefficient according to the first index information and the second index information to iteratively maximize a difference between the first index information and the second index information in an updating process;
and iteratively training the target model according to the updated loss function, and performing secondary model parameter updating on the target model.
3. The method of claim 2, wherein iteratively training the target model according to the loss function, performing model parameter updates on the target model, comprises:
fixing the penalty coefficient, and updating model parameters of the target model according to the direction of minimizing the loss function;
and repeating the model parameter update in a loop until the training stop condition is met, and ending the training.
4. The method of claim 2, wherein iteratively updating the penalty factor based on the first metric information and the second metric information to iteratively maximize a difference between the first metric information and the second metric information in an update process comprises:
with the first index information and the second index information as a reference, representing a difference between the first index information and the second index information through an increase amplitude of the first index information and a decrease amplitude of the second index information;
and iteratively updating the penalty coefficient by adopting the first index information and the second index information so as to iteratively maximize the increase amplitude and the decrease amplitude in the updating process until the training stopping condition is met, and ending the training.
5. The method of claim 2, wherein iteratively updating the penalty factor based on the first metric information and the second metric information to iteratively maximize a difference between the first metric information and the second metric information in an update process comprises:
with the first index information and the second index information as a reference, representing a difference between the first index information and the second index information by a ratio between a descending amplitude of the first index information and a descending amplitude of the second index information;
and iteratively updating the penalty coefficient by adopting the first index information and the second index information so as to iteratively minimize the ratio in the updating process until the training stop condition is met, and ending the training.
6. The method of claim 2, wherein iteratively training the target model according to the updated loss function, performing a second model parameter update on the target model, comprises:
fixing the updated penalty coefficient, and performing a second model parameter update on the target model in the direction of minimizing the updated loss function;
and repeating the second model parameter update in a loop until the training stop condition is met, and ending the training.
7. The method of claim 2, wherein constructing the loss function according to the first loss function, the second loss function, and the penalty coefficient corresponding to the second loss function comprises:
acquiring a regularization penalty term;
and constructing the loss function according to the first loss function, the second loss function, the penalty coefficient corresponding to the second loss function, and the regularization penalty term.
8. The method of claim 1, wherein predicting each of the user data samples by the target model to obtain a prediction result comprises:
sequentially taking each user data sample as a current processing sample;
extracting more than one feature data from the currently processed sample;
determining a coding value corresponding to each of the more than one feature data;
generating characteristic values of the coded values according to the respective corresponding weights;
and outputting the corresponding prediction result of the characteristic value through the target model.
9. The method of claim 1, wherein the first indicator information is obtained by:
sequentially using the prediction results as current processing objects;
acquiring the number of positive samples and the number of negative samples of the current processing object under the first index;
determining a first proportion of the number of positive samples in the total number of positive samples and a second proportion of the number of negative samples in the total number of negative samples;
determining a difference value between the first proportion and the second proportion to obtain a corresponding difference value of the current processing object;
and taking the largest difference value in the difference values corresponding to the prediction results as the first index information.
10. A method of processing user data, the method comprising:
acquiring user data to be processed;
extracting more than one feature data from the user data to be processed;
obtaining a response result corresponding to the user data to be processed according to the more than one feature data through a target model;
the target model is obtained through iterative training according to a loss function, and the difference between the first index information and the second index information is iteratively maximized in the training process; the first index information is used for measuring the distinguishing capacity of the target model for the positive and negative samples of the first index to be optimized; and the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index to be suppressed.
11. The method of claim 10, wherein the response result is a score;
the method further comprises the following steps:
acquiring more than one user data to be processed;
obtaining corresponding scores of the user data to be processed through the target model;
sorting the user identifications corresponding to the user data to be processed according to the scores from high to low;
and selecting a preset number of user identifications from the sorting result to carry out information push.
12. A model processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a user data sample set and a target model; a first training label and a second training label are respectively and correspondingly arranged on each user data sample in the user data sample set; the first training label corresponds to a first index to be optimized, and the second training label corresponds to a second index to be suppressed;
the prediction module is used for predicting each user data sample through the target model to obtain a prediction result;
the construction module is used for constructing a loss function according to each prediction result and the corresponding first training label and second training label;
a determination module for determining first index information and second index information based on the prediction result; the first index information is used for measuring the distinguishing capability of the target model on positive and negative samples of the first index; the second index information is used for measuring the distinguishing capacity of the target model for positive and negative samples of the second index;
a training module, configured to iteratively train the target model according to the loss function, so as to iteratively maximize a difference between the first index information and the second index information in a training process until an iteration stop condition is reached; the target model is used for determining a response result corresponding to the user data.
13. A user data processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring user data to be processed;
the extraction module is used for extracting more than one feature data from the user data to be processed;
the processing module is used for obtaining a response result corresponding to the user data to be processed according to the more than one characteristic data through the target model;
the target model is obtained through iterative training according to a loss function, and the difference between the first index information and the second index information is iteratively maximized in the training process; the first index information is used for measuring the distinguishing capacity of the target model for the positive and negative samples of the first index to be optimized; and the second index information is used for measuring the distinguishing capability of the target model for positive and negative samples of the second index to be suppressed.
14. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 11 when executing the computer program.
15. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 11.
CN202010662448.5A 2020-07-10 2020-07-10 Model processing method, user data processing method and device and computer equipment Active CN112819024B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010662448.5A CN112819024B (en) 2020-07-10 2020-07-10 Model processing method, user data processing method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010662448.5A CN112819024B (en) 2020-07-10 2020-07-10 Model processing method, user data processing method and device and computer equipment

Publications (2)

Publication Number Publication Date
CN112819024A true CN112819024A (en) 2021-05-18
CN112819024B CN112819024B (en) 2024-02-13

Family

ID=75853013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010662448.5A Active CN112819024B (en) 2020-07-10 2020-07-10 Model processing method, user data processing method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN112819024B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313582A (en) * 2021-06-25 2021-08-27 上海冰鉴信息科技有限公司 Guest refusing and reflashing model training method and device and electronic equipment
CN114241264A (en) * 2021-12-17 2022-03-25 深圳尚米网络技术有限公司 User discrimination model training method, user discrimination method and related device
CN114792173A (en) * 2022-06-20 2022-07-26 支付宝(杭州)信息技术有限公司 Prediction model training method and device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180308112A1 (en) * 2017-04-22 2018-10-25 Panjiva, Inc. Nowcasting Abstracted Census from Individual Customs Transaction Records
CN109284417A (en) * 2018-08-27 2019-01-29 广州飞磨科技有限公司 Video pushing method, device, computer equipment and storage medium
CN110414432A (en) * 2019-07-29 2019-11-05 腾讯科技(深圳)有限公司 Training method, object identifying method and the corresponding device of Object identifying model
WO2019237523A1 (en) * 2018-06-11 2019-12-19 平安科技(深圳)有限公司 Safety risk evaluation method and apparatus, computer device, and storage medium
CN110825969A (en) * 2019-11-07 2020-02-21 腾讯科技(深圳)有限公司 Data processing method, device, terminal and storage medium
CN110866609A (en) * 2019-11-08 2020-03-06 腾讯科技(深圳)有限公司 Interpretation information acquisition method, device, server and storage medium
CN111105144A (en) * 2019-11-26 2020-05-05 苏宁金融科技(南京)有限公司 Data processing method and device and target object risk monitoring method
CN111126515A (en) * 2020-03-30 2020-05-08 腾讯科技(深圳)有限公司 Model training method based on artificial intelligence and related device
CN111191092A (en) * 2019-12-31 2020-05-22 腾讯科技(深圳)有限公司 Portrait data processing method and portrait model training method
CN111243576A (en) * 2020-01-16 2020-06-05 腾讯科技(深圳)有限公司 Speech recognition and model training method, device, equipment and storage medium
WO2020113468A1 (en) * 2018-12-05 2020-06-11 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for grounding a target video clip in a video
CN111310053A (en) * 2020-03-03 2020-06-19 上海喜马拉雅科技有限公司 Information recommendation method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG KAI et al.: "Catastrophe Laws of Coal and Gas Outburst", China University of Geosciences Press, page 2 *

Also Published As

Publication number Publication date
CN112819024B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
Tomasevic et al. An overview and comparison of supervised data mining techniques for student exam performance prediction
CN110717098B (en) Meta-path-based context-aware user modeling method and sequence recommendation method
CN111222332B (en) Commodity recommendation method combining attention network and user emotion
Menon et al. Predicting accurate probabilities with a ranking loss
CN111538912A (en) Content recommendation method, device, equipment and readable storage medium
CN111966914B (en) Content recommendation method and device based on artificial intelligence and computer equipment
CN112819024B (en) Model processing method, user data processing method and device and computer equipment
CN110555469A (en) Method and device for processing interactive sequence data
CN111737546B (en) Method and device for determining entity service attribute
Shetty et al. Supervised machine learning: algorithms and applications
CN112785005B (en) Multi-objective task assistant decision-making method and device, computer equipment and medium
CN110737730B (en) User classification method, device, equipment and storage medium based on unsupervised learning
CN112182362A (en) Method and device for training model for online click rate prediction and recommendation system
Sina Mirabdolbaghi et al. Model optimization analysis of customer churn prediction using machine learning algorithms with focus on feature reductions
CN114357151A (en) Processing method, device and equipment of text category identification model and storage medium
CN112131345A (en) Text quality identification method, device, equipment and storage medium
CN113656699B (en) User feature vector determining method, related equipment and medium
CN111143533A (en) Customer service method and system based on user behavior data
CN115631008B (en) Commodity recommendation method, device, equipment and medium
KR102474974B1 (en) Method, device and system for analyzing brand based on artificial intelligence
CN115099988A (en) Model training method, data processing method, device and computer medium
CN116205700A (en) Recommendation method and device for target product, computer equipment and storage medium
CN113094584A (en) Method and device for determining recommended learning resources
Alshmrany LFD-CNN: Levy flight distribution based convolutional neural network for an adaptive learning style prediction in E-learning environment
WO2022262561A1 (en) Multimedia resource processing method and apparatus, and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40043908

Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant