CN112102011A - User grade prediction method, device, terminal and medium based on artificial intelligence - Google Patents

Info

Publication number
CN112102011A
Authority
CN
China
Prior art keywords
user
layer
prediction
full
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011092932.5A
Other languages
Chinese (zh)
Inventor
吴志成
张莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202011092932.5A (CN112102011A)
Priority to PCT/CN2020/131955 (WO2021139432A1)
Publication of CN112102011A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of artificial intelligence and provides a user grade prediction method, device, terminal and medium based on artificial intelligence. The method comprises: calculating the saturation and the correlation of each data index; extracting, according to the saturation and the correlation, a plurality of model-input data indexes from the plurality of data indexes and inputting them into the first-layer input layer of a preset neural network framework; grouping all nodes of the current fully connected layer according to a preset grouping rule, determining a target node in each group, and performing full-connection training on the next fully connected layer using the target nodes of the current fully connected layer, until the last fully connected layer has been trained; iteratively training the preset neural network framework according to the predicted grade label output by its last-layer output layer to obtain a user grade prediction model; and performing grade prediction on a target user using the user grade prediction model. The invention can improve both the efficiency and the accuracy of user grade prediction.

Description

User grade prediction method, device, terminal and medium based on artificial intelligence
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a user grade prediction method, a device, a terminal and a medium based on artificial intelligence.
Background
The grade is an index that an insurance company uses to evaluate an insurance agent; at the beginning of each month it is assessed from the agent's performance in the previous month. If it can be predicted from an agent's performance in the current month whether the agent will rise by one grade in the next month, the enthusiasm of the agent can be increased, the insurance company can be helped to plan its overall sales targets, and the overall performance of the insurance company is improved.
In the prior art, a machine learning model is trained to predict whether a user can be promoted by one grade, for example whether a non-diamond insurance agent can be promoted to a diamond insurance agent. However, the inventors found that a user can have as many as tens of thousands of data indexes; training the machine learning model on all of them makes training slow and user grade prediction inefficient, and useless data indexes degrade the learning precision of the machine learning model, so that the user grade prediction effect is poor.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a user grade prediction method, device, terminal and medium based on artificial intelligence that can improve both the efficiency and the accuracy of user grade prediction.
A first aspect of the present invention provides a method for user level prediction based on artificial intelligence, the method comprising:
obtaining a plurality of data indexes of a plurality of users, obtaining user grade labels of the plurality of users, and calculating the saturation and the correlation of each data index;
extracting a plurality of model-input data indexes from the plurality of data indexes of each user according to the saturation and the correlation;
inputting the plurality of model-input data indexes of the plurality of users and the user grade labels to a first-layer input layer of a preset neural network framework, wherein the preset neural network framework further comprises a plurality of fully connected layers and a last-layer output layer;
acquiring all nodes of the current fully connected layer, grouping them according to a preset grouping rule and determining a target node in each group, and performing full-connection training on the next fully connected layer using the plurality of target nodes of the current fully connected layer, until the full-connection training of the last fully connected layer is completed;
obtaining a predicted grade label output by the last-layer output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the predicted grade label to obtain a user grade prediction model;
and performing grade prediction on a target user by using the user grade prediction model.
According to an optional embodiment of the present invention, the obtaining all nodes of the current layer full connection layer, grouping all nodes of the current layer full connection layer according to a preset grouping rule, and determining a target node in each group includes:
acquiring first node values of all nodes in a current layer full-connection layer;
grouping all the nodes in the current layer full-connection layer by adopting a step-by-step heuristic grouping method, and determining a node corresponding to the maximum first node value in each group in each heuristic process as a target node;
performing full connection training on a next full connection layer of the current layer by using the target node;
acquiring second node values of all nodes in the next full-connection layer after full-connection training;
for each heuristic grouping, calculating a loss entropy between the first node value and the second node value;
and determining the heuristic grouping method corresponding to the minimum loss entropy as a target heuristic grouping method, and grouping all the nodes in the current layer full-connection layer by using the target heuristic grouping method.
According to an alternative embodiment of the present invention, said performing the level prediction on the target user by using the user level prediction model comprises:
calculating the recall rate of the user level prediction model and determining a target probability threshold according to the recall rate;
obtaining a prediction index of the target user and inputting the prediction index into the user grade prediction model for prediction to obtain a prediction probability;
comparing the predicted probability to the target probability threshold;
when the prediction probability is greater than the target probability threshold, determining that the target user is in a first grade;
and when the prediction probability is smaller than or equal to the target probability threshold, determining that the target user is in a second level.
According to an alternative embodiment of the present invention, said calculating a recall rate of said user level prediction model and determining a target probability threshold based on said recall rate comprises:
defining a plurality of candidate probability threshold values by adopting a difference method;
aiming at each candidate probability threshold, calculating a recall rate according to a prediction grade label output by the user grade prediction model and a corresponding user grade label;
and determining the candidate probability threshold corresponding to the maximum recall rate as a target probability threshold.
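The threshold selection in this embodiment can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not spell out how the candidate thresholds are generated, so the list of candidates is simply passed in, and the function names `recall_at_threshold` and `pick_threshold` are hypothetical. Because recall can only stay equal or fall as the threshold rises, several candidates may tie for the maximum recall; breaking ties in favour of the largest threshold is an assumption of this sketch.

```python
def recall_at_threshold(probs, labels, threshold):
    """Recall of the first-grade (label 1) class when p > threshold predicts 1."""
    true_pos = sum(1 for p, y in zip(probs, labels) if y == 1 and p > threshold)
    actual_pos = sum(1 for y in labels if y == 1)
    return true_pos / actual_pos if actual_pos else 0.0


def pick_threshold(probs, labels, candidates):
    """Return the candidate with the highest recall.

    Assumption: ties are broken in favour of the largest threshold.
    """
    return max(candidates, key=lambda t: (recall_at_threshold(probs, labels, t), t))


# Toy data: predicted probabilities and true grade labels (1 = first grade).
probs = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 1, 1, 0, 0]
candidates = [0.1, 0.3, 0.5, 0.7]
target_threshold = pick_threshold(probs, labels, candidates)
```

With this toy data the candidates 0.1 and 0.3 both reach full recall, and the tie-break selects 0.3.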
According to an alternative embodiment of the present invention, the calculating the saturation of each data index comprises:
traversing a plurality of characteristic values of each data index;
calculating a first number of characteristic values matched with a preset characteristic value in the plurality of characteristic values, and calculating the missing rate of the data index according to the first number;
calculating a second number of the plurality of feature values having the same feature value, and calculating a repetition rate of the data index according to the second number;
and calculating the saturation of the data index according to the missing rate and the repetition rate.
According to an alternative embodiment of the present invention, the calculating the correlation of each data index comprises:
generating a characteristic value vector from the plurality of characteristic values of each data index;
generating a grade label vector from the user grade labels of the plurality of historical users;
calculating the Pearson coefficient between the characteristic value vector and the grade label vector;
determining the Pearson coefficient as the correlation of the data index.
According to an optional embodiment of the present invention, the extracting a plurality of model-input data indexes from the plurality of data indexes of each user according to the saturation and the correlation comprises:
acquiring, from the plurality of data indexes, a plurality of first data indexes whose saturation is greater than a preset saturation threshold;
acquiring, from the plurality of first data indexes, a plurality of second data indexes whose correlation is smaller than a preset correlation threshold;
deriving a plurality of high-order data indexes from the plurality of second data indexes;
and taking the plurality of second data indexes and the plurality of high-order data indexes as the model-input data indexes.
A second aspect of the present invention provides an artificial intelligence based user level prediction apparatus, the apparatus comprising:
the calculation module is used for obtaining a plurality of data indexes of a plurality of users, obtaining user grade labels of the plurality of users, and calculating the saturation and the correlation of each data index;
the extraction module is used for extracting a plurality of model-input data indexes from the plurality of data indexes of each user according to the saturation and the correlation;
the input module is used for inputting the plurality of model-input data indexes of the plurality of users and the user grade labels to a first-layer input layer of a preset neural network framework, wherein the preset neural network framework further comprises a plurality of fully connected layers and a last-layer output layer;
the grouping module is used for acquiring all nodes of the current fully connected layer, grouping them according to a preset grouping rule and determining a target node in each group, and performing full-connection training on the next fully connected layer using the plurality of target nodes of the current fully connected layer, until the full-connection training of the last fully connected layer is completed;
the training module is used for obtaining a predicted grade label output by the last-layer output layer of the preset neural network framework and iteratively training the preset neural network framework according to the predicted grade label to obtain a user grade prediction model;
and the prediction module is used for performing grade prediction on a target user by using the user grade prediction model.
A third aspect of the invention provides a terminal comprising a processor for implementing the artificial intelligence based user level prediction method when executing a computer program stored in a memory.
A fourth aspect of the present invention provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the artificial intelligence based user grade prediction method.
In summary, according to the user grade prediction method, device, terminal and medium based on artificial intelligence, after a plurality of data indexes and the user grade labels of a plurality of users are obtained, the saturation and the correlation of each data index are calculated, and a plurality of model-input data indexes are extracted from the plurality of data indexes according to the saturation and the correlation. Because the data volume of the model-input data indexes is much smaller than that of the full set of data indexes, training the user grade prediction model on the model-input data indexes is more efficient. The model-input data indexes and the corresponding user grade labels are input into the first-layer input layer of a preset neural network framework; layer by layer, all nodes of each fully connected layer are obtained and grouped according to a preset grouping rule, a target node is determined in each group, and the target nodes of each fully connected layer are used for full-connection training of the next fully connected layer. Grouping reduces the number of nodes in each fully connected layer that take part in propagation, which reduces the computation of the neural network and further improves the efficiency of training the user grade prediction model. Finally, the predicted grade label output by the last-layer output layer of the preset neural network framework is obtained, and the framework is iteratively trained according to the predicted grade label to obtain the user grade prediction model; using this model to perform grade prediction on the target user can improve the accuracy of user grade prediction.
Drawings
Fig. 1 is a flowchart of a method for predicting a user level based on artificial intelligence according to an embodiment of the present invention.
Fig. 2 is a block diagram of an artificial intelligence based user level prediction apparatus according to a second embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a terminal according to a third embodiment of the present invention.
Detailed Description
In order that the above objects, features and advantages of the present invention can be more clearly understood, a detailed description of the present invention will be given below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the present invention and features of the embodiments may be combined with each other without conflict.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
The user grade prediction method based on the artificial intelligence is executed by the terminal, and accordingly, the user grade prediction device based on the artificial intelligence operates in the terminal.
Fig. 1 is a flowchart of a method for predicting a user level based on artificial intelligence according to an embodiment of the present invention. The user grade prediction method based on artificial intelligence specifically comprises the following steps, and the sequence of the steps in the flow chart can be changed and some steps can be omitted according to different requirements.
S11, obtaining a plurality of data indexes of a plurality of users and obtaining user level labels of the users, and calculating the saturation of each data index and the correlation of each data index.
The plurality of data indexes may include, but are not limited to: age, gender, income performance, event-tracking (buried-point) behavior, and the like. Each user corresponds to a plurality of data indexes and a user grade label.
The user may be an insurance agent or a company salesperson, and the user grade label may be a first grade or a second grade, where the first grade is higher than the second grade. Illustratively, the first grade is the diamond grade and the second grade is the non-diamond grade.
Because each user can have thousands or even tens of thousands of data indexes, training the user grade prediction model on all of them takes a long time. Therefore, the data indexes suitable for entering the model are screened out by calculating the saturation and the correlation of each data index, which shortens the training time of the user grade prediction model, improves its training efficiency, and thereby improves the efficiency of user grade prediction.
In an alternative embodiment, the calculating the saturation of each data index includes:
traversing a plurality of characteristic values of each data index;
calculating a first number of characteristic values matched with a preset characteristic value in the plurality of characteristic values, and calculating the missing rate of the data index according to the first number;
calculating a second number of the plurality of feature values having the same feature value, and calculating a repetition rate of the data index according to the second number;
and calculating the saturation of the data index according to the missing rate and the repetition rate.
The characteristic values of the same data index may differ between users or be the same. For example, the characteristic value of the gender data index is female for some users and male for others. As another example, the characteristic values of the age data index may be distributed between 18 and 60 years.
In general, a data index should have as many characteristic values as there are users. In practice, however, data may be missed or omitted when a user's data is collected, so that the characteristic values of some data indexes are null for some users. The missing rate of a data index is the ratio of the first number (the number of null characteristic values of that data index) to the number of users. The repetition rate of a data index is the ratio of the second number (the number of identical characteristic values of that data index) to the number of users.
After the missing rate and the repetition rate of the data index are calculated, a first product of the missing rate and a preset first weight and a second product of the repetition rate and a preset second weight are calculated; the sum of the first product and the second product is the saturation of the data index. The preset first weight and the preset second weight sum to 1, and the preset first weight is smaller than the preset second weight.
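The saturation calculation described above can be sketched as follows. This is an illustrative reading of the text, not a definitive implementation. The function name `saturation`, the use of `None` as the preset (null) characteristic value, and the weights 0.4 and 0.6 are assumptions chosen only to satisfy the stated constraints (the weights sum to 1 and the first is smaller than the second). The text does not say how the "second number" is counted; here it is assumed to be the count of the most frequent non-null value.

```python
from collections import Counter


def saturation(values, preset_value=None, w_missing=0.4, w_repeat=0.6):
    """Weighted sum of missing rate and repetition rate, as the text describes.

    Assumptions: `preset_value` marks a null entry; the repetition count is
    the frequency of the most common non-null value; weights are examples.
    """
    n_users = len(values)
    if n_users == 0:
        raise ValueError("a data index needs at least one characteristic value")

    # First number: characteristic values matching the preset (null) value.
    first_number = sum(1 for v in values if v == preset_value)
    missing_rate = first_number / n_users

    # Second number: how often the most frequent non-null value occurs.
    counts = Counter(v for v in values if v != preset_value)
    second_number = counts.most_common(1)[0][1] if counts else 0
    repetition_rate = second_number / n_users

    return w_missing * missing_rate + w_repeat * repetition_rate


# Gender data index for four users, one of them not collected.
gender_saturation = saturation(["F", "F", "M", None])
```

For this toy index the missing rate is 0.25 and the repetition rate is 0.5, giving 0.4 × 0.25 + 0.6 × 0.5 = 0.4.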
In an optional embodiment, the calculating the correlation of each data index includes:
generating a characteristic value vector from the plurality of characteristic values of each data index;
generating a grade label vector from the user grade labels of the plurality of historical users;
calculating the Pearson coefficient between the characteristic value vector and the grade label vector;
determining the Pearson coefficient as the correlation of the data index.
The Pearson coefficient between the characteristic value vector of a data index and the grade label vector indicates whether the data index is associated with the first grade, which makes it convenient to select the data indexes suitable for entering the model according to the Pearson coefficient.
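As an illustration, the Pearson coefficient between a characteristic value vector and a grade label vector can be computed as below. This is a plain-Python sketch; the function name `pearson` and the encoding of the first grade as 1 and the second grade as 0 are assumptions, as is the choice to return 0.0 when either vector is constant (the coefficient is undefined in that case).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric vectors."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Characteristic values of one data index and the matching grade labels
# (1 = first grade, 0 = second grade; this encoding is an assumption).
feature_vector = [1, 2, 3, 4]
label_vector = [0, 0, 1, 1]
correlation = pearson(feature_vector, label_vector)
```

A data index whose characteristic values are all identical, such as an age index that is 20 for every user, yields 0.0 under this convention.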
S12, extracting a plurality of model-input data indexes from the plurality of data indexes of each user according to the saturation and the correlation.
In an optional embodiment, the extracting a plurality of model-input data indexes from the plurality of data indexes of each user according to the saturation and the correlation comprises:
acquiring, from the plurality of data indexes, a plurality of first data indexes whose saturation is greater than a preset saturation threshold;
acquiring, from the plurality of first data indexes, a plurality of second data indexes whose correlation is smaller than a preset correlation threshold;
deriving a plurality of high-order data indexes from the plurality of second data indexes;
and taking the plurality of second data indexes and the plurality of high-order data indexes as the model-input data indexes.
For a data index with too high a missing rate, the neural network cannot learn the characteristics of that data index. Therefore, the data indexes whose saturation is smaller than or equal to the preset saturation threshold are removed and the data indexes whose saturation is greater than the preset saturation threshold are retained. This reduces the number of data indexes entering the user grade prediction model, removes useless data indexes, reduces noise data, and improves the learning effect of the user grade prediction model.
If all the characteristic values of the age data index are 20, or all the characteristic values of the gender data index are female (or all are male), then the age data index or the gender data index has no learning value for the user grade prediction model. Therefore, the data indexes whose correlation is greater than or equal to the preset correlation threshold are removed and the data indexes whose correlation is smaller than the preset correlation threshold are retained, which further reduces the number of data indexes entering the user grade prediction model, further reduces noise data, and improves the learning effect of the model.
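The two-stage screening can be sketched as a pair of filters. The comparison directions below follow the embodiment as written (saturation greater than its threshold, correlation smaller than its threshold). The function name `select_second_indexes`, the dictionary layout and the threshold values are hypothetical. The derivation of high-order data indexes is omitted because the text does not specify how they are derived.

```python
def select_second_indexes(index_stats, saturation_threshold, correlation_threshold):
    """Return the names of the data indexes that pass both screening stages.

    index_stats maps an index name to a (saturation, correlation) pair.
    """
    # Stage 1: first data indexes, saturation above the preset threshold.
    first_indexes = {
        name: stats
        for name, stats in index_stats.items()
        if stats[0] > saturation_threshold
    }
    # Stage 2: second data indexes, correlation below the preset threshold.
    return [
        name
        for name, (_, correlation) in first_indexes.items()
        if correlation < correlation_threshold
    ]


# Hypothetical per-index statistics: (saturation, correlation).
index_stats = {
    "age": (0.9, 0.2),
    "gender": (0.3, 0.1),
    "income": (0.8, 0.7),
}
second_indexes = select_second_indexes(index_stats, 0.5, 0.5)
```

Here only "age" survives: "gender" fails the saturation stage and "income" fails the correlation stage.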
S13, inputting the plurality of model-input data indexes of the plurality of users and the user grade labels to the first-layer input layer of a preset neural network framework.
A convolutional neural network may be used as the preset neural network framework. The convolutional neural network comprises a first-layer input layer, a plurality of fully connected layers and a last-layer output layer. The input layer is connected to the first fully connected layer, the first fully connected layer to the second, the second to the third, and so on; the last fully connected layer is connected to the output layer. The number of fully connected layers can be set according to actual conditions.
The characteristic values of each user's model-input data indexes and the corresponding user grade label are taken as a data pair, a data set is constructed from the data pairs of the plurality of users, and the data set is divided into a training data set and a test data set according to the input time of the users. The training data set is input to the first-layer input layer of the preset neural network framework for learning and training.
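A time-based split of the data pairs might look like the sketch below. The text does not define the "input time" or say which side of the split is used for training; this example assumes each data pair carries a date and that pairs before a cutoff date form the training set. The function name `split_by_input_time`, the tuple layout and the sample values are all hypothetical.

```python
from datetime import date


def split_by_input_time(records, cutoff):
    """Split (input_time, features, label) records at a cutoff date.

    Assumption: earlier records train the model, later records test it.
    """
    train = [(features, label) for t, features, label in records if t < cutoff]
    test = [(features, label) for t, features, label in records if t >= cutoff]
    return train, test


# Hypothetical data pairs: characteristic values plus a grade label.
records = [
    (date(2020, 1, 5), [35, 0.4], 0),
    (date(2020, 3, 9), [41, 0.9], 1),
    (date(2020, 6, 2), [28, 0.7], 1),
]
train_set, test_set = split_by_input_time(records, cutoff=date(2020, 5, 1))
```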
S14, acquiring all nodes of the current fully connected layer, grouping them according to a preset grouping rule and determining a target node in each group, and performing full-connection training on the next fully connected layer using the plurality of target nodes of the current fully connected layer, until the full-connection training of the last fully connected layer is completed.
The number of nodes in each group can be chosen according to the computation speed the user grade prediction model is required to reach: the faster the required speed, the more nodes per group; the slower the required speed, the fewer nodes per group. Each target node is combined with its preset weight and the result is propagated to the next fully connected layer; non-target nodes are not combined with the preset weights and are not propagated to the next fully connected layer.
However, excessive grouping would reduce the prediction accuracy of the grade prediction model. To avoid this, the number of nodes per group is determined by stepwise heuristic trials, which improves the training efficiency of the user grade prediction model and improves the prediction accuracy of the grade prediction model.
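The grouping and propagation described above can be illustrated with a small sketch. It assumes that nodes are grouped by position into consecutive blocks of a fixed size and that the target node of a group is the one with the largest value, as in the optional embodiment of this step. The function names and the weight layout (`weights[j][i]` is the weight from node i of the current layer to node j of the next layer) are assumptions; only a single forward pass is shown, not training.

```python
def select_target_nodes(node_values, group_size):
    """Index of the largest-valued node in each consecutive group."""
    targets = []
    for start in range(0, len(node_values), group_size):
        group = node_values[start:start + group_size]
        best_offset = max(range(len(group)), key=group.__getitem__)
        targets.append(start + best_offset)
    return targets


def propagate_targets(node_values, targets, weights):
    """Forward pass that uses only target nodes; other nodes are skipped."""
    return [
        sum(row[i] * node_values[i] for i in targets)
        for row in weights
    ]


current_layer = [0.1, 0.9, 0.4, 0.3, 0.8]
targets = select_target_nodes(current_layer, group_size=2)  # indexes 1, 2, 4
weights = [
    [1, 1, 1, 1, 1],
    [0, 2, 0, 0, 0],
]
next_layer = propagate_targets(current_layer, targets, weights)
```

With a group size of 2, three of the five nodes take part in propagation, so each next-layer node needs three multiplications instead of five.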
In an optional embodiment, the acquiring all nodes of the current fully connected layer, grouping them according to a preset grouping rule, and determining a target node in each group includes:
acquiring first node values of all nodes in the current fully connected layer;
grouping all the nodes in the current fully connected layer by a stepwise heuristic grouping method, and, in each heuristic trial, determining the node with the maximum first node value in each group as a target node;
performing full-connection training on the next fully connected layer using the target nodes;
acquiring second node values of all nodes in the next fully connected layer after the full-connection training;
for each heuristic grouping, calculating a loss entropy between the first node values and the second node values;
and determining the heuristic grouping method with the minimum loss entropy as the target heuristic grouping method, and grouping all the nodes in the current fully connected layer by the target heuristic grouping method.
The heuristic grouping can proceed step by step according to the positions of the nodes in each fully connected layer. The first fully connected layer is grouped by the stepwise heuristic grouping method, then the second fully connected layer, and so on, up to the last fully connected layer.
The process of grouping each fully connected layer by the stepwise heuristic grouping method comprises:
the first heuristic grouping places every two adjacent nodes in the same group;
the second heuristic grouping places every three adjacent nodes in the same group;
……;
the N-th heuristic grouping places every N+1 adjacent nodes in the same group.
After each heuristic grouping, the distance between the first node values of the current fully connected layer and the second node values of the next fully connected layer is calculated to obtain the loss entropy. The larger the loss entropy, the more feature information is lost when the grouped nodes are propagated to the next fully connected layer; the smaller the loss entropy, the less is lost.
The heuristic grouping method with the minimum loss entropy is selected as the target heuristic grouping method and used to group all nodes in the current fully connected layer, so that as little feature information as possible is lost when the grouped nodes are propagated to the next fully connected layer. Because far fewer target nodes take part in propagation than there were nodes before grouping, the computation of each fully connected layer is reduced, which improves the training speed and efficiency of the whole user grade prediction model.
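The stepwise search over group sizes can be sketched as a loop that tries 2, 3, ..., N+1 nodes per group and keeps the size with the smallest loss. Several simplifications are assumed here and are not taken from the source: the "full-connection training" of the next layer is replaced by a single forward pass with fixed weights, the distance used as loss entropy is the Euclidean distance, and the two layers have the same width so that the distance is defined. All function names are hypothetical.

```python
import math


def group_targets(values, group_size):
    """Index of the largest-valued node in each consecutive group."""
    targets = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        targets.append(start + max(range(len(group)), key=group.__getitem__))
    return targets


def euclidean(a, b):
    """Stand-in for the loss entropy (assumption: equal-length vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def choose_group_size(first_values, weights, max_group_size, loss_fn=euclidean):
    """Try group sizes 2..max_group_size and return (best_size, best_loss)."""
    best = None
    for size in range(2, max_group_size + 1):
        targets = group_targets(first_values, size)
        # Second node values: forward pass using only the target nodes.
        second_values = [
            sum(row[i] * first_values[i] for i in targets) for row in weights
        ]
        loss = loss_fn(first_values, second_values)
        if best is None or loss < best[1]:
            best = (size, loss)
    return best


first_values = [0.1, 0.9, 0.4, 0.3]
identity = [[1 if i == j else 0 for i in range(4)] for j in range(4)]
best_size, best_loss = choose_group_size(first_values, identity, max_group_size=4)
```

In this toy case the smallest group size loses the least information, so a size of 2 is selected; with real trained weights the outcome depends on the data.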
S15, obtaining a prediction grade label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction grade label to obtain a user grade prediction model.
A test passing rate is calculated from the prediction grade labels and the user grade labels. When the test passing rate is smaller than a preset passing rate threshold, the data set is re-divided into a new training data set and a new test data set, the user grade prediction model is retrained with the new training data set, and its test passing rate is retested with the new test data set. This is repeated until the test passing rate is larger than the preset passing rate threshold, after which the user is informed that the training of the user grade prediction model has finished.
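The retrain-until-pass loop can be sketched as follows. It is only an illustration under stated assumptions: `train` and `evaluate` are hypothetical callables supplied by the caller, the re-division is done by random shuffling (the source does not say how the data set is re-divided), and `max_rounds` is an added safeguard that the source does not mention.

```python
import random


def pass_rate(predicted, actual):
    """Share of prediction grade labels that match the user grade labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def train_until_pass(dataset, train, evaluate, pass_threshold,
                     max_rounds=10, test_fraction=0.2, seed=0):
    """Re-divide, retrain and retest until the pass rate exceeds the threshold."""
    rng = random.Random(seed)
    data = list(dataset)
    for round_no in range(1, max_rounds + 1):
        rng.shuffle(data)                          # assumed re-division rule
        cut = int(len(data) * (1 - test_fraction))
        train_set, test_set = data[:cut], data[cut:]
        model = train(train_set)                   # hypothetical trainer
        rate = evaluate(model, test_set)           # hypothetical tester
        if rate > pass_threshold:
            return model, rate, round_no
    raise RuntimeError("pass rate never exceeded the threshold")


# Toy stand-ins: the "model" is just the training-set size, and the
# tester returns a scripted sequence of pass rates.
scripted = iter([0.5, 0.7, 0.95])
model, rate, rounds = train_until_pass(
    range(10), train=len, evaluate=lambda m, t: next(scripted),
    pass_threshold=0.9)
print(model, rate, rounds)
```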
S16, performing grade prediction on the target user by using the user grade prediction model.
To keep the input requirements of the user grade prediction model consistent between the training stage and the prediction stage, the target data indexes corresponding to the model-entry data indexes are obtained from the plurality of data indexes of the target user in the current month, and the characteristic values of the target data indexes are input to the user grade prediction model for prediction to obtain a probability value. The probability value represents the likelihood that the target user can be promoted to the first grade in the month following the current month.
In an optional embodiment, the performing level prediction on the target user by using the user level prediction model includes:
calculating the recall rate of the user level prediction model and determining a target probability threshold according to the recall rate;
obtaining a prediction index of the target user and inputting the prediction index into the user grade prediction model for prediction to obtain a prediction probability;
comparing the predicted probability to the target probability threshold;
when the prediction probability is greater than the target probability threshold, determining that the target user is in a first grade;
and when the prediction probability is smaller than or equal to the target probability threshold, determining that the target user is in a second level.
The target probability threshold is calculated according to the recall rate of the user grade prediction model. In the prediction stage, the larger the prediction probability output by the user grade prediction model, the more likely it is that the target user can be promoted to the first grade (for example, promoted to the diamond grade); the smaller the prediction probability, the less likely it is that the target user can be promoted to the first grade (for example, the less likely the user is to reach the diamond grade).
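The comparison step can be written in a few lines. The function name and the string labels below are illustrative assumptions; the strict "greater than" test follows the steps listed above.

```python
def predict_level(probability, target_threshold):
    """Return the first grade only when the prediction probability is
    strictly greater than the target probability threshold."""
    return "first" if probability > target_threshold else "second"


print(predict_level(0.7, 0.5))
print(predict_level(0.5, 0.5))
```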
In an optional embodiment, said calculating a recall rate of said user level prediction model and determining a target probability threshold based on said recall rate comprises:
defining a plurality of candidate probability threshold values by adopting a difference method;
aiming at each candidate probability threshold, calculating a recall rate according to a prediction grade label output by the user grade prediction model and a corresponding user grade label;
and determining the candidate probability threshold corresponding to the maximum recall rate as a target probability threshold.
Recall is the proportion of actually positive samples that are correctly predicted as positive. It is calculated as: Recall = TP / (TP + FN), where TP is the number of positive samples predicted as positive and FN is the number of positive samples predicted as negative. The higher the recall rate, the larger the share of actual positives that the model identifies; the lower the recall rate, the smaller that share.
For example, assuming that the defined candidate probability thresholds are 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, the process of determining the target probability threshold is as follows:
first, for the candidate probability threshold 0.2, each prediction probability value output by the user grade prediction model in the training stage is compared with 0.2: if the prediction probability value is greater than or equal to 0.2, the prediction grade label is the first grade; if it is less than 0.2, the prediction grade label is the second grade; a first recall rate is then calculated from the prediction grade labels of the training stage and the corresponding user grade labels;
then, for the candidate probability threshold 0.3, each prediction probability value output in the training stage is compared with 0.3: if the prediction probability value is greater than or equal to 0.3, the prediction grade label is the first grade; if it is less than 0.3, the prediction grade label is the second grade; a second recall rate is then calculated from the prediction grade labels of the training stage and the corresponding user grade labels;
and so on;
finally, for the candidate probability threshold 0.9, each prediction probability value output in the training stage is compared with 0.9: if the prediction probability value is greater than or equal to 0.9, the prediction grade label is the first grade; if it is less than 0.9, the prediction grade label is the second grade; an eighth recall rate is then calculated from the prediction grade labels of the training stage and the corresponding user grade labels.
The eight recall rates are then compared, and the candidate probability threshold corresponding to the maximum recall rate is selected as the target probability threshold.
The target probability threshold is determined through the maximum recall rate of the user grade prediction model, so that the grade of the user can be predicted more accurately.
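The threshold search can be sketched as follows. The function names and the toy probabilities and labels are assumptions made for illustration; label 1 stands for the first grade and label 0 for the second grade.

```python
def recall(probabilities, labels, threshold):
    """Recall = TP / (TP + FN); a sample is predicted positive when its
    probability is greater than or equal to the candidate threshold."""
    tp = sum(1 for p, y in zip(probabilities, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probabilities, labels) if y == 1 and p < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


def pick_threshold(probabilities, labels, candidates):
    """Return the candidate threshold with the highest recall.
    Ties keep the earliest candidate (an assumption, not from the source)."""
    best, best_recall = None, -1.0
    for threshold in candidates:
        r = recall(probabilities, labels, threshold)
        if r > best_recall:
            best, best_recall = threshold, r
    return best, best_recall


candidates = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
probabilities = [0.95, 0.85, 0.4, 0.3, 0.1]    # toy training-stage outputs
labels = [1, 1, 0, 1, 0]                       # toy user grade labels
print(pick_threshold(probabilities, labels, candidates))
```

Raising the threshold can only turn predicted positives into negatives, so recall never increases as the threshold grows; the tie-breaking rule therefore matters when several candidates reach the same maximum recall.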
It is emphasized that the user-level prediction model may be stored in a node of the blockchain in order to further ensure privacy and security of the user-level prediction model.
The artificial intelligence based user grade prediction method can be applied to smart government affairs to promote the development of smart cities. In the method, after the plurality of data indexes and the user grade labels of the plurality of users are obtained, the saturation and the correlation of each data index are calculated, and a plurality of model-entry data indexes are extracted from the plurality of data indexes according to the saturation and the correlation. Because the data volume of the model-entry data indexes is far smaller than that of the full set of data indexes, training the user grade prediction model on the model-entry data indexes is more efficient. The model-entry data indexes and the corresponding user grade labels are input to the first input layer of the preset neural network framework; all nodes of the current full-connection layer are obtained and grouped according to the preset grouping rule, a target node is determined in each group, and the target nodes of the current full-connection layer are used to perform full-connection training on the next full-connection layer, until the full-connection training of the last full-connection layer is completed. Grouping reduces the number of nodes that participate in and are passed on from each full-connection layer, which reduces the calculation amount of the neural network and further improves the efficiency of training the user grade prediction model. Finally, the prediction grade label output by the last output layer of the preset neural network framework is obtained, and the preset neural network framework is iteratively trained according to the prediction grade label to obtain the user grade prediction model. Using the user grade prediction model to perform grade prediction on the target user can improve the accuracy of user grade prediction.
Fig. 2 is a block diagram of an artificial intelligence based user level prediction apparatus according to a second embodiment of the present invention.
In some embodiments, the artificial intelligence based user grade prediction apparatus 20 may comprise a plurality of functional modules composed of computer program segments. The computer program segments of the artificial intelligence based user grade prediction apparatus 20 may be stored in a memory of the terminal and executed by at least one processor to perform the functions of artificial intelligence based user grade prediction (described in detail with reference to fig. 1).
In this embodiment, the user level prediction apparatus 20 based on artificial intelligence may be divided into a plurality of function modules according to the functions performed by the apparatus. The functional module may include: a calculation module 201, an extraction module 202, an input module 203, a grouping module 204, a training module 205, and a prediction module 206. The module referred to herein is a series of computer program segments capable of being executed by at least one processor and capable of performing a fixed function and is stored in memory. In the present embodiment, the functions of the modules will be described in detail in the following embodiments.
The calculating module 201 is configured to obtain a plurality of data indexes of a plurality of users, obtain user level labels of the plurality of users, calculate saturation of each data index, and calculate correlation of each data index.
The plurality of data indexes may include, but are not limited to: age, gender, income performance, event-tracking (buried-point) behavior, and the like. Each user corresponds to a plurality of data indexes and a user grade label.
Wherein, the user may be an insurance agent or a company salesperson, and the user rating label may include: a first level and a second level. Wherein the first level is higher than the second level. Illustratively, the first grade is a diamond grade and the second grade is a non-diamond grade.
Because the data indexes of each user can be thousands or even tens of thousands, the training time of the user grade prediction model is longer when the user grade prediction model is trained by using the data indexes, and therefore, the data indexes suitable for entering the model are screened out by calculating the saturation of each data index and calculating the correlation of each data index, so that the training time of the user grade prediction model is shortened, the training efficiency of the user grade prediction model is improved, and the efficiency of user grade prediction is improved.
In an alternative embodiment, the calculating module 201 calculating the saturation of each data index includes:
traversing a plurality of characteristic values of each data index;
counting a first number of characteristic values that match a preset characteristic value among the plurality of characteristic values, and calculating the missing rate of the data index according to the first number;
counting a second number of characteristic values that are identical among the plurality of characteristic values, and calculating the repetition rate of the data index according to the second number;
and calculating the saturation of the data index according to the missing rate and the repetition rate.
The characteristic values of the same data index may be the same or different across users. For example, the characteristic value of the gender data index is female for some users and male for others. As another example, the characteristic values of the age data index may be distributed between 18 and 60 years.
Generally, a data index should have as many characteristic values as there are users. In practice, however, data may be missed or omitted when a user's data is collected, so that the characteristic values of some data indexes are null for some users. The missing rate of a data index is the ratio of the first number, that is, the number of null characteristic values in that data index, to the number of users. The repetition rate of a data index is the ratio of the second number, that is, the number of identical characteristic values in that data index, to the number of users.
After the missing rate and the repetition rate of the data index are calculated, calculating a first product between the missing rate and a preset first weight, and calculating a second product between the repetition rate and a preset second weight; and finally, calculating the sum of the first product and the second product to obtain the saturation of the data index. The sum of the preset first weight and the preset second weight is 1, and the preset first weight is smaller than the preset second weight.
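The weighted sum above can be sketched as follows. The weights 0.4 and 0.6 are hypothetical values chosen only to satisfy the stated constraints (they sum to 1 and the first is smaller than the second). Counting the most frequent non-missing value as the "second number" is also an assumption, since the source does not say how identical values are counted.

```python
from collections import Counter


def saturation(values, missing_weight=0.4, repeat_weight=0.6, missing_value=None):
    """Saturation = missing_weight * missing rate + repeat_weight * repetition rate."""
    n = len(values)
    # First number: characteristic values that match the preset (missing) value.
    first_number = sum(1 for v in values if v == missing_value)
    # Second number (assumed): occurrences of the most common non-missing value.
    present = Counter(v for v in values if v != missing_value)
    second_number = present.most_common(1)[0][1] if present else 0
    missing_rate = first_number / n
    repetition_rate = second_number / n
    return missing_weight * missing_rate + repeat_weight * repetition_rate


gender = ["F", "F", "M", None, "F"]            # toy characteristic values
print(round(saturation(gender), 2))
```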
In an optional embodiment, the calculating module 201 calculating the correlation of each data index includes:
generating an eigenvalue vector according to the plurality of eigenvalues of each data index;
generating a grade label vector according to the user grade labels of the plurality of historical users;
calculating a Pearson coefficient between the eigenvalue vector and the level label vector;
determining the Pearson coefficient as a degree of correlation of the data indicator.
The Pearson coefficient between the characteristic value vector of a data index and the grade label vector indicates how strongly the data index is associated with the first grade, which makes it convenient to select data indexes suitable for entering the model according to the Pearson coefficient.
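The Pearson coefficient between the two vectors can be computed as in the sketch below. Encoding the first grade as 1 and the second grade as 0, and returning 0.0 when either vector is constant (where the coefficient is undefined), are assumptions made for this illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0                              # assumed value for a constant vector
    return covariance / (spread_x * spread_y)


ages = [18, 30, 45, 60]                         # toy characteristic value vector
grades = [0, 0, 1, 1]                           # toy grade label vector
print(round(pearson(ages, grades), 3))
```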
The extracting module 202 is configured to extract a plurality of modelled data indicators from the plurality of data indicators of each user according to the saturation and the correlation.
In an optional embodiment, the extracting module 202 extracts a plurality of modelled data indicators from the plurality of data indicators of each user according to the saturation and the correlation, including:
acquiring a plurality of first data indexes corresponding to saturation greater than a preset saturation threshold from the plurality of data indexes;
acquiring a plurality of second data indexes corresponding to the correlation degree smaller than a preset correlation degree threshold value from the plurality of first data indexes;
deriving a plurality of high-order data indicators according to the plurality of second data indicators;
and taking the plurality of second data indexes and the plurality of high-order data indexes as the model-entry data indexes.
For data indexes with too large missing rate, the neural network cannot learn the characteristics of the corresponding data indexes. Therefore, the data indexes corresponding to the saturation degree smaller than or equal to the preset saturation degree threshold value are removed, the data indexes corresponding to the saturation degree larger than the preset saturation degree threshold value are reserved, the number of the data indexes of the user grade prediction model entering the model is reduced, useless data indexes are removed, noise data are reduced, and the learning effect of the user grade prediction model is improved.
If all the characteristic values corresponding to the age data index are 20, or all the characteristic values corresponding to the gender data index are female or all are male, the age data index or the gender data index has no learning significance for the user grade prediction model. Therefore, the data indexes whose correlation is greater than or equal to the preset correlation threshold are removed, and the data indexes whose correlation is smaller than the preset correlation threshold are retained, which further reduces the number of data indexes entering the user grade prediction model, further reduces noise data, and improves the learning effect of the user grade prediction model.
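The two filtering steps can be sketched as follows. The function name, the thresholds, and the toy statistics are assumptions; the comparison directions follow the extraction steps listed above, and the derivation of high-order data indexes is not shown because the source does not specify it.

```python
def select_model_entry_indexes(stats, saturation_threshold, correlation_threshold):
    """Keep data indexes whose saturation is greater than the saturation
    threshold and whose correlation is smaller than the correlation threshold.
    `stats` maps an index name to a (saturation, correlation) pair."""
    first = {name: pair for name, pair in stats.items()
             if pair[0] > saturation_threshold}          # first data indexes
    second = [name for name, pair in first.items()
              if pair[1] < correlation_threshold]        # second data indexes
    return second


stats = {
    "age": (0.9, 0.1),       # toy (saturation, correlation) values
    "gender": (0.3, 0.05),
    "income": (0.8, 0.7),
}
print(select_model_entry_indexes(stats, 0.5, 0.5))
```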
The input module 203 is configured to input the plurality of model-entry data indexes of the plurality of users and the user grade labels to a first input layer in a preset neural network framework.
A convolutional neural network may be used as the preset neural network framework. The network comprises a first input layer, a plurality of full-connection layers, and a last output layer. The input layer is connected to the first full-connection layer, the first full-connection layer to the second full-connection layer, the second to the third, and so on, and the last full-connection layer is connected to the output layer. The number of full-connection layers can be set according to actual conditions.
The characteristic values of each user's model-entry data indexes and the corresponding user grade label are taken as a data pair, a data set is constructed from the data pairs of the plurality of users, and the data set is divided into a training data set and a test data set according to the input time of the users. The training data set is input to the first input layer of the preset neural network framework for learning and training.
The grouping module 204 is configured to obtain all nodes of a current layer full-link layer, group all nodes of the current layer full-link layer according to a preset grouping rule, determine a target node in each group, and perform full-link training on a next layer full-link layer of the current layer by using a plurality of target nodes of the current layer full-link layer until the full-link training on a last layer full-link layer is completed.
The number of nodes in the same group can be chosen according to the calculation speed required of the user grade prediction model: the faster the required calculation speed, the more nodes are placed in the same group; the slower the required calculation speed, the fewer nodes are placed in the same group. A target node is operated on with a preset weight and the result is passed to the next full-connection layer; a non-target node is neither operated on with the preset weight nor passed to the next full-connection layer.
However, excessive grouping would reduce the prediction accuracy of the grade prediction model. To avoid this, the number of nodes in the same group is determined by step-by-step trial grouping, which improves both the training efficiency of the user grade prediction model and the prediction accuracy of the grade prediction model.
In an optional embodiment, the grouping module 204 obtaining all nodes of the current full-connection layer, grouping all nodes of the current full-connection layer according to a preset grouping rule, and determining a target node in each group includes:
acquiring first node values of all nodes in a current layer full-connection layer;
grouping all the nodes in the current layer full-connection layer by adopting a step-by-step heuristic grouping method, and determining a node corresponding to the maximum first node value in each group in each heuristic process as a target node;
acquiring second node values of all nodes in the next full-connection layer after full-connection training;
for each heuristic grouping, calculating a loss entropy between the first node value and the second node value;
and determining the heuristic grouping method corresponding to the minimum loss entropy as a target heuristic grouping method, and grouping all the nodes in the current layer full-connection layer by using the target heuristic grouping method.
Trial grouping can proceed step by step according to the positions of the nodes in each full-connection layer. The step-by-step heuristic grouping method is applied to the first full-connection layer, then to the second full-connection layer, and so on, up to the last full-connection layer.
The process of grouping each layer of the fully-connected layer by adopting the step-by-step heuristic grouping method comprises the following steps:
in the first trial grouping, every two adjacent nodes are placed in the same group;
in the second trial grouping, every three adjacent nodes are placed in the same group;
……;
in the N-th trial grouping, every N+1 adjacent nodes are placed in the same group.
After each trial grouping, the distance between the first node values of the current full-connection layer and the second node values of the next full-connection layer is calculated to obtain the loss entropy. The larger the loss entropy, the more feature information is lost when the grouped nodes are passed to the next full-connection layer; the smaller the loss entropy, the less feature information is lost.
The trial grouping with the minimum loss entropy is selected as the target heuristic grouping method, and all nodes in the current full-connection layer are grouped with it, so that feature loss is kept to a minimum when the grouped nodes are passed to the next full-connection layer. Because far fewer target nodes than original nodes take part in the transmission, the calculation amount of each full-connection layer is reduced, and the training speed and efficiency of the whole user grade prediction model are improved.
The training module 205 is configured to obtain a prediction level label output by a last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model.
A test passing rate is calculated from the prediction grade labels and the user grade labels. When the test passing rate is smaller than a preset passing rate threshold, the data set is re-divided into a new training data set and a new test data set, the user grade prediction model is retrained with the new training data set, and its test passing rate is retested with the new test data set. This is repeated until the test passing rate is larger than the preset passing rate threshold, after which the user is informed that the training of the user grade prediction model has finished.
The prediction module 206 is configured to perform level prediction on the target user by using the user level prediction model.
To keep the input requirements of the user grade prediction model consistent between the training stage and the prediction stage, the target data indexes corresponding to the model-entry data indexes are obtained from the plurality of data indexes of the target user in the current month, and the characteristic values of the target data indexes are input to the user grade prediction model for prediction to obtain a probability value. The probability value represents the likelihood that the target user can be promoted to the first grade in the month following the current month.
In an alternative embodiment, the prediction module 206 performing grade prediction on the target user by using the user grade prediction model includes:
calculating the recall rate of the user level prediction model and determining a target probability threshold according to the recall rate;
obtaining a prediction index of the target user and inputting the prediction index into the user grade prediction model for prediction to obtain a prediction probability;
comparing the predicted probability to the target probability threshold;
when the prediction probability is greater than the target probability threshold, determining that the target user is in a first grade;
and when the prediction probability is smaller than or equal to the target probability threshold, determining that the target user is in a second level.
The target probability threshold is calculated according to the recall rate of the user grade prediction model. In the prediction stage, the larger the prediction probability output by the user grade prediction model, the more likely it is that the target user can be promoted to the first grade (for example, promoted to the diamond grade); the smaller the prediction probability, the less likely it is that the target user can be promoted to the first grade (for example, the less likely the user is to reach the diamond grade).
In an optional embodiment, said calculating a recall rate of said user level prediction model and determining a target probability threshold based on said recall rate comprises:
defining a plurality of candidate probability threshold values by adopting a difference method;
aiming at each candidate probability threshold, calculating a recall rate according to a prediction grade label output by the user grade prediction model and a corresponding user grade label;
and determining the candidate probability threshold corresponding to the maximum recall rate as a target probability threshold.
Recall is the proportion of actually positive samples that are correctly predicted as positive. It is calculated as: Recall = TP / (TP + FN), where TP is the number of positive samples predicted as positive and FN is the number of positive samples predicted as negative. The higher the recall rate, the larger the share of actual positives that the model identifies; the lower the recall rate, the smaller that share.
For example, assuming that the defined candidate probability thresholds are 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, the process of determining the target probability threshold is as follows:
first, for the candidate probability threshold 0.2, each prediction probability value output by the user grade prediction model in the training stage is compared with 0.2: if the prediction probability value is greater than or equal to 0.2, the prediction grade label is the first grade; if it is less than 0.2, the prediction grade label is the second grade; a first recall rate is then calculated from the prediction grade labels of the training stage and the corresponding user grade labels;
then, for the candidate probability threshold 0.3, each prediction probability value output in the training stage is compared with 0.3: if the prediction probability value is greater than or equal to 0.3, the prediction grade label is the first grade; if it is less than 0.3, the prediction grade label is the second grade; a second recall rate is then calculated from the prediction grade labels of the training stage and the corresponding user grade labels;
and so on;
finally, for the candidate probability threshold 0.9, each prediction probability value output in the training stage is compared with 0.9: if the prediction probability value is greater than or equal to 0.9, the prediction grade label is the first grade; if it is less than 0.9, the prediction grade label is the second grade; an eighth recall rate is then calculated from the prediction grade labels of the training stage and the corresponding user grade labels.
The eight recall rates are then compared, and the candidate probability threshold corresponding to the maximum recall rate is selected as the target probability threshold.
The target probability threshold is determined through the maximum recall rate of the user grade prediction model, so that the grade of the user can be predicted more accurately.
It is emphasized that the user-level prediction model may be stored in a node of the blockchain in order to further ensure privacy and security of the user-level prediction model.
The artificial intelligence based user grade prediction device can be applied to smart government affairs to promote the development of smart cities. In the device, after the plurality of data indexes and the user grade labels of the plurality of users are obtained, the saturation and the correlation of each data index are calculated, and a plurality of model-entry data indexes are extracted from the plurality of data indexes according to the saturation and the correlation. Because the data volume of the model-entry data indexes is far smaller than that of the full set of data indexes, training the user grade prediction model on the model-entry data indexes is more efficient. The model-entry data indexes and the corresponding user grade labels are input to the first input layer of the preset neural network framework; all nodes of the current full-connection layer are obtained and grouped according to the preset grouping rule, a target node is determined in each group, and the target nodes of the current full-connection layer are used to perform full-connection training on the next full-connection layer, until the full-connection training of the last full-connection layer is completed. Grouping reduces the number of nodes that participate in and are passed on from each full-connection layer, which reduces the calculation amount of the neural network and further improves the efficiency of training the user grade prediction model. Finally, the prediction grade label output by the last output layer of the preset neural network framework is obtained, and the preset neural network framework is iteratively trained according to the prediction grade label to obtain the user grade prediction model. Using the user grade prediction model to perform grade prediction on the target user can improve the accuracy of user grade prediction.
Fig. 3 is a schematic structural diagram of a terminal according to a third embodiment of the present invention. In the preferred embodiment of the present invention, the terminal 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
It will be appreciated by those skilled in the art that the configuration of the terminal shown in fig. 3 does not limit the embodiments of the present invention; it may be a bus-type configuration or a star-type configuration, and the terminal 3 may include more or fewer hardware or software components than those shown, or a different arrangement of components.
In some embodiments, the terminal 3 is a terminal capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and the hardware includes but is not limited to a microprocessor, an application specific integrated circuit, a programmable gate array, a digital processor, an embedded device, and the like. The terminal 3 may further include a client device, which includes, but is not limited to, any electronic product capable of performing human-computer interaction with a client through a keyboard, a mouse, a remote controller, a touch panel, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, a digital camera, and the like.
It should be noted that the terminal 3 is only an example; other existing or future electronic products that can be adapted to the present invention should also fall within the scope of protection of the present invention and are incorporated herein by reference.
In some embodiments, the memory 31 has stored therein a computer program that, when executed by the at least one processor 32, implements all or part of the steps of the artificial intelligence based user rating prediction method as described. The memory 31 includes a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), a One-Time Programmable Read-Only Memory (OTPROM), an Electrically-Erasable Programmable Read-Only Memory (EEPROM), a Compact Disc Read-Only Memory (CD-ROM) or other optical disk memory, a magnetic disk memory, a tape memory, or any other computer-readable medium capable of carrying or storing data.
Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the blockchain node, and the like.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
In some embodiments, the at least one processor 32 is a Control Unit (Control Unit) of the terminal 3, connects various components of the entire terminal 3 by using various interfaces and lines, and executes various functions and processes data of the terminal 3 by running or executing programs or modules stored in the memory 31 and calling data stored in the memory 31. For example, the at least one processor 32, when executing the computer program stored in the memory, implements all or a portion of the steps of the artificial intelligence based user rating prediction method described in embodiments of the present invention; or implement all or part of the functionality of an artificial intelligence based user level prediction apparatus. The at least one processor 32 may be composed of an integrated circuit, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital Processing chips, graphics processors, and combinations of various control chips.
In some embodiments, the at least one communication bus 33 is arranged to enable connection communication between the memory 31 and the at least one processor 32 or the like.
Although not shown, the terminal 3 may further include a power supply (such as a battery) for supplying power to various components, and preferably, the power supply may be logically connected to the at least one processor 32 through a power management device, so as to implement functions of managing charging, discharging, and power consumption through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The terminal 3 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
The integrated unit implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions to enable a terminal (which may be a personal computer, a terminal, or a network device) or a processor (processor) to execute parts of the methods according to the embodiments of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative: the division of the modules is only one logical functional division, and other divisions may be adopted in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, it is obvious that the word "comprising" does not exclude other elements or that the singular does not exclude the plural. A plurality of units or means recited in the apparatus claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method for user rating prediction based on artificial intelligence, the method comprising:
obtaining a plurality of data indexes of a plurality of users, obtaining user grade labels of the plurality of users, calculating the saturation of each data index and calculating the correlation of each data index;
extracting a plurality of model-entry data indexes from the plurality of data indexes of each user according to the saturation and the correlation;
inputting the plurality of model-entry data indexes of the plurality of users and the user grade labels to a first input layer in a preset neural network framework, wherein the preset neural network framework further comprises a plurality of full-connection layers and a last output layer;
acquiring all nodes of a current full-connection layer, grouping all nodes of the current full-connection layer according to a preset grouping rule and determining a target node in each group, and performing full-connection training on the next full-connection layer after the current layer by using the plurality of target nodes of the current full-connection layer, until the full-connection training of the last full-connection layer is completed;
obtaining a prediction grade label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction grade label to obtain a user grade prediction model;
and performing grade prediction on the target user by using the user grade prediction model.
2. The artificial intelligence based user rating prediction method of claim 1, wherein the acquiring all nodes of a current full-connection layer, grouping all nodes of the current full-connection layer according to a preset grouping rule and determining a target node in each group comprises:
acquiring first node values of all nodes in a current layer full-connection layer;
grouping all nodes in the current layer full-connection layer by adopting a step-by-step heuristic grouping method, and determining a node corresponding to the maximum first node value in each group in each heuristic process as a target node;
performing full connection training on a next full connection layer of the current layer by using the target node;
acquiring second node values of all nodes in the next full-connection layer after full-connection training;
for each heuristic grouping, calculating a loss entropy between the first node value and the second node value;
and determining the heuristic grouping method corresponding to the minimum loss entropy as a target heuristic grouping method, and grouping all nodes in the current layer full-connection layer by using the target heuristic grouping method.
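The grouping and target-node selection recited in claim 2 can be illustrated with a minimal sketch. The sketch assumes consecutive, fixed-size groups and a plain weighted sum for the next layer; the function names `select_target_nodes` and `forward_from_targets`, the group size, and the weights are hypothetical and are not taken from the specification. The search over several heuristic groupings and the loss entropy comparison are not shown, because the claim does not give their formulas.

```python
from typing import List, Sequence


def select_target_nodes(node_values: Sequence[float], group_size: int) -> List[int]:
    """Split nodes into consecutive groups (an assumed grouping rule) and
    return the index of the largest-valued node in each group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    targets = []
    for start in range(0, len(node_values), group_size):
        group = range(start, min(start + group_size, len(node_values)))
        # The target node of a group is the node with the maximum node value.
        targets.append(max(group, key=lambda i: node_values[i]))
    return targets


def forward_from_targets(
    node_values: Sequence[float],
    targets: Sequence[int],
    weights: Sequence[Sequence[float]],
    biases: Sequence[float],
) -> List[float]:
    """Compute next-layer node values from the target nodes only.
    weights[j][k] links the k-th target node to next-layer node j."""
    selected = [node_values[i] for i in targets]
    return [
        sum(w * x for w, x in zip(row, selected)) + b
        for row, b in zip(weights, biases)
    ]


# Six nodes in the current layer, grouped in pairs: only three nodes propagate.
first_values = [0.2, 0.9, 0.1, 0.4, 0.7, 0.3]
targets = select_target_nodes(first_values, group_size=2)
second_values = forward_from_targets(
    first_values,
    targets,
    weights=[[1.0, 0.0, 1.0], [0.5, 0.5, 0.5]],  # illustrative weights
    biases=[0.0, 0.1],
)
```

With a group size of 2, the next layer receives three inputs instead of six, which is the reduction in participating nodes that the description attributes to grouping.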
3. The artificial intelligence based user rating prediction method of claim 1, wherein the performing grade prediction on the target user by using the user grade prediction model comprises:
calculating the recall rate of the user grade prediction model and determining a target probability threshold according to the recall rate;
obtaining a prediction index of the target user and inputting the prediction index into the user grade prediction model for prediction to obtain a prediction probability;
comparing the predicted probability to the target probability threshold;
when the prediction probability is greater than the target probability threshold, determining that the target user is in a first grade;
and when the prediction probability is smaller than or equal to the target probability threshold, determining that the target user is in a second grade.
4. The artificial intelligence based user rating prediction method of claim 1, wherein the calculating a recall rate of the user grade prediction model and determining a target probability threshold based on the recall rate comprises:
defining a plurality of candidate probability threshold values by adopting a difference method;
aiming at each candidate probability threshold, calculating a recall rate according to a prediction grade label output by the user grade prediction model and a corresponding user grade label;
and determining the candidate probability threshold corresponding to the maximum recall rate as a target probability threshold.
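The threshold selection of claim 4 and the comparison of claim 3 can be sketched as follows. The sketch assumes that the candidate thresholds are evenly spaced, that label 1 denotes the first grade, and that ties in recall are resolved toward the higher threshold; none of these choices is stated in the claims, and all function names are hypothetical.

```python
from typing import List, Sequence


def candidate_thresholds(low: float, high: float, count: int) -> List[float]:
    """Evenly spaced candidates in [low, high] (the spacing is an assumption)."""
    if count < 2:
        raise ValueError("count must be at least 2")
    step = (high - low) / (count - 1)
    return [low + i * step for i in range(count)]


def recall_at(threshold: float, probabilities: Sequence[float],
              labels: Sequence[int]) -> float:
    """Recall of the first grade (label 1) when p > threshold predicts it."""
    positives = sum(1 for y in labels if y == 1)
    true_pos = sum(
        1 for p, y in zip(probabilities, labels) if y == 1 and p > threshold
    )
    return true_pos / positives if positives else 0.0


def pick_threshold(probabilities: Sequence[float], labels: Sequence[int],
                   candidates: Sequence[float]) -> float:
    """Candidate with the maximum recall; ties go to the higher threshold."""
    return max(candidates, key=lambda t: (recall_at(t, probabilities, labels), t))


def predict_grade(probability: float, threshold: float) -> str:
    """First grade if the probability exceeds the threshold, else second grade."""
    return "first" if probability > threshold else "second"


# Illustrative validation data: model probabilities and true grade labels.
probs = [0.9, 0.8, 0.35, 0.6, 0.2]
labels = [1, 1, 0, 1, 0]
cands = candidate_thresholds(0.1, 0.9, 5)
target = pick_threshold(probs, labels, cands)
grade = predict_grade(0.75, target)
```

Because lowering the threshold can only keep or increase the number of first-grade users that are recovered, several low candidates usually share the maximum recall; the tie-break toward the higher threshold is one possible way to avoid always choosing the lowest candidate.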
5. The artificial intelligence based user rating prediction method of any of claims 1 to 4, wherein the calculating the saturation of each data index comprises:
traversing a plurality of feature values of each data index;
calculating a first number of feature values, among the plurality of feature values, that match a preset feature value, and calculating the missing rate of the data index according to the first number;
calculating a second number of feature values, among the plurality of feature values, that share the same feature value, and calculating the repetition rate of the data index according to the second number;
and calculating the saturation of the data index according to the missing rate and the repetition rate.
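A minimal sketch of the saturation calculation of claim 5 is given below. The claim does not state how the two rates are combined, so the product `(1 - missing rate) * (1 - repetition rate)` used here is purely an assumption, as are the set of preset placeholder values `MISSING_MARKERS` and the choice to count the most frequent non-missing value as the repeated one.

```python
from collections import Counter
from typing import Hashable, Sequence

# Hypothetical preset feature values that mark a missing entry.
MISSING_MARKERS = {None, "", "NULL"}


def missing_rate(values: Sequence[Hashable]) -> float:
    """Share of feature values that match a preset 'missing' marker."""
    if not values:
        raise ValueError("values must not be empty")
    first_number = sum(1 for v in values if v in MISSING_MARKERS)
    return first_number / len(values)


def repetition_rate(values: Sequence[Hashable]) -> float:
    """Share of feature values equal to the most frequent non-missing value
    (one possible reading of 'having the same feature value')."""
    if not values:
        raise ValueError("values must not be empty")
    present = [v for v in values if v not in MISSING_MARKERS]
    if not present:
        return 0.0
    second_number = Counter(present).most_common(1)[0][1]
    return second_number / len(values)


def saturation(values: Sequence[Hashable]) -> float:
    """Assumed combination: high when few values are missing or repeated."""
    return (1.0 - missing_rate(values)) * (1.0 - repetition_rate(values))


# One data index observed for eight users.
values = [3, 3, None, 5, 3, "", 7, 3]
index_saturation = saturation(values)
```

Under this assumed formula, an index that is mostly empty or dominated by a single value receives a low saturation and would be dropped by a saturation threshold.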
6. The artificial intelligence based user rating prediction method of claim 5, wherein the calculating the correlation of each data index comprises:
generating a feature value vector according to the plurality of feature values of each data index;
generating a grade label vector according to the user grade labels of the plurality of historical users;
calculating a Pearson coefficient between the feature value vector and the grade label vector;
determining the Pearson coefficient as the correlation of the data index.
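The Pearson coefficient of claim 6 can be computed as in the sketch below, which uses the standard definition (covariance divided by the product of the standard deviations). The function name, the example vectors, and the choice to return 0.0 for a constant vector are assumptions made for illustration.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("vectors must have the same length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # The coefficient is undefined for a constant vector; 0.0 is an
        # assumed convention for this sketch.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Feature value vector of one data index and the grade label vector.
feature_vector = [1, 2, 3, 4, 5]
grade_labels = [0, 0, 1, 1, 1]
correlation = pearson(feature_vector, grade_labels)
```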
7. The artificial intelligence based user rating prediction method of claim 6, wherein the extracting a plurality of model-entry data indexes from the plurality of data indexes of each user according to the saturation and the correlation comprises:
acquiring, from the plurality of data indexes, a plurality of first data indexes whose saturation is greater than a preset saturation threshold;
acquiring, from the plurality of first data indexes, a plurality of second data indexes whose correlation is smaller than a preset correlation threshold;
deriving a plurality of high-order data indexes according to the plurality of second data indexes;
and taking the plurality of second data indexes and the plurality of high-order data indexes as the model-entry data indexes.
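The two-stage screening and the derivation step of claim 7 can be sketched as follows. The comparison directions follow the claim text as written. The claim does not say how high-order indexes are derived, so the pairwise products used here are only an assumed example, and the index names, threshold values, and function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def select_indexes(saturation: Dict[str, float], correlation: Dict[str, float],
                   sat_threshold: float, corr_threshold: float) -> List[str]:
    """Keep indexes whose saturation exceeds its threshold, then those whose
    correlation is below its threshold (directions as recited in the claim)."""
    first = [name for name, s in saturation.items() if s > sat_threshold]
    return [name for name in first if correlation[name] < corr_threshold]


def derive_high_order(row: Dict[str, float], names: List[str]) -> Dict[str, float]:
    """Assumed derivation: one product feature per pair of selected indexes."""
    return {f"{a}*{b}": row[a] * row[b] for a, b in combinations(names, 2)}


# Illustrative per-index statistics and one user's raw index values.
saturation = {"age": 0.9, "income": 0.8, "logins": 0.4}
correlation = {"age": 0.2, "income": 0.3, "logins": 0.1}
second_indexes = select_indexes(saturation, correlation, 0.5, 0.5)

user_row = {"age": 30, "income": 5, "logins": 2}
high_order = derive_high_order(user_row, second_indexes)

# Indexes finally fed to the model: the selected ones plus the derived ones.
model_inputs = {name: user_row[name] for name in second_indexes}
model_inputs.update(high_order)
```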
8. An artificial intelligence based user level prediction apparatus, the apparatus comprising:
the calculation module is used for acquiring a plurality of data indexes of a plurality of users, acquiring user grade labels of the plurality of users, calculating the saturation of each data index and calculating the correlation of each data index;
the extraction module is used for extracting a plurality of model-entry data indexes from the plurality of data indexes of each user according to the saturation and the correlation;
the input module is used for inputting the plurality of model-entry data indexes of the plurality of users and the user grade labels to a first input layer in a preset neural network framework, wherein the preset neural network framework further comprises a plurality of full-connection layers and a last output layer;
the grouping module is used for acquiring all nodes of a current full-connection layer, grouping all nodes of the current full-connection layer according to a preset grouping rule and determining a target node in each group, and performing full-connection training on the next full-connection layer after the current layer by using the plurality of target nodes of the current full-connection layer, until the full-connection training of the last full-connection layer is completed;
the training module is used for acquiring a prediction grade label output by the last output layer of the preset neural network framework and iteratively training the preset neural network framework according to the prediction grade label to obtain a user grade prediction model;
and the prediction module is used for carrying out grade prediction on the target user by using the user grade prediction model.
9. A terminal, characterized in that the terminal comprises a processor for implementing the artificial intelligence based user rating prediction method according to any of claims 1 to 7 when executing a computer program stored in a memory.
10. A computer-readable storage medium, having stored thereon a computer program, which, when being executed by a processor, carries out the artificial intelligence based user rating prediction method according to any of the claims 1 to 7.
CN202011092932.5A 2020-10-13 2020-10-13 User grade prediction method, device, terminal and medium based on artificial intelligence Pending CN112102011A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011092932.5A CN112102011A (en) 2020-10-13 2020-10-13 User grade prediction method, device, terminal and medium based on artificial intelligence
PCT/CN2020/131955 WO2021139432A1 (en) 2020-10-13 2020-11-26 Artificial intelligence-based user rating prediction method and apparatus, terminal, and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011092932.5A CN112102011A (en) 2020-10-13 2020-10-13 User grade prediction method, device, terminal and medium based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN112102011A true CN112102011A (en) 2020-12-18

Family

ID=73783614

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011092932.5A Pending CN112102011A (en) 2020-10-13 2020-10-13 User grade prediction method, device, terminal and medium based on artificial intelligence

Country Status (2)

Country Link
CN (1) CN112102011A (en)
WO (1) WO2021139432A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818028A (en) * 2021-01-12 2021-05-18 平安科技(深圳)有限公司 Data index screening method and device, computer equipment and storage medium

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723524B (en) * 2021-08-31 2024-05-17 深圳平安智慧医健科技有限公司 Data processing method based on prediction model, related equipment and medium
CN117112574B (en) * 2023-10-20 2024-02-23 美云智数科技有限公司 Tree service data construction method, device, computer equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110874758A (en) * 2018-09-03 2020-03-10 北京京东金融科技控股有限公司 Potential customer prediction method, device, system, electronic equipment and storage medium
CN109711860A (en) * 2018-11-12 2019-05-03 平安科技(深圳)有限公司 Prediction technique and device, storage medium, the computer equipment of user behavior
CN110674716A (en) * 2019-09-16 2020-01-10 腾讯云计算(北京)有限责任公司 Image recognition method, device and storage medium
CN110852785B (en) * 2019-10-12 2023-11-21 中国平安人寿保险股份有限公司 User grading method, device and computer readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818028A (en) * 2021-01-12 2021-05-18 平安科技(深圳)有限公司 Data index screening method and device, computer equipment and storage medium
CN112818028B (en) * 2021-01-12 2021-09-17 平安科技(深圳)有限公司 Data index screening method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2021139432A9 (en) 2021-09-23
WO2021139432A1 (en) 2021-07-15

Similar Documents

Publication Publication Date Title
CN112102011A (en) User grade prediction method, device, terminal and medium based on artificial intelligence
CN112445854B (en) Multi-source service data real-time processing method, device, terminal and storage medium
CN111950738A (en) Machine learning model optimization effect evaluation method and device, terminal and storage medium
CN110688478B (en) Answer sorting method, device and storage medium
CN112016905B (en) Information display method and device based on approval process, electronic equipment and medium
CN112862546B (en) User loss prediction method and device, computer equipment and storage medium
CN112948275A (en) Test data generation method, device, equipment and storage medium
CN114997263B (en) Method, device, equipment and storage medium for analyzing training rate based on machine learning
CN114612194A (en) Product recommendation method and device, electronic equipment and storage medium
CN113435582A (en) Text processing method based on sentence vector pre-training model and related equipment
CN112818028B (en) Data index screening method and device, computer equipment and storage medium
CN112598135A (en) Model training processing method and device, computer equipment and medium
CN116108276A (en) Information recommendation method and device based on artificial intelligence and related equipment
CN113657546B (en) Information classification method, device, electronic equipment and readable storage medium
CN114968336A (en) Application gray level publishing method and device, computer equipment and storage medium
CN115271821A (en) Dot distribution processing method, dot distribution processing device, computer equipment and storage medium
CN112668788A (en) User scoring model training method based on deep learning and related equipment
CN114664458A (en) Patient classification device, computer device and storage medium
CN111651652B (en) Emotion tendency identification method, device, equipment and medium based on artificial intelligence
CN113342940A (en) Text matching analysis method and device, electronic equipment and storage medium
CN114240677A (en) Medical data risk identification method and device, electronic equipment and storage medium
CN112381595B (en) User value prediction method based on communication behavior and related equipment
CN113553513B (en) Course recommendation method and device based on artificial intelligence, electronic equipment and medium
CN113486056B (en) Knowledge graph-based learning condition acquisition method and device and related equipment
CN112801144B (en) Resource allocation method, device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201218