WO2021139432A1

WO2021139432A1 - Artificial intelligence-based user rating prediction method and apparatus, terminal, and medium

Info

Publication number: WO2021139432A1
Application number: PCT/CN2020/131955
Authority: WO
Inventors: 吴志成; 张莉
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-10-13
Filing date: 2020-11-26
Publication date: 2021-07-15
Also published as: WO2021139432A9; CN112102011A

Abstract

The present application relates to the technical field of artificial intelligence, and provides an artificial intelligence-based user rating prediction method and apparatus, a terminal, and a medium. The method comprises: calculating a saturation and a correlation of each data indicator; extracting a plurality of modelling data indicators from among the plurality of data indicators according to the saturations and the correlations, and inputting into a first input layer in a preset neural network framework; grouping all nodes of a current fully connected layer according to a preset grouping rule, determining a target node in each group, and using the plurality of target nodes of the current fully connected layer to perform fully connected training on a next fully connected layer, until training of a last fully connected layer is complete; iteratively training the preset neural network framework according to a predicted rating label outputted by a last output layer of the preset neural network framework, to obtain a user rating prediction model; using the user rating prediction model to perform rating prediction on a target user. The present application can increase the efficiency of user rating prediction, and improve the accuracy of user rating prediction.

Description

Artificial intelligence-based user level prediction method, device, terminal and medium

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on October 13, 2020, the application number is 202011092932.5, and the invention title is "artificial intelligence-based user level prediction method, device, terminal, and medium", and its entire content Incorporated in this application by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a method, device, terminal, and medium for user level prediction based on artificial intelligence.

Background technique

Grade is an indicator used by insurance companies to assess insurance agents. It is assessed at the beginning of each month based on the performance of the insurance agent in the previous month. If the insurance agent can predict whether the insurance agent can be raised by one level in the next month based on the performance of the insurance agent in the current month, it will not only increase the enthusiasm of the insurance agent, but also help insurance companies plan their overall sales targets and improve the overall performance of the insurance company.

In the prior art, a machine learning model is trained to predict whether a user can increase a level, for example, predict whether a non-diamond-level insurance agent can be upgraded to a diamond-level insurance agent. However, the inventor found in the process of realizing this application that there are as many as tens of thousands of data indicators for users. Using so many data indicators to train the machine learning model results in longer training time and lower user level prediction efficiency; and some useless Data indicators will also affect the learning accuracy of the machine learning model, resulting in poor user level prediction.

Summary of the invention

In view of the above content, it is necessary to propose an artificial intelligence-based user level prediction method, device, terminal, and medium, which can improve the efficiency of user level prediction and the accuracy of user level prediction.

The first aspect of the present application provides an artificial intelligence-based user level prediction method, the method includes:

Acquiring multiple data indicators of multiple users and acquiring user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;

Extracting multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;

Input the multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework, where the preset neural network framework further includes a multi-layer fully connected layer and The last output layer;

Obtain all nodes in the current fully connected layer, group all nodes in the current fully connected layer according to preset grouping rules, and determine the target node in each grouping, using multiple all nodes in the current fully connected layer The target node performs fully connected training on the next fully connected layer of the current layer until the fully connected training of the last fully connected layer is completed;

Acquiring a prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;

The user level prediction model is used to predict the level of the target user.

A second aspect of the present application provides an artificial intelligence-based user level prediction device, which includes:

A calculation module for obtaining multiple data indicators of multiple users and obtaining user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;

An extracting module, configured to extract multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;

The input module is used to input multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework, wherein the preset neural network framework also includes multiple Fully connected layer and the last output layer;

The grouping module is used to obtain all the nodes of the current fully connected layer, group all the nodes of the current fully connected layer according to preset grouping rules and determine the target node in each grouping, and use the current layer fully connected The multiple target nodes of the layer perform fully-connected training on the next fully-connected layer of the current layer until the fully-connected training of the last fully-connected layer is completed;

A training module, configured to obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model;

The prediction module is configured to use the user level prediction model to predict the level of the target user.

A third aspect of the present application provides a terminal, the terminal includes a processor, and the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:

A fourth aspect of the present application provides a computer-readable storage medium having computer-readable instructions stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:

In summary, the artificial intelligence-based user level prediction method, device, terminal, and medium described in this application calculates each user level after obtaining multiple data indicators of multiple users and obtaining user level labels of the multiple users. The saturation and relevance of each data indicator, and extract multiple data indicators from multiple data indicators based on the saturation and relevance. Since the data volume of multiple data indicators is much smaller than that of multiple data indicators, The efficiency of training user level prediction models based on multiple entry data indicators is improved; by inputting multiple entry data indicators and corresponding user level labels into the first input layer of the preset neural network framework, each layer is gradually obtained Connect all nodes of the layer and group all nodes according to preset grouping rules, determine the target node in each group, use multiple target nodes of each fully connected layer to perform fully connected training of the next fully connected layer, pass Grouping reduces the number of nodes participating in the transfer in each fully connected layer, reduces the amount of calculation of the neural network, and further improves the efficiency of training the user level prediction model; finally obtain the output of the last layer of the preset neural network framework Predict the grade label, iteratively train the preset neural network framework according to the predicted grade label to obtain the user grade prediction model; use the user grade prediction model to predict the target user grade, which can improve the accuracy of user grade prediction.

Description of the drawings

Fig. 1 is a flowchart of a method for predicting a user level based on artificial intelligence provided in Embodiment 1 of the present application.

Fig. 2 is a structural diagram of an artificial intelligence-based user level prediction device provided in the second embodiment of the present application.

FIG. 3 is a schematic structural diagram of a terminal provided in Embodiment 3 of the present application.

Detailed ways

In order to be able to understand the above objectives, features and advantages of the application more clearly, the application will be described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that the embodiments of the application and the features in the embodiments can be combined with each other if there is no conflict.

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field of this application. The terms used in the specification of the application herein are only for the purpose of describing specific embodiments, and are not intended to limit the application.

The artificial intelligence-based user level prediction method is executed by the terminal, and accordingly, the artificial intelligence-based user level prediction device runs in the terminal.

Fig. 1 is a flowchart of a method for predicting a user level based on artificial intelligence provided in Embodiment 1 of the present application. The artificial intelligence-based user level prediction method specifically includes the following steps. According to different needs, the order of the steps in the flowchart can be changed, and some of the steps can be omitted.

S11. Obtain multiple data indicators of multiple users and obtain user level labels of the multiple users, and calculate the saturation of each data indicator and calculate the correlation degree of each data indicator.

Wherein, the multiple data indicators may include, but are not limited to: age, gender, income performance, burying behavior, etc. Each user corresponds to multiple data indicators and user level labels.

Wherein, the user may be an insurance agent or a company salesperson, etc., and the user level label may include: a first level and a second level. Wherein, the first level is higher than the second level. Exemplarily, the first grade is a diamond grade, and the second grade is a non-diamond grade.

Since there may be thousands or even tens of thousands of data indicators for each user, using such multiple data indicators to train the user level prediction model will lead to a longer training time for the user level prediction model. Therefore, by calculating the value of each data indicator Saturation and calculation of the correlation of each data indicator to filter out the data indicators suitable for the model, so as to reduce the training time of the user level prediction model, improve the training efficiency of the user level prediction model, and thereby improve the efficiency of user level prediction.

In an optional embodiment, the calculating the saturation of each data indicator includes:

Traverse multiple characteristic values of each data indicator;

Calculating a first number of feature values matching a preset feature value among the plurality of feature values, and calculating a missing rate of the data indicator according to the first number;

Calculating a second number of the multiple characteristic values having the same characteristic value, and calculating the repetition rate of the data index according to the second number;

The saturation of the data index is calculated according to the missing rate and the repetition rate.

The characteristic value of the same data indicator for different users may be different or the same. For example, for the gender data indicator, the feature value of the gender data indicator of some users is female, and the feature value of the gender data indicator of some users is male. For another example, the age data index, the characteristic value of the user's age data index can be distributed between 18-60 years old.

Generally speaking, how many users are there and how many feature values of the same data indicator should be. However, in practical applications, there may be missing or omissions when collecting user data, resulting in empty characteristic values of certain data indicators for some users. Calculating the ratio of the first number of empty feature values of a certain data indicator to the number of users can determine the missing rate of the data indicator. Calculating the ratio of the second number of the same feature value to the number of users in a certain data indicator can determine the repetition rate of the data indicator.

After calculating the missing rate and repetition rate of the data indicators, calculate the first product between the missing rate and the preset first weight, and calculate the second product between the repetition rate and the preset second weight; finally calculate the first product The sum between the product and the second product gets the saturation of the data index. Wherein, the sum of the preset first weight and the preset second weight is 1, and the preset first weight is less than the preset second weight.

In an optional embodiment, the calculating the correlation of each data indicator includes:

Generate eigenvalue vectors based on multiple eigenvalues of each data indicator;

Generating a level label vector according to the user level labels of the multiple historical users;

Calculating the Pearson coefficient between the eigenvalue vector and the rank label vector;

The Pearson coefficient is determined as the correlation degree of the data index.

By calculating the Pearson coefficient between the eigenvalue vector of the data index and the grade label vector, it is indicated whether the data index is related to the first level, and it is convenient to select the data index suitable for the model according to the Pearson coefficient.

S12, extracting multiple data indicators for model entry from the multiple data indicators of each user according to the saturation and the correlation.

In an optional embodiment, the extracting multiple data indicators for modeling from the multiple data indicators of each user according to the saturation and the correlation includes:

Acquiring, from the multiple data indicators, multiple first data indicators corresponding to saturations greater than a preset saturation threshold;

Acquiring, from the plurality of first data indicators, a plurality of second data indicators corresponding to a correlation degree less than a preset correlation degree threshold;

Derive multiple high-level data indicators according to the multiple second data indicators;

The multiple second data indicators and the multiple high-level data indicators are used as the model entry data indicators.

For data indicators with a large missing rate, the neural network cannot learn the characteristics of the corresponding data indicators. Therefore, removing multiple data indicators corresponding to saturations less than or equal to the preset saturation threshold and retaining multiple data indicators corresponding to saturations greater than the preset saturation threshold not only reduces the data entered into the model of the user level prediction model The number of indicators, and the removal of useless data indicators to reduce noise data will help improve the learning effect of the user level prediction model.

If the feature values corresponding to the age data indicators are all 20, or the feature values corresponding to the gender data indicators are all female or all male, then the age data indicators or gender data indicators have no learning significance for the user level prediction model. Therefore, removing multiple data indicators corresponding to correlations greater than or equal to the preset correlation threshold, and retaining multiple data indicators corresponding to saturations less than the preset saturation threshold can not only further reduce the risk of user level prediction models. The number of data indicators can further reduce the noise data, which helps to improve the learning effect of the user level prediction model.

S13: Input the multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework.

A convolutional neural network can be acquired as a preset neural network framework. Among them, the convolutional neural network includes the first input layer, multiple fully connected layers, and the last output layer. The first input layer is connected to the first fully connected layer, the first fully connected layer is connected to the second fully connected layer, the second fully connected layer is connected to the third fully connected layer, and so on, the last layer The fully connected layer connects the last output layer. The number of fully connected layers can be set according to actual conditions.

Taking the feature value of each user’s entry data indicator and the corresponding user level label as a data pair, a data set can be constructed based on the data pairs of multiple users, and the data set is divided into divisions according to the user’s entry time Training data set and test data set. Input the training data set to the first input layer in the preset neural network framework for learning and training.

S14. Obtain all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to preset grouping rules, determine the target node in each group, and use the multiple of the current fully connected layer. Each of the target nodes performs fully connected training on the next fully connected layer of the current layer until the fully connected training of the last fully connected layer is completed.

The number of nodes in the same group can be selected according to the calculation speed of the user level prediction model. The faster the calculation speed of the user level prediction model, the greater the number of nodes in the same group, and the calculation speed of the user level prediction model The slower, the smaller the number of nodes in the same group. For the target node, the calculation is performed with the preset weight and passed to the next fully connected layer; for non-target nodes, the calculation is not performed with the preset weight, and the calculation is not passed to the next fully connected layer.

However, in order to avoid excessive grouping, resulting in a decrease in the prediction accuracy of the grade prediction model, it is necessary to gradually determine the number of nodes in the same group to improve the training efficiency of the user grade prediction model and improve the prediction accuracy of the grade prediction model.

In an optional embodiment, the acquiring all nodes of the current fully connected layer, grouping all the nodes of the current fully connected layer according to a preset grouping rule, and determining the target node in each group includes:

Get the first node value of all nodes in the current fully connected layer;

Grouping all the nodes in the current fully connected layer by using a step-by-step trial grouping method, and determining the node corresponding to the largest first node value in each group in the process of each trial as the target node;

Obtain the second node value of all nodes in the next fully-connected layer after fully-connected training;

For each trial group, calculate the loss entropy between the first node value and the second node value;

The heuristic grouping method corresponding to the smallest loss entropy is determined to be the target heuristic grouping method, and the target heuristic grouping method is used to group all the nodes in the current fully connected layer.

Step by step test grouping can be carried out according to the positions of all nodes in each fully connected layer. First use the stepwise trial grouping method to group the first fully connected layer, then use the stepwise trial grouping method to group the second layer fully connected layer, and so on, and finally use the stepwise trial grouping method to group the last fully connected layer Grouping.

The process of grouping each fully connected layer with the step-by-step trial grouping method is as follows:

The first trial grouping is to divide the two nodes that are next to each other into the same group;

The second trial grouping is to divide the three nodes that are next to each other into the same group;

...

In the Nth trial group, the N+1 nodes that are next to each other are divided into the same group.

After each trial grouping, the distance between the first node value of the current fully connected layer and the second node value of the next fully connected layer of the current fully connected layer is calculated to obtain the loss entropy. The larger the loss entropy, it means that all the nodes after the grouping are passed to the next fully connected layer, resulting in more feature loss, and the smaller the loss entropy is, indicating that all the nodes after the grouping are passed to the next fully connected layer. This results in less feature loss.

Select the heuristic grouping method corresponding to the smallest loss entropy as the target heuristic grouping method, and use the target heuristic grouping method to group all the nodes in the current fully connected layer, which can effectively guarantee all the nodes after the grouping After being passed to the next fully connected layer, the features will not be lost; and the number of target nodes participating in the transfer is greatly reduced compared with the number of nodes before grouping, which can reduce the amount of calculations in each fully connected layer and improve the entire The training speed and efficiency of the user level prediction model.

S15: Obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model.

The test pass rate is calculated according to the predicted level label and user level label. When the test pass rate is less than the preset pass rate threshold, the data set is re-divided into a new training data set and a new test data set, and the new training is used The data set retrains the user level prediction model, and uses the new test data set to retest the test pass rate of the user level prediction model until the test pass rate is greater than the preset pass rate threshold, and the user level prediction model training is notified.

S16: Use the user level prediction model to predict the level of the target user.

In order to meet the consistency of the user level prediction model in the training phase and the prediction phase for the input requirements, the target data index corresponding to the model data index is obtained from the multiple data indicators of the target user in the current month, and the characteristic value of the target data index is output The probability value is obtained by prediction in the user level prediction model. The probability value is used to indicate the possibility that the target user can be promoted to the first level in the next month of the current month.

In an optional embodiment, the using the user level prediction model to perform level prediction on the target user includes:

Calculating a recall rate of the user level prediction model, and determining a target probability threshold according to the recall rate;

Acquiring the predictive index of the target user and inputting the predictive index into the user level prediction model for prediction to obtain a prediction probability;

Comparing the predicted probability with the target probability threshold;

When the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;

When the predicted probability is less than or equal to the target probability threshold, it is determined that the target user is at the second level.

Among them, the target probability threshold is calculated according to the recall rate of the user level prediction model. In the prediction stage, the higher the predicted probability output by the user level prediction model, the greater the probability that the target user can be promoted to the first level (for example, the greater the probability of being able to be promoted); the predicted probability output by the user level prediction model The smaller the value, the greater the probability that the target user cannot be promoted to the first level (for example, the less likely it is that the target user can be promoted).

In an optional embodiment, the calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate includes:

Use the difference method to define multiple candidate probability thresholds;

For each candidate probability threshold, the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;

The candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.

The recall rate refers to the proportion of the correct prediction that is positive to all that is actually positive. The recall rate is calculated using the following calculation formula: Recall=TP/(TP+FN), where TP is a positive class and is predicted as a positive class, and FN is a positive class and is predicted as a negative class. The higher the recall rate, the greater the proportion of correct predictions that are positive, and the lower the recall rate, the smaller the proportion of correct predictions that are positive.

Exemplarily, assuming that the defined multiple candidate probability thresholds are 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, the process of determining the target probability threshold is as follows:

First, for the candidate probability threshold of 0.2, compare the predicted probability value output by the user level prediction model in the training phase with the subsequent probability threshold 0.2. The predicted probability value output in the training phase is greater than or equal to the subsequent probability threshold 0.2, then the predicted level label is obtained. It is the first level, and the predicted probability value output in the training phase is less than the subsequent probability threshold 0.2, then the predicted level label is obtained as the second level, and the first recall rate is calculated according to the predicted level label in the training phase and the corresponding user level label;

Then for the candidate probability threshold of 0.3, the predicted probability value output by the user level prediction model in the training phase is compared with the subsequent probability threshold value of 0.3. The predicted probability value output in the training phase is greater than or equal to the subsequent probability threshold value of 0.3, and the predicted level label is For the first level, the predicted probability value output in the training phase is less than the subsequent probability threshold of 0.3, then the predicted level label is obtained as the second level, and the second recall rate is calculated according to the predicted level label in the training phase and the corresponding user level label;

And so on;

Then for the candidate probability threshold of 0.9, the predicted probability value output by the user level prediction model in the training phase is compared with the subsequent probability threshold value of 0.9. The predicted probability value output in the training phase is greater than or equal to the subsequent probability threshold value of 0.9, then the predicted level label is For the first level, the predicted probability value output in the training phase is less than the subsequent probability threshold 0.9, then the predicted level label is obtained as the second level, and the ninth recall rate is calculated according to the predicted level label in the training phase and the corresponding user level label.

Finally, the nine recall rates are compared, and the candidate probability threshold corresponding to the largest recall rate is selected as the target probability threshold.

The target probability threshold is determined by the maximum recall rate of the user level prediction model, which can more accurately predict the user level.

It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned user-level prediction model, the above-mentioned user-level prediction model can be stored in a node of the blockchain.

The artificial intelligence-based user level prediction method described in this application can be applied to smart government affairs to promote the development of smart cities. After obtaining multiple data indicators of multiple users and obtaining user level labels of the multiple users, this application calculates the saturation and relevance of each data indicator, and calculates the saturation and relevance from the multiple data indicators according to the saturation and relevance. Extract multiple entry data indicators. Since the data volume of multiple entry data indicators is much smaller than multiple data indicators, the efficiency of training user level prediction models based on multiple entry data indicators can be improved; by combining multiple entry data indicators Data indicators and corresponding user level labels are input to the first input layer in the preset neural network framework, all nodes of the current fully connected layer are obtained, and all nodes of the current fully connected layer are performed according to the preset grouping rules. Group and determine the target node in each group, and use multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer until the final fully connected layer is completed. The fully connected training of the connected layer reduces the number of nodes participating in the transfer in each fully connected layer through grouping, reduces the calculation amount of the neural network, and further improves the efficiency of training the user level prediction model; finally obtain the preset neural network framework The prediction level label output by the last layer of output layer is iteratively trained according to the prediction level label to obtain the user level prediction model by using a preset neural network framework; using the user level prediction model to predict the target user level can improve the accuracy of user level prediction.

In some embodiments, the artificial intelligence-based user level prediction device 20 may include multiple functional modules composed of computer-readable instruction segments. The computer-readable instructions of each program segment in the artificial intelligence-based user level prediction device 20 may be stored in the memory of the terminal and executed by at least one processor to execute (see FIG. 1 for details). The function of user level prediction.

In this embodiment, the artificial intelligence-based user level prediction device 20 can be divided into multiple functional modules according to the functions it performs. The functional modules may include: a calculation module 201, an extraction module 202, an input module 203, a grouping module 204, a training module 205, and a prediction module 206. The module referred to in this application refers to a series of computer-readable instruction segments that can be executed by at least one processor and can complete fixed functions, and are stored in a memory. In this embodiment, the function of each module will be described in detail in subsequent embodiments.

The calculation module 201 is configured to obtain multiple data indicators of multiple users and obtain user level labels of the multiple users, and calculate the saturation of each data indicator and the correlation degree of each data indicator.

Since there may be thousands or even tens of thousands of data indicators for each user, using such multiple data indicators to train the user level prediction model will lead to a longer training time for the user level prediction model. Therefore, by calculating the value of each data indicator Saturation and calculation of the correlation of each data indicator to filter out the data indicators suitable for entering the model, thereby reducing the training time of the user level prediction model, improving the training efficiency of the user level prediction model, and thereby improving the efficiency of user level prediction.

In an optional embodiment, the calculation module 201 calculating the saturation of each data indicator includes:

Traverse multiple characteristic values of each data indicator;

Calculating a second number of the multiple characteristic values that have the same characteristic value, and calculating a repetition rate of the data indicator according to the second number;

In an optional embodiment, the calculation module 201 calculating the correlation degree of each data indicator includes:

The extracting module 202 is configured to extract multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation.

In an optional embodiment, the extracting module 202 extracting multiple data indicators for modeling from the multiple data indicators of each user according to the saturation and the correlation includes:

The input module 203 is configured to input multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework.

The grouping module 204 is configured to obtain all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to preset grouping rules, determine the target node in each group, and use the current The multiple target nodes of the fully connected layer perform fully connected training on the next fully connected layer of the current layer until the fully connected training of the last fully connected layer is completed.

In an optional embodiment, the grouping module 204 obtains all nodes of the current fully connected layer, groups all nodes of the current fully connected layer according to preset grouping rules, and determines the target in each group The nodes include:

Get the first node value of all nodes in the current fully connected layer;

...

The training module 205 is configured to obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model.

The prediction module 206 is configured to use the user level prediction model to predict the level of the target user.

In an optional embodiment, the prediction module 206 using the user level prediction model to perform level prediction on the target user includes:

Comparing the predicted probability with the target probability threshold;

Use the difference method to define multiple candidate probability thresholds;

The recall rate refers to the proportion of the correct prediction that is positive to all that is actually positive. The recall rate is calculated using the following calculation formula: Recall=TP/(TP+FN), where TP is a positive class and is predicted to be a positive class, and FN is a positive class and is predicted to be a negative class. The higher the recall rate, the greater the proportion of correct predictions that are positive, and the lower the recall rate, the smaller the proportion of correct predictions that are positive.

And so on;

The artificial intelligence-based user level prediction device described in this application can be used in smart government affairs to promote the development of smart cities. After obtaining multiple data indicators of multiple users and obtaining user level labels of the multiple users, this application calculates the saturation and relevance of each data indicator, and calculates the saturation and relevance from the multiple data indicators according to the saturation and relevance. Extract multiple entry data indicators. Since the data volume of multiple entry data indicators is much smaller than multiple data indicators, the efficiency of training user level prediction models based on multiple entry data indicators can be improved; by combining multiple entry data indicators Data indicators and corresponding user level labels are input to the first input layer in the preset neural network framework, all nodes of the current fully connected layer are obtained, and all nodes of the current fully connected layer are performed according to the preset grouping rules. Group and determine the target node in each group, and use multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer until the final fully connected layer is completed. The fully connected training of the connected layer reduces the number of nodes participating in the transfer in each fully connected layer through grouping, reduces the calculation amount of the neural network, and further improves the efficiency of training the user level prediction model; finally obtain the preset neural network framework The prediction level label output by the last layer of output layer is iteratively trained according to the prediction level label to obtain the user level prediction model by using a preset neural network framework; using the user level prediction model to predict the target user level can improve the accuracy of user level prediction.

Refer to FIG. 3, which is a schematic structural diagram of a terminal provided in Embodiment 3 of this application. In a preferred embodiment of the present application, the terminal 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.

Those skilled in the art should understand that the structure of the terminal shown in FIG. 3 does not constitute a limitation of the embodiments of the present application. It may be a bus-type structure or a star structure. The terminal 3 may also include more than that shown in the figure. More or less other hardware or software, or different component arrangements.

In some embodiments, the terminal 3 is a terminal that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application specific integrated circuit, and Programming gate arrays, digital processors and embedded devices, etc. The terminal 3 may also include client equipment. The client equipment includes, but is not limited to, any electronic product that can interact with the client through a keyboard, a mouse, a remote control, a touch panel, or a voice control device, for example, a personal computer. Computers, tablets, smart phones, digital cameras, etc.

It should be noted that the terminal 3 is only an example. If other existing or future electronic products can be adapted to this application, they should also be included in the protection scope of this application and included here by reference.

In some embodiments, computer-readable instructions are stored in the memory 31, and when the computer-readable instructions are executed by the at least one processor 32, all of the aforementioned artificial intelligence-based user level prediction methods are implemented. Or part of the steps. The memory 31 includes volatile and non-volatile memory, such as random access memory (Random Access Memory, RAM), read-only memory (Read-Only Memory, ROM), and programmable read-only memory (Programmable Read-Only). Memory, PROM), Erasable Programmable Read-Only Memory (EPROM), One-time Programmable Read-Only Memory (OTPROM), Electronic Erasable Programmable Read-Only Memory, OTPROM Read memory (Electrically-Erasable Programmable Read-Only Memory, EEPROM), CD-ROM (Compact Disc Read-Only Memory, CD-ROM) or other optical disk storage, magnetic disk storage, tape storage, or capable of carrying or storing data Computer-readable storage medium. The computer-readable storage medium may be non-volatile or volatile.

Further, the computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function, etc.; the storage data area may store Data created by the use of nodes, etc.

The blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm. Blockchain, essentially a decentralized database, is a series of data blocks associated with cryptographic methods. Each data block contains a batch of network transaction information for verification. The validity of the information (anti-counterfeiting) and generation of the next block. The blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.

In some embodiments, the at least one processor 32 is the control core (Control Unit) of the terminal 3. Various interfaces and lines are used to connect various components of the entire terminal 3, and are stored in the memory 31 through operation or execution. The programs or modules in the terminal 3 and the data stored in the memory 31 are called to execute various functions of the terminal 3 and process data. For example, when the at least one processor 32 executes the computer-readable instructions stored in the memory, all or part of the steps of the artificial intelligence-based user level prediction method described in the embodiments of the present application are implemented; or the artificial intelligence-based All or part of the functions of the user level prediction device. The at least one processor 32 may be composed of integrated circuits, for example, may be composed of a single packaged integrated circuit, or may be composed of multiple integrated circuits with the same function or different functions, including one or more central processing units. (Central Processing unit, CPU), a combination of microprocessors, digital processing chips, graphics processors, and various control chips.

In some embodiments, the at least one communication bus 33 is configured to implement connection and communication between the memory 31 and the at least one processor 32 and the like.

Although not shown, the terminal 3 may also include a power source (such as a battery) for supplying power to various components. Preferably, the power source may be logically connected to the at least one processor 32 through a power management device, so as to realize management through the power management device. Functions such as charging, discharging, and power management. The power supply may also include any components such as one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, and power status indicators. The terminal 3 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.

The above-mentioned integrated unit implemented in the form of a software function module may be stored in a computer readable storage medium. The above-mentioned software function module is stored in a storage medium and includes several instructions to make a terminal (which may be a personal computer, a terminal, or a network device, etc.) or a processor execute part of the method described in each embodiment of the present application .

In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.

The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, and may be located in one place or distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

In addition, the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.

For those skilled in the art, it is obvious that the present application is not limited to the details of the foregoing exemplary embodiments, and the present application can be implemented in other specific forms without departing from the spirit or basic characteristics of the application. Therefore, no matter from which point of view, the embodiments should be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than the above description, and therefore it is intended to fall into the claims. All changes in the meaning and scope of the equivalent elements of are included in this application. Any reference signs in the claims should not be regarded as limiting the claims involved. In addition, it is obvious that the word "including" does not exclude other elements or the singular number does not exclude the plural number. Multiple units or devices stated in the present invention can also be implemented by one unit or device through software or hardware. Words such as first and second are used to denote names, but do not denote any specific order.

Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the application and not to limit them. Although the application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the application can be Make modifications or equivalent replacements without departing from the spirit and scope of the technical solution of the present application.

Claims

An artificial intelligence-based user level prediction method, wherein the method includes:

Acquiring multiple data indicators of multiple users and acquiring user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;

Extracting multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;

Input the multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework, where the preset neural network framework further includes a multi-layer fully connected layer and The last output layer;

Obtain all nodes in the current fully connected layer, group all nodes in the current fully connected layer according to preset grouping rules, and determine the target node in each grouping, using multiple all nodes in the current fully connected layer The target node performs fully connected training on the next fully connected layer of the current layer until the fully connected training of the last fully connected layer is completed;

Acquiring a prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;

The user level prediction model is used to predict the level of the target user.
The artificial intelligence-based user level prediction method according to claim 1, wherein said acquiring all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to preset grouping rules and determining The target nodes in each group include:

Get the first node value of all nodes in the current fully connected layer;

Grouping all nodes in the current fully connected layer by using a stepwise trial grouping method, and determining the node corresponding to the largest first node value in each grouping during each trial process as the target node;

Use the target node to perform fully connected training on the next fully connected layer of the current layer;

Obtain the second node value of all nodes in the next fully-connected layer after fully-connected training;

For each trial group, calculate the loss entropy between the first node value and the second node value;

The heuristic grouping method corresponding to the smallest loss entropy is determined to be the target heuristic grouping method, and the target heuristic grouping method is used to group all nodes in the current fully connected layer.
The method for predicting a user level based on artificial intelligence according to claim 1, wherein said using the user level prediction model to predict the level of a target user comprises:

Calculating a recall rate of the user level prediction model, and determining a target probability threshold according to the recall rate;

Acquiring the predictive index of the target user and inputting the predictive index into the user level prediction model for prediction to obtain a prediction probability;

Comparing the predicted probability with the target probability threshold;

When the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;

When the predicted probability is less than or equal to the target probability threshold, it is determined that the target user is at the second level.
The artificial intelligence-based user level prediction method according to claim 1, wherein the calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate comprises:

Use the difference method to define multiple candidate probability thresholds;

For each candidate probability threshold, the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;

The candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
The artificial intelligence-based user level prediction method according to any one of claims 1 to 4, wherein said calculating the saturation of each data indicator comprises:

Traverse multiple characteristic values of each data indicator;

Calculating a first number of feature values matching a preset feature value among the plurality of feature values, and calculating a missing rate of the data indicator according to the first number;

Calculating a second number of the multiple characteristic values that have the same characteristic value, and calculating a repetition rate of the data indicator according to the second number;

The saturation of the data index is calculated according to the missing rate and the repetition rate.
The method for predicting user level based on artificial intelligence according to claim 5, wherein said calculating the relevance of each data indicator comprises:

Generate eigenvalue vectors based on multiple eigenvalues of each data indicator;

Generating a level label vector according to the user level labels of the multiple historical users;

Calculating the Pearson coefficient between the eigenvalue vector and the rank label vector;

The Pearson coefficient is determined as the correlation degree of the data index.
The artificial intelligence-based user level prediction method according to claim 6, wherein the multiple data indicators of each user are extracted from the multiple data indicators of each user according to the saturation and the correlation. include:

Acquiring, from the multiple data indicators, multiple first data indicators corresponding to saturations greater than a preset saturation threshold;

Acquiring, from the plurality of first data indicators, a plurality of second data indicators corresponding to a correlation degree less than a preset correlation degree threshold;

Derive multiple high-level data indicators according to the multiple second data indicators;

The multiple second data indicators and the multiple high-level data indicators are used as the model entry data indicators.
An artificial intelligence-based user level prediction device, wherein the device includes:

A calculation module for obtaining multiple data indicators of multiple users and obtaining user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;

An extracting module, configured to extract multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;

The input module is used to input multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework, wherein the preset neural network framework also includes multiple Fully connected layer and the last output layer;

The grouping module is used to obtain all the nodes of the current fully connected layer, group all the nodes of the current fully connected layer according to preset grouping rules and determine the target node in each grouping, and use the current layer fully connected The multiple target nodes of the layer perform fully-connected training on the next fully-connected layer of the current layer until the fully-connected training of the last fully-connected layer is completed;

A training module, configured to obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model;

The prediction module is configured to use the user level prediction model to predict the level of the target user.
A terminal, wherein the terminal includes a processor, and the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:

Acquiring multiple data indicators of multiple users and acquiring user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;

Extracting multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;

Input the multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework, where the preset neural network framework further includes a multi-layer fully connected layer and The last output layer;

Obtain all nodes in the current fully connected layer, group all nodes in the current fully connected layer according to preset grouping rules, and determine the target node in each grouping, using multiple all nodes in the current fully connected layer The target node performs fully connected training on the next fully connected layer of the current layer until the fully connected training of the last fully connected layer is completed;

Acquiring a prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;

The user level prediction model is used to predict the level of the target user.
The terminal according to claim 9, wherein the processor executes the computer-readable instructions to obtain all nodes of the current layer fully connected layer, and perform grouping on all nodes of the current layer fully connected layer according to preset grouping rules. When grouping and determining the target node in each grouping, it specifically includes:

Get the first node value of all nodes in the current fully connected layer;

Grouping all nodes in the current fully connected layer by using a stepwise trial grouping method, and determining the node corresponding to the largest first node value in each grouping during each trial process as the target node;

Use the target node to perform fully connected training on the next fully connected layer of the current layer;

Obtain the second node value of all nodes in the next fully-connected layer after fully-connected training;

For each trial group, calculate the loss entropy between the first node value and the second node value;

The heuristic grouping method corresponding to the smallest loss entropy is determined to be the target heuristic grouping method, and the target heuristic grouping method is used to group all nodes in the current fully connected layer.
9. The terminal according to claim 9, wherein when the processor executes the computer-readable instructions to implement the level prediction of the target user using the user level prediction model, it specifically includes:

Calculating a recall rate of the user level prediction model, and determining a target probability threshold according to the recall rate;

Acquiring the predictive index of the target user and inputting the predictive index into the user level prediction model for prediction to obtain a prediction probability;

Comparing the predicted probability with the target probability threshold;

When the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;

When the predicted probability is less than or equal to the target probability threshold, it is determined that the target user is at the second level.
The terminal according to claim 9, wherein when the processor executes the computer-readable instructions to calculate the recall rate of the user level prediction model and determines the target probability threshold according to the recall rate, the specific steps include:

Use the difference method to define multiple candidate probability thresholds;

For each candidate probability threshold, the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;

The candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
The terminal according to any one of claims 9 to 12, wherein when the processor executes the computer-readable instruction to calculate the saturation of each data indicator, it specifically includes:

Traverse multiple characteristic values of each data indicator;

Calculating a first number of feature values matching a preset feature value among the plurality of feature values, and calculating a missing rate of the data indicator according to the first number;

Calculating a second number of the multiple characteristic values that have the same characteristic value, and calculating a repetition rate of the data indicator according to the second number;

The saturation of the data index is calculated according to the missing rate and the repetition rate.
The terminal according to claim 13, wherein, when the processor executes the computer-readable instruction to calculate the correlation degree of each data indicator, it specifically includes:

Generate eigenvalue vectors based on multiple eigenvalues of each data indicator;

Generating a level label vector according to the user level labels of the multiple historical users;

Calculating the Pearson coefficient between the eigenvalue vector and the rank label vector;

The Pearson coefficient is determined as the correlation degree of the data index.
The terminal according to claim 14, wherein the processor executes the computer-readable instructions to extract multiple data indicators from the multiple data indicators of each user according to the saturation and the correlation. When entering the model data indicators, specifically include:

Acquiring, from the multiple data indicators, multiple first data indicators corresponding to saturations greater than a preset saturation threshold;

Acquiring, from the plurality of first data indicators, a plurality of second data indicators corresponding to a correlation degree less than a preset correlation degree threshold;

Derive multiple high-level data indicators according to the multiple second data indicators;

The multiple second data indicators and the multiple high-level data indicators are used as the model entry data indicators.
A computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions implement the following steps when executed by a processor:

Acquiring multiple data indicators of multiple users and acquiring user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;

Extracting multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;

Input the multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework, where the preset neural network framework further includes a multi-layer fully connected layer and The last output layer;

Obtain all nodes in the current fully connected layer, group all nodes in the current fully connected layer according to preset grouping rules, and determine the target node in each grouping, using multiple all nodes in the current fully connected layer The target node performs fully connected training on the next fully connected layer of the current layer until the fully connected training of the last fully connected layer is completed;

Acquiring a prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;

The user level prediction model is used to predict the level of the target user.
The computer-readable storage medium according to claim 16, wherein the computer-readable instructions are executed by the processor to obtain all nodes of the current layer fully connected layer, and perform grouping of all nodes of the current layer according to preset grouping rules. When all the nodes in the connection layer are grouped and the target node in each group is determined, the details include:

Get the first node value of all nodes in the current fully connected layer;

Grouping all nodes in the current fully connected layer by using a stepwise trial grouping method, and determining the node corresponding to the largest first node value in each grouping during each trial process as the target node;

Use the target node to perform fully connected training on the next fully connected layer of the current layer;

Obtain the second node value of all nodes in the next fully-connected layer after fully-connected training;

For each trial group, calculate the loss entropy between the first node value and the second node value;

The heuristic grouping method corresponding to the smallest loss entropy is determined to be the target heuristic grouping method, and the target heuristic grouping method is used to group all nodes in the current fully connected layer.
15. The computer-readable storage medium according to claim 16, wherein when the computer-readable instructions are executed by the processor to implement the level prediction of the target user using the user level prediction model, it specifically comprises:

Calculating a recall rate of the user level prediction model, and determining a target probability threshold according to the recall rate;

Acquiring the predictive index of the target user and inputting the predictive index into the user level prediction model for prediction to obtain a prediction probability;

Comparing the predicted probability with the target probability threshold;

When the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;

When the predicted probability is less than or equal to the target probability threshold, it is determined that the target user is at the second level.
The computer-readable storage medium of claim 16, wherein the computer-readable instructions are executed by the processor to calculate the recall rate of the user level prediction model and determine the target probability threshold according to the recall rate. , Specifically including:

Use the difference method to define multiple candidate probability thresholds;

For each candidate probability threshold, the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;

The candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
The computer-readable storage medium according to any one of claims 16 to 19, wherein, when the computer-readable instruction is executed by the processor to calculate the saturation of each data indicator, it specifically includes:

Traverse multiple characteristic values of each data indicator;

Calculating a first number of feature values matching a preset feature value among the plurality of feature values, and calculating a missing rate of the data indicator according to the first number;

Calculating a second number of the multiple characteristic values that have the same characteristic value, and calculating a repetition rate of the data indicator according to the second number;

The saturation of the data index is calculated according to the missing rate and the repetition rate.