WO2021139432A1 - Artificial intelligence-based user rating prediction method and apparatus, terminal, and medium - Google Patents

Artificial intelligence-based user rating prediction method and apparatus, terminal, and medium

Info

Publication number
WO2021139432A1
Authority
WO
WIPO (PCT)
Prior art keywords
fully connected
target
user
layer
nodes
Application number
PCT/CN2020/131955
Other languages
French (fr)
Chinese (zh)
Other versions
WO2021139432A9 (en)
Inventor
吴志成
张莉
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2021139432A1
Publication of WO2021139432A9



Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 30/00 Commerce
    • G06Q 30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q 30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q 30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q 40/08 Insurance

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a method, device, terminal, and medium for user level prediction based on artificial intelligence.
  • Grade (level) is an indicator used by insurance companies to assess insurance agents; it is evaluated at the beginning of each month based on the agent's performance in the previous month. If it can be predicted from an insurance agent's performance in the current month whether the agent will be promoted by one level in the next month, this will not only increase the agent's enthusiasm, but also help the insurance company plan its overall sales targets and improve its overall performance.
  • a machine learning model is trained to predict whether a user can increase a level, for example, predict whether a non-diamond-level insurance agent can be upgraded to a diamond-level insurance agent.
  • However, the inventor found in the process of realizing this application that a user can have as many as tens of thousands of data indicators. Using so many data indicators to train the machine learning model leads to long training times and low user level prediction efficiency; moreover, some useless data indicators also reduce the learning accuracy of the machine learning model, resulting in poor user level prediction.
  • In view of the above, it is necessary to provide an artificial intelligence-based user level prediction method, device, terminal, and medium that can improve both the efficiency and the accuracy of user level prediction.
  • the first aspect of the present application provides an artificial intelligence-based user level prediction method, the method includes:
  • wherein the preset neural network framework further includes multiple fully connected layers and a final output layer;
  • the user level prediction model is used to predict the level of the target user.
  • a second aspect of the present application provides an artificial intelligence-based user level prediction device, which includes:
  • a calculation module for obtaining multiple data indicators of multiple users and obtaining user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;
  • An extracting module, configured to extract the multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation;
  • An input module, used to input the multiple model-entry data indicators of the multiple users and the user level labels to the first input layer in the preset neural network framework, wherein the preset neural network framework further includes multiple fully connected layers and a final output layer;
  • A grouping module, used to obtain all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to preset grouping rules and determine the target node in each group, and use the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
  • a training module configured to obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model;
  • the prediction module is configured to use the user level prediction model to predict the level of the target user.
  • a third aspect of the present application provides a terminal, the terminal includes a processor, and the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:
  • wherein the preset neural network framework further includes multiple fully connected layers and a final output layer;
  • the user level prediction model is used to predict the level of the target user.
  • a fourth aspect of the present application provides a computer-readable storage medium having computer-readable instructions stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • wherein the preset neural network framework further includes multiple fully connected layers and a final output layer;
  • the user level prediction model is used to predict the level of the target user.
  • With the artificial intelligence-based user level prediction method, device, terminal, and medium described in this application, after multiple data indicators of multiple users and the users' user level labels are obtained, the saturation and correlation of each data indicator are calculated, and multiple model-entry data indicators are extracted from the multiple data indicators according to the saturation and correlation. Because the data volume of the model-entry data indicators is much smaller than that of the original data indicators, the efficiency of training the user level prediction model on the model-entry data indicators is improved. The model-entry data indicators and the corresponding user level labels are input to the first input layer of the preset neural network framework; layer by layer, all nodes of each fully connected layer are obtained and grouped according to preset grouping rules, the target node in each group is determined, and the multiple target nodes of each fully connected layer are used to perform the fully connected training of the next fully connected layer. Grouping reduces the number of nodes participating in the transfer in each fully connected layer, reduces the amount of calculation of the neural network, and further improves the efficiency of training the user level prediction model. Finally, the prediction level label output by the last output layer of the preset neural network framework is obtained, and the preset neural network framework is iteratively trained according to the prediction level label to obtain the user level prediction model; using the user level prediction model to predict the target user's level improves the accuracy of user level prediction.
  • Fig. 1 is a flowchart of a method for predicting a user level based on artificial intelligence provided in Embodiment 1 of the present application.
  • Fig. 2 is a structural diagram of an artificial intelligence-based user level prediction device provided in the second embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a terminal provided in Embodiment 3 of the present application.
  • the artificial intelligence-based user level prediction method is executed by the terminal, and accordingly, the artificial intelligence-based user level prediction device runs in the terminal.
  • Fig. 1 is a flowchart of a method for predicting a user level based on artificial intelligence provided in Embodiment 1 of the present application.
  • the artificial intelligence-based user level prediction method specifically includes the following steps. According to different needs, the order of the steps in the flowchart can be changed, and some of the steps can be omitted.
  • The multiple data indicators may include, but are not limited to: age, gender, income performance, event-tracking (buried-point) behavior, and the like.
  • Each user corresponds to multiple data indicators and user level labels.
  • the user may be an insurance agent or a company salesperson, etc.
  • the user level label may include: a first level and a second level.
  • the first level is higher than the second level.
  • For example, the first level is a diamond level and the second level is a non-diamond level.
  • the calculating the saturation of each data indicator includes:
  • the saturation of the data index is calculated according to the missing rate and the repetition rate.
  • the characteristic value of the same data indicator for different users may be different or the same.
  • the feature value of the gender data indicator of some users is female, and the feature value of the gender data indicator of some users is male.
  • For the age data indicator, the feature values of the users' ages may be distributed between 18 and 60 years old.
  • First, the first product of the missing rate and a preset first weight is calculated, and the second product of the repetition rate and a preset second weight is calculated; finally, the sum of the first product and the second product is calculated to obtain the saturation of the data indicator.
  • the sum of the preset first weight and the preset second weight is 1, and the preset first weight is less than the preset second weight.
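  • As a rough illustration only, the saturation calculation described above might be sketched as follows; the weights 0.4 and 0.6 and the exact definition of the repetition rate are assumptions for the example, since the application only requires that the two weights sum to 1 and that the first weight be smaller than the second.

```python
import pandas as pd

def indicator_saturation(values: pd.Series,
                         first_weight: float = 0.4,
                         second_weight: float = 0.6) -> float:
    """Weighted sum of an indicator's missing rate and repetition rate.

    first_weight + second_weight == 1 and first_weight < second_weight,
    as required above; 0.4 / 0.6 are placeholder choices.
    """
    n = len(values)
    missing_rate = values.isna().sum() / n
    # One plausible reading of "repetition rate": the share of rows whose
    # value duplicates another row's value for this indicator.
    repetition_rate = values.duplicated(keep=False).sum() / n
    return first_weight * missing_rate + second_weight * repetition_rate
```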
  • the calculating the correlation of each data indicator includes:
  • the Pearson coefficient is determined as the correlation degree of the data index.
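  • A minimal sketch of the correlation step is given below. The extracted text does not state which variable each data indicator's Pearson coefficient is computed against, so pairing the indicator's feature values with numerically encoded user level labels is purely an assumption for illustration.

```python
from scipy.stats import pearsonr

def indicator_correlation(feature_values, encoded_level_labels) -> float:
    """Return the Pearson coefficient used as the indicator's correlation degree.

    Both arguments are numeric sequences of equal length; pairing the
    feature values with 0/1-encoded level labels is an assumption.
    """
    coefficient, _p_value = pearsonr(feature_values, encoded_level_labels)
    return coefficient
```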
  • The extracting of the multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation includes:
  • the multiple second data indicators and the multiple high-level data indicators are used as the model entry data indicators.
  • If a data indicator's saturation is too low, the neural network cannot learn the characteristics of that data indicator. Therefore, removing the data indicators whose saturation is less than or equal to the preset saturation threshold and retaining the data indicators whose saturation is greater than the preset saturation threshold not only reduces the number of model-entry data indicators of the user level prediction model, but also removes useless data indicators and reduces noise data, which helps to improve the learning effect of the user level prediction model.
  • Data indicators such as the age or gender data indicators have no learning significance for the user level prediction model. Therefore, removing the data indicators whose correlation is greater than or equal to the preset correlation threshold and retaining the data indicators whose correlation is less than the preset correlation threshold not only further reduces the number of model-entry data indicators of the user level prediction model, but also further reduces noise data, which helps to improve the learning effect of the user level prediction model.
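  • Given the saturation and correlation of each data indicator, the selection of model-entry data indicators described above amounts to a simple threshold filter; the threshold values below are placeholders, not values given in this application.

```python
def select_model_entry_indicators(saturation: dict, correlation: dict,
                                  saturation_threshold: float = 0.5,
                                  correlation_threshold: float = 0.8) -> list:
    """Keep indicators whose saturation exceeds the saturation threshold
    and whose correlation is below the correlation threshold."""
    kept = []
    for name, sat in saturation.items():
        if sat <= saturation_threshold:
            continue  # too little usable signal for the neural network to learn
        if correlation[name] >= correlation_threshold:
            continue  # treated as having no learning significance
        kept.append(name)
    return kept
```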
  • S13 Input the multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework.
  • a convolutional neural network can be acquired as a preset neural network framework.
  • the convolutional neural network includes the first input layer, multiple fully connected layers, and the last output layer.
  • the first input layer is connected to the first fully connected layer
  • the first fully connected layer is connected to the second fully connected layer
  • the second fully connected layer is connected to the third fully connected layer, and so on; the last fully connected layer is connected to the last output layer.
  • the number of fully connected layers can be set according to actual conditions.
  • A data set can be constructed from the data pairs of the multiple users, and the data set is divided into a training data set and a test data set according to the users' entry time. The training data set is input to the first input layer in the preset neural network framework for learning and training.
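  • A minimal sketch of the preset neural network framework described above, written in PyTorch; the number of fully connected layers, the layer widths, and the activation functions are illustrative assumptions, not values given in this application.

```python
import torch
import torch.nn as nn

class UserLevelNet(nn.Module):
    """First input layer -> several fully connected layers -> last output layer."""

    def __init__(self, num_indicators: int = 128, hidden: int = 64):
        super().__init__()
        self.input_layer = nn.Linear(num_indicators, hidden)  # first input layer
        self.fc1 = nn.Linear(hidden, hidden)                   # fully connected layer 1
        self.fc2 = nn.Linear(hidden, hidden)                   # fully connected layer 2
        self.fc3 = nn.Linear(hidden, hidden)                   # fully connected layer 3
        self.output_layer = nn.Linear(hidden, 1)               # last output layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.input_layer(x))
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = torch.relu(self.fc3(x))
        # Predicted probability that the user reaches the first level.
        return torch.sigmoid(self.output_layer(x))
```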
  • the number of nodes in the same group can be selected according to the calculation speed of the user level prediction model.
  • For target nodes, the calculation with the preset weights is performed and the result is passed to the next fully connected layer; for non-target nodes, no calculation with the preset weights is performed and nothing is passed to the next fully connected layer.
  • the acquiring all nodes of the current fully connected layer, grouping all the nodes of the current fully connected layer according to a preset grouping rule, and determining the target node in each group includes:
  • The trial grouping method corresponding to the smallest loss entropy is determined as the target grouping method, and the target grouping method is used to group all the nodes in the current fully connected layer.
  • Stepwise trial grouping can be carried out according to the positions of all nodes in each fully connected layer: the stepwise trial grouping method is first used to group the first fully connected layer, then the second fully connected layer, and so on, until the last fully connected layer is grouped.
  • In the first trial grouping, every two adjacent nodes are divided into the same group; in the second trial grouping, every three adjacent nodes are divided into the same group; and so on, until in the Nth trial grouping every N+1 adjacent nodes are divided into the same group.
  • the distance between the first node value of the current fully connected layer and the second node value of the next fully connected layer of the current fully connected layer is calculated to obtain the loss entropy.
  • A larger loss entropy indicates that more features are lost when the grouped nodes are passed to the next fully connected layer; a smaller loss entropy indicates that fewer features are lost when the grouped nodes are passed to the next fully connected layer.
  • Determining the trial grouping method corresponding to the smallest loss entropy as the target grouping method and using it to group all the nodes in the current fully connected layer effectively ensures that features are not lost after the grouped nodes are passed to the next fully connected layer; moreover, the number of target nodes participating in the transfer is greatly reduced compared with the number of nodes before grouping, which reduces the amount of calculation in each fully connected layer and improves the training speed and efficiency of the entire user level prediction model.
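  • The stepwise trial grouping can be sketched roughly as below. The Euclidean distance used as the loss entropy and the choice of each group's largest activation as its target node are assumptions, since the extracted text does not spell out either detail.

```python
import numpy as np

def trial_group(node_values: np.ndarray, group_size: int) -> list:
    """Divide adjacent nodes of one fully connected layer into groups of group_size."""
    return [node_values[i:i + group_size] for i in range(0, len(node_values), group_size)]

def loss_entropy(current_values: np.ndarray, next_values: np.ndarray) -> float:
    """Assumed loss entropy: distance between the current layer's node values and
    the next layer's node values produced from the grouped target nodes."""
    length = min(len(current_values), len(next_values))
    return float(np.linalg.norm(current_values[:length] - next_values[:length]))

def choose_group_size(current_values: np.ndarray, forward, max_group_size: int) -> int:
    """Try group sizes 2..max_group_size and keep the one with the smallest loss entropy.

    `forward` is a caller-supplied placeholder mapping the selected target nodes
    to the next fully connected layer's node values.
    """
    best_size, best_entropy = 2, float("inf")
    for size in range(2, max_group_size + 1):
        groups = trial_group(current_values, size)
        target_nodes = np.array([group.max() for group in groups])  # assumed target-node rule
        entropy = loss_entropy(current_values, forward(target_nodes))
        if entropy < best_entropy:
            best_size, best_entropy = size, entropy
    return best_size
```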
  • S15 Obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model.
  • the test pass rate is calculated according to the predicted level label and user level label.
  • If the test pass rate is less than or equal to a preset pass rate threshold, the data set is re-divided into a new training data set and a new test data set, the user level prediction model is retrained with the new training data set, and the test pass rate of the user level prediction model is retested with the new test data set, until the test pass rate is greater than the preset pass rate threshold, at which point the training of the user level prediction model is completed.
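  • The iterative training loop described above can be sketched as follows; `split_by_entry_time`, `train`, `predict`, and the key names are placeholders, and the pass rate is simply taken here as the share of correctly predicted level labels.

```python
def train_until_pass(dataset, split_by_entry_time, train, predict,
                     pass_rate_threshold: float = 0.9):
    """Retrain on a re-divided data set until the test pass rate exceeds the threshold."""
    while True:
        train_set, test_set = split_by_entry_time(dataset)
        model = train(train_set)
        predicted = [predict(model, sample) for sample in test_set]
        actual = [sample["level_label"] for sample in test_set]
        pass_rate = sum(p == a for p, a in zip(predicted, actual)) / len(actual)
        if pass_rate > pass_rate_threshold:
            return model
```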
  • S16 Use the user level prediction model to predict the level of the target user.
  • The target data indicators corresponding to the model-entry data indicators are obtained from the multiple data indicators of the target user for the current month, and the feature values of the target data indicators are input into the user level prediction model to obtain a predicted probability value. The probability value is used to indicate the likelihood that the target user can be promoted to the first level in the month following the current month.
  • The using of the user level prediction model to perform level prediction on the target user includes: calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate; when the predicted probability is greater than the target probability threshold, determining that the target user is at the first level; when the predicted probability is less than or equal to the target probability threshold, determining that the target user is at the second level.
  • the target probability threshold is calculated according to the recall rate of the user level prediction model.
  • The higher the predicted probability output by the user level prediction model, the greater the probability that the target user can be promoted to the first level (for example, the more likely a promotion is); the lower the predicted probability output by the user level prediction model, the greater the probability that the target user cannot be promoted to the first level (for example, the less likely a promotion is).
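  • The final level decision therefore reduces to a threshold comparison, for example:

```python
def predict_level(predicted_probability: float, target_probability_threshold: float) -> str:
    """Map the model's predicted probability to a level label."""
    if predicted_probability > target_probability_threshold:
        return "first level"   # e.g. promoted to the diamond level next month
    return "second level"
```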
  • the calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate includes:
  • the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;
  • the candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
  • The recall rate refers to the proportion of samples correctly predicted as positive among all samples that are actually positive.
  • Recall = TP / (TP + FN), where TP is the number of true positives and FN is the number of false negatives.
  • the process of determining the target probability threshold is as follows:
  • Taking the candidate probability threshold of 0.2 as an example, the predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.2: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.2, the predicted level label is the first level; if a predicted probability value output in the training phase is less than the candidate probability threshold 0.2, the predicted level label is the second level. The first recall rate is then calculated according to the predicted level labels in the training phase and the corresponding user level labels;
  • The predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.3: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.3, the predicted level label is the first level; if a predicted probability value output in the training phase is less than the candidate probability threshold 0.3, the predicted level label is the second level. The second recall rate is then calculated according to the predicted level labels in the training phase and the corresponding user level labels;
  • Similarly, the predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.9: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.9, the predicted level label is the first level; otherwise the predicted level label is the second level. The ninth recall rate is then calculated according to the predicted level labels in the training phase and the corresponding user level labels.
  • the nine recall rates are compared, and the candidate probability threshold corresponding to the largest recall rate is selected as the target probability threshold.
  • the target probability threshold is determined by the maximum recall rate of the user level prediction model, which can more accurately predict the user level.
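  • The selection of the target probability threshold by maximum recall can be sketched as follows; the candidate thresholds 0.1 to 0.9 follow the example above, and counting TP/FN with the first level as the positive class is the assumption used here.

```python
import numpy as np

def recall(predicted_labels, true_labels, positive="first level") -> float:
    """Recall = TP / (TP + FN), with the first level as the positive class."""
    tp = sum(p == positive and t == positive for p, t in zip(predicted_labels, true_labels))
    fn = sum(p != positive and t == positive for p, t in zip(predicted_labels, true_labels))
    return tp / (tp + fn) if (tp + fn) else 0.0

def select_target_threshold(predicted_probabilities, true_labels) -> float:
    """Pick the candidate probability threshold whose recall is largest."""
    candidate_thresholds = np.arange(0.1, 1.0, 0.1)  # nine candidates, as in the example
    best_threshold, best_recall = float(candidate_thresholds[0]), -1.0
    for threshold in candidate_thresholds:
        predicted = ["first level" if p >= threshold else "second level"
                     for p in predicted_probabilities]
        current_recall = recall(predicted, true_labels)
        if current_recall > best_recall:
            best_threshold, best_recall = float(threshold), current_recall
    return best_threshold
```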
  • the above-mentioned user-level prediction model can be stored in a node of the blockchain.
  • The artificial intelligence-based user level prediction method described in this application can be applied to smart government affairs to promote the development of smart cities. After obtaining multiple data indicators of multiple users and the users' user level labels, this application calculates the saturation and correlation of each data indicator, and extracts multiple model-entry data indicators from the multiple data indicators according to the saturation and correlation. Because the data volume of the model-entry data indicators is much smaller than that of the original data indicators, the efficiency of training the user level prediction model on the model-entry data indicators is improved. The model-entry data indicators and the corresponding user level labels are input to the first input layer of the preset neural network framework; all nodes of the current fully connected layer are obtained and grouped according to the preset grouping rules, the target node in each group is determined, and the multiple target nodes of the current fully connected layer are used to perform the fully connected training of the next fully connected layer. Grouping reduces the number of nodes participating in the transfer in each fully connected layer, reduces the amount of calculation of the neural network, and further improves the efficiency of training the user level prediction model. Finally, the prediction level label output by the last output layer of the preset neural network framework is obtained, and the preset neural network framework is iteratively trained according to the prediction level label to obtain the user level prediction model; using the user level prediction model to predict the target user's level improves the accuracy of user level prediction.
  • Fig. 2 is a structural diagram of an artificial intelligence-based user level prediction device provided in the second embodiment of the present application.
  • the artificial intelligence-based user level prediction device 20 may include multiple functional modules composed of computer-readable instruction segments.
  • the computer-readable instructions of each program segment in the artificial intelligence-based user level prediction device 20 may be stored in the memory of the terminal and executed by at least one processor to execute (see FIG. 1 for details). The function of user level prediction.
  • the artificial intelligence-based user level prediction device 20 can be divided into multiple functional modules according to the functions it performs.
  • the functional modules may include: a calculation module 201, an extraction module 202, an input module 203, a grouping module 204, a training module 205, and a prediction module 206.
  • the module referred to in this application refers to a series of computer-readable instruction segments that can be executed by at least one processor and can complete fixed functions, and are stored in a memory. In this embodiment, the function of each module will be described in detail in subsequent embodiments.
  • the calculation module 201 is configured to obtain multiple data indicators of multiple users and obtain user level labels of the multiple users, and calculate the saturation of each data indicator and the correlation degree of each data indicator.
  • The multiple data indicators may include, but are not limited to: age, gender, income performance, event-tracking (buried-point) behavior, and the like.
  • Each user corresponds to multiple data indicators and user level labels.
  • the user may be an insurance agent or a company salesperson, etc.
  • the user level label may include: a first level and a second level.
  • the first level is higher than the second level.
  • For example, the first level is a diamond level and the second level is a non-diamond level.
  • the calculation module 201 calculating the saturation of each data indicator includes:
  • the saturation of the data index is calculated according to the missing rate and the repetition rate.
  • the characteristic value of the same data indicator for different users may be different or the same.
  • the feature value of the gender data indicator of some users is female, and the feature value of the gender data indicator of some users is male.
  • For the age data indicator, the feature values of the users' ages may be distributed between 18 and 60 years old.
  • First, the first product of the missing rate and a preset first weight is calculated, and the second product of the repetition rate and a preset second weight is calculated; finally, the sum of the first product and the second product is calculated to obtain the saturation of the data indicator.
  • the sum of the preset first weight and the preset second weight is 1, and the preset first weight is less than the preset second weight.
  • the calculation module 201 calculating the correlation degree of each data indicator includes:
  • the Pearson coefficient is determined as the correlation degree of the data index.
  • The extracting module 202 is configured to extract the multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation.
  • The extracting module 202 extracting the multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation includes:
  • the multiple second data indicators and the multiple high-level data indicators are used as the model entry data indicators.
  • If a data indicator's saturation is too low, the neural network cannot learn the characteristics of that data indicator. Therefore, removing the data indicators whose saturation is less than or equal to the preset saturation threshold and retaining the data indicators whose saturation is greater than the preset saturation threshold not only reduces the number of model-entry data indicators of the user level prediction model, but also removes useless data indicators and reduces noise data, which helps to improve the learning effect of the user level prediction model.
  • Data indicators such as the age or gender data indicators have no learning significance for the user level prediction model. Therefore, removing the data indicators whose correlation is greater than or equal to the preset correlation threshold and retaining the data indicators whose correlation is less than the preset correlation threshold not only further reduces the number of model-entry data indicators of the user level prediction model, but also further reduces noise data, which helps to improve the learning effect of the user level prediction model.
  • the input module 203 is configured to input multiple entry data indicators of the multiple users and the user level label to the first input layer in the preset neural network framework.
  • a convolutional neural network can be acquired as a preset neural network framework.
  • the convolutional neural network includes the first input layer, multiple fully connected layers, and the last output layer.
  • the first input layer is connected to the first fully connected layer
  • the first fully connected layer is connected to the second fully connected layer
  • the second fully connected layer is connected to the third fully connected layer, and so on; the last fully connected layer is connected to the last output layer.
  • the number of fully connected layers can be set according to actual conditions.
  • A data set can be constructed from the data pairs of the multiple users, and the data set is divided into a training data set and a test data set according to the users' entry time. The training data set is input to the first input layer in the preset neural network framework for learning and training.
  • the grouping module 204 is configured to obtain all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to preset grouping rules, determine the target node in each group, and use the current The multiple target nodes of the fully connected layer perform fully connected training on the next fully connected layer of the current layer until the fully connected training of the last fully connected layer is completed.
  • the number of nodes in the same group can be selected according to the calculation speed of the user level prediction model.
  • For target nodes, the calculation with the preset weights is performed and the result is passed to the next fully connected layer; for non-target nodes, no calculation with the preset weights is performed and nothing is passed to the next fully connected layer.
  • the grouping module 204 obtains all nodes of the current fully connected layer, groups all nodes of the current fully connected layer according to preset grouping rules, and determines the target in each group
  • the nodes include:
  • The trial grouping method corresponding to the smallest loss entropy is determined as the target grouping method, and the target grouping method is used to group all the nodes in the current fully connected layer.
  • Stepwise trial grouping can be carried out according to the positions of all nodes in each fully connected layer: the stepwise trial grouping method is first used to group the first fully connected layer, then the second fully connected layer, and so on, until the last fully connected layer is grouped.
  • In the first trial grouping, every two adjacent nodes are divided into the same group; in the second trial grouping, every three adjacent nodes are divided into the same group; and so on, until in the Nth trial grouping every N+1 adjacent nodes are divided into the same group.
  • the distance between the first node value of the current fully connected layer and the second node value of the next fully connected layer of the current fully connected layer is calculated to obtain the loss entropy.
  • A larger loss entropy indicates that more features are lost when the grouped nodes are passed to the next fully connected layer; a smaller loss entropy indicates that fewer features are lost when the grouped nodes are passed to the next fully connected layer.
  • Determining the trial grouping method corresponding to the smallest loss entropy as the target grouping method and using it to group all the nodes in the current fully connected layer effectively ensures that features are not lost after the grouped nodes are passed to the next fully connected layer; moreover, the number of target nodes participating in the transfer is greatly reduced compared with the number of nodes before grouping, which reduces the amount of calculation in each fully connected layer and improves the training speed and efficiency of the entire user level prediction model.
  • the training module 205 is configured to obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model.
  • the test pass rate is calculated according to the predicted level label and user level label.
  • If the test pass rate is less than or equal to a preset pass rate threshold, the data set is re-divided into a new training data set and a new test data set, the user level prediction model is retrained with the new training data set, and the test pass rate of the user level prediction model is retested with the new test data set, until the test pass rate is greater than the preset pass rate threshold, at which point the training of the user level prediction model is completed.
  • the prediction module 206 is configured to use the user level prediction model to predict the level of the target user.
  • The target data indicators corresponding to the model-entry data indicators are obtained from the multiple data indicators of the target user for the current month, and the feature values of the target data indicators are input into the user level prediction model to obtain a predicted probability value. The probability value is used to indicate the likelihood that the target user can be promoted to the first level in the month following the current month.
  • The prediction module 206 using the user level prediction model to perform level prediction on the target user includes: calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate; when the predicted probability is greater than the target probability threshold, determining that the target user is at the first level; when the predicted probability is less than or equal to the target probability threshold, determining that the target user is at the second level.
  • the target probability threshold is calculated according to the recall rate of the user level prediction model.
  • The higher the predicted probability output by the user level prediction model, the greater the probability that the target user can be promoted to the first level (for example, the more likely a promotion is); the lower the predicted probability output by the user level prediction model, the greater the probability that the target user cannot be promoted to the first level (for example, the less likely a promotion is).
  • the calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate includes:
  • the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;
  • the candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
  • The recall rate refers to the proportion of samples correctly predicted as positive among all samples that are actually positive, that is, Recall = TP / (TP + FN).
  • the process of determining the target probability threshold is as follows:
  • Taking the candidate probability threshold of 0.2 as an example, the predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.2: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.2, the predicted level label is the first level; if a predicted probability value output in the training phase is less than the candidate probability threshold 0.2, the predicted level label is the second level. The first recall rate is then calculated according to the predicted level labels in the training phase and the corresponding user level labels;
  • The predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.3: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.3, the predicted level label is the first level; if a predicted probability value output in the training phase is less than the candidate probability threshold 0.3, the predicted level label is the second level. The second recall rate is then calculated according to the predicted level labels in the training phase and the corresponding user level labels;
  • Similarly, the predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.9: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.9, the predicted level label is the first level; otherwise the predicted level label is the second level. The ninth recall rate is then calculated according to the predicted level labels in the training phase and the corresponding user level labels.
  • the nine recall rates are compared, and the candidate probability threshold corresponding to the largest recall rate is selected as the target probability threshold.
  • the target probability threshold is determined by the maximum recall rate of the user level prediction model, which can more accurately predict the user level.
  • the above-mentioned user-level prediction model can be stored in a node of the blockchain.
  • The artificial intelligence-based user level prediction device described in this application can be used in smart government affairs to promote the development of smart cities. After obtaining multiple data indicators of multiple users and the users' user level labels, this application calculates the saturation and correlation of each data indicator, and extracts multiple model-entry data indicators from the multiple data indicators according to the saturation and correlation. Because the data volume of the model-entry data indicators is much smaller than that of the original data indicators, the efficiency of training the user level prediction model on the model-entry data indicators is improved. The model-entry data indicators and the corresponding user level labels are input to the first input layer of the preset neural network framework; all nodes of the current fully connected layer are obtained and grouped according to the preset grouping rules, the target node in each group is determined, and the multiple target nodes of the current fully connected layer are used to perform the fully connected training of the next fully connected layer. Grouping reduces the number of nodes participating in the transfer in each fully connected layer, reduces the amount of calculation of the neural network, and further improves the efficiency of training the user level prediction model. Finally, the prediction level label output by the last output layer of the preset neural network framework is obtained, and the preset neural network framework is iteratively trained according to the prediction level label to obtain the user level prediction model; using the user level prediction model to predict the target user's level improves the accuracy of user level prediction.
  • the terminal 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
  • The structure of the terminal shown in FIG. 3 does not constitute a limitation on the embodiments of the present application; the terminal may have a bus-type structure or a star structure, and the terminal 3 may also include more or less hardware or software than shown in the figure, or a different component arrangement.
  • the terminal 3 is a terminal that can automatically perform numerical calculation and/or information processing in accordance with pre-set or stored instructions.
  • Its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, embedded devices, and the like.
  • The terminal 3 may also include client equipment. The client equipment includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device, for example, personal computers, tablets, smartphones, digital cameras, and the like.
  • terminal 3 is only an example. If other existing or future electronic products can be adapted to this application, they should also be included in the protection scope of this application and included here by reference.
  • computer-readable instructions are stored in the memory 31, and when the computer-readable instructions are executed by the at least one processor 32, all of the aforementioned artificial intelligence-based user level prediction methods are implemented. Or part of the steps.
  • The memory 31 includes volatile and non-volatile memory, such as random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM), and the like.
  • the computer-readable storage medium may be non-volatile or volatile.
  • the computer-readable storage medium may mainly include a storage program area and a storage data area, where the storage program area may store an operating system, an application program required by at least one function, etc.; the storage data area may store Data created by the use of nodes, etc.
  • the blockchain referred to in this application is a new application mode of computer technology such as distributed data storage, point-to-point transmission, consensus mechanism, and encryption algorithm.
  • A blockchain is essentially a decentralized database; it is a series of data blocks associated with one another using cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • The at least one processor 32 is the control core (control unit) of the terminal 3. It uses various interfaces and lines to connect the components of the entire terminal 3, and executes the various functions of the terminal 3 and processes data by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31.
  • the at least one processor 32 executes the computer-readable instructions stored in the memory, all or part of the steps of the artificial intelligence-based user level prediction method described in the embodiments of the present application are implemented; or the artificial intelligence-based All or part of the functions of the user level prediction device.
  • the at least one processor 32 may be composed of integrated circuits, for example, may be composed of a single packaged integrated circuit, or may be composed of multiple integrated circuits with the same function or different functions, including one or more central processing units. (Central Processing unit, CPU), a combination of microprocessors, digital processing chips, graphics processors, and various control chips.
  • the at least one communication bus 33 is configured to implement connection and communication between the memory 31 and the at least one processor 32 and the like.
  • The terminal 3 may also include a power source (such as a battery) for supplying power to the various components. Preferably, the power source may be logically connected to the at least one processor 32 through a power management device, so as to realize functions such as charge management, discharge management, and power consumption management through the power management device. The power source may also include one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and other components.
  • the terminal 3 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the above-mentioned integrated unit implemented in the form of a software function module may be stored in a computer readable storage medium.
  • The above-mentioned software function module is stored in a storage medium and includes several instructions to make a terminal (which may be a personal computer, a terminal, or a network device, etc.) or a processor execute all or part of the methods described in the embodiments of the present application.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of the modules is only a logical function division, and there may be other division methods in actual implementation.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, and may be located in one place or distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application relates to the technical field of artificial intelligence, and provides an artificial intelligence-based user rating prediction method and apparatus, a terminal, and a medium. The method comprises: calculating a saturation and a correlation of each data indicator; extracting a plurality of modelling data indicators from among the plurality of data indicators according to the saturations and the correlations, and inputting the modelling data indicators into a first input layer of a preset neural network framework; grouping all nodes of a current fully connected layer according to a preset grouping rule, determining a target node in each group, and using the plurality of target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer, until training of the last fully connected layer is complete; iteratively training the preset neural network framework according to a predicted rating label output by the last output layer of the preset neural network framework, to obtain a user rating prediction model; and using the user rating prediction model to perform rating prediction on a target user. The present application can increase the efficiency of user rating prediction and improve the accuracy of user rating prediction.

Description

Artificial intelligence-based user level prediction method, device, terminal and medium
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on October 13, 2020, with application number 202011092932.5 and entitled "Artificial intelligence-based user level prediction method, device, terminal, and medium", the entire content of which is incorporated into this application by reference.
Technical field
This application relates to the field of artificial intelligence technology, and in particular to a method, device, terminal, and medium for user level prediction based on artificial intelligence.
Background
Grade (level) is an indicator used by insurance companies to assess insurance agents; it is evaluated at the beginning of each month based on the agent's performance in the previous month. If it can be predicted from an insurance agent's performance in the current month whether the agent will be promoted by one level in the next month, this will not only increase the agent's enthusiasm, but also help the insurance company plan its overall sales targets and improve its overall performance.
In the prior art, a machine learning model is trained to predict whether a user can move up a level, for example, to predict whether a non-diamond-level insurance agent can be upgraded to a diamond-level insurance agent. However, the inventor found in the process of realizing this application that a user can have as many as tens of thousands of data indicators. Using so many data indicators to train the machine learning model leads to long training times and low user level prediction efficiency; moreover, some useless data indicators also reduce the learning accuracy of the machine learning model, resulting in poor user level prediction.
Summary of the invention
In view of the above, it is necessary to provide an artificial intelligence-based user level prediction method, device, terminal, and medium that can improve both the efficiency and the accuracy of user level prediction.
The first aspect of the present application provides an artificial intelligence-based user level prediction method, the method including:
acquiring multiple data indicators of multiple users and user level labels of the multiple users, and calculating the saturation and the correlation of each data indicator;
extracting multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation;
inputting the multiple model-entry data indicators of the multiple users and the user level labels to the first input layer in a preset neural network framework, wherein the preset neural network framework further includes multiple fully connected layers and a final output layer;
obtaining all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to preset grouping rules and determining the target node in each group, and using the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
obtaining the prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;
using the user level prediction model to perform level prediction on a target user.
The second aspect of the present application provides an artificial intelligence-based user level prediction device, the device including:
a calculation module, used to obtain multiple data indicators of multiple users and user level labels of the multiple users, and to calculate the saturation and the correlation of each data indicator;
an extracting module, used to extract multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation;
an input module, used to input the multiple model-entry data indicators of the multiple users and the user level labels to the first input layer in a preset neural network framework, wherein the preset neural network framework further includes multiple fully connected layers and a final output layer;
a grouping module, used to obtain all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to preset grouping rules and determine the target node in each group, and use the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
a training module, used to obtain the prediction level label output by the last output layer of the preset neural network framework, and to iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model;
a prediction module, used to use the user level prediction model to perform level prediction on a target user.
The third aspect of the present application provides a terminal, the terminal including a processor configured to implement the following steps when executing computer-readable instructions stored in a memory:
acquiring multiple data indicators of multiple users and user level labels of the multiple users, and calculating the saturation and the correlation of each data indicator;
extracting multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation;
inputting the multiple model-entry data indicators of the multiple users and the user level labels to the first input layer in a preset neural network framework, wherein the preset neural network framework further includes multiple fully connected layers and a final output layer;
obtaining all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to preset grouping rules and determining the target node in each group, and using the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
obtaining the prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;
using the user level prediction model to perform level prediction on a target user.
A fourth aspect of the present application provides a computer-readable storage medium storing computer-readable instructions, where the computer-readable instructions, when executed by a processor, implement the following steps:
obtaining multiple data indicators of multiple users and the user level labels of the multiple users, and calculating the saturation of each data indicator and the correlation of each data indicator;
extracting multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation;
inputting the multiple model-entry data indicators of the multiple users and the user level labels to the first input layer of a preset neural network framework, where the preset neural network framework further includes multiple fully connected layers and a last output layer;
obtaining all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to a preset grouping rule and determining a target node in each group, and using the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer, until the fully connected training of the last fully connected layer is completed;
obtaining a predicted level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the predicted level label to obtain a user level prediction model;
using the user level prediction model to perform level prediction on a target user.
In summary, the artificial intelligence-based user level prediction method, apparatus, terminal, and medium described in this application calculate the saturation and correlation of each data indicator after obtaining multiple data indicators of multiple users and the user level labels of those users, and extract multiple model-entry data indicators from the multiple data indicators according to the saturation and correlation. Because the data volume of the model-entry data indicators is far smaller than that of the original data indicators, the efficiency of training the user level prediction model on the model-entry data indicators is improved. By inputting the multiple model-entry data indicators and the corresponding user level labels into the first input layer of the preset neural network framework, obtaining all nodes of each fully connected layer in turn, grouping the nodes according to the preset grouping rule, determining the target node in each group, and using the multiple target nodes of each fully connected layer to perform the fully connected training of the next fully connected layer, the grouping reduces the number of nodes that participate in the transfer in each fully connected layer, reduces the amount of computation of the neural network, and further improves the efficiency of training the user level prediction model. Finally, the predicted level label output by the last output layer of the preset neural network framework is obtained, and the preset neural network framework is iteratively trained according to the predicted level label to obtain the user level prediction model; using the user level prediction model to perform level prediction on the target user can improve the accuracy of user level prediction.
Brief Description of the Drawings
Fig. 1 is a flowchart of the artificial intelligence-based user level prediction method provided in Embodiment 1 of the present application.
Fig. 2 is a structural diagram of the artificial intelligence-based user level prediction apparatus provided in Embodiment 2 of the present application.
Fig. 3 is a schematic structural diagram of the terminal provided in Embodiment 3 of the present application.
Detailed Description of the Embodiments
In order that the above objectives, features, and advantages of the present application can be understood more clearly, the present application is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments of the present application and the features in the embodiments can be combined with each other.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used in the specification of the present application are only for the purpose of describing specific embodiments and are not intended to limit the application.
The artificial intelligence-based user level prediction method is executed by a terminal, and accordingly the artificial intelligence-based user level prediction apparatus runs in the terminal.
Fig. 1 is a flowchart of the artificial intelligence-based user level prediction method provided in Embodiment 1 of the present application. The method specifically includes the following steps; depending on different requirements, the order of the steps in the flowchart can be changed and some steps can be omitted.
S11: Obtain multiple data indicators of multiple users and the user level labels of the multiple users, and calculate the saturation of each data indicator and the correlation of each data indicator.
The multiple data indicators may include, but are not limited to, age, gender, income performance, and event-tracking (buried-point) behavior. Each user corresponds to multiple data indicators and a user level label.
The users may be insurance agents, company salespeople, or the like, and the user level labels may include a first level and a second level, where the first level is higher than the second level. For example, the first level is the diamond level and the second level is the non-diamond level.
Because each user may have thousands or even tens of thousands of data indicators, training the user level prediction model on all of them would make the training time long. Therefore, the saturation and correlation of each data indicator are calculated to screen out the data indicators suitable for model entry, which shortens the training time of the user level prediction model, improves its training efficiency, and thus improves the efficiency of user level prediction.
In an optional embodiment, calculating the saturation of each data indicator includes:
traversing the multiple feature values of each data indicator;
calculating a first number of feature values, among the multiple feature values, that match a preset feature value, and calculating a missing rate of the data indicator according to the first number;
calculating a second number of feature values, among the multiple feature values, that share the same feature value, and calculating a repetition rate of the data indicator according to the second number;
calculating the saturation of the data indicator according to the missing rate and the repetition rate.
The feature value of the same data indicator may differ between users or may be the same. For the gender data indicator, for example, the feature value is female for some users and male for others. For the age data indicator, the users' feature values may be distributed between 18 and 60.
In principle, the number of feature values of one data indicator should equal the number of users. In practice, however, values may be missing or omitted when user data is collected, so the feature values of some data indicators are empty for some users. The missing rate of a data indicator is the ratio of the first number of empty feature values of that indicator to the number of users. The repetition rate of a data indicator is the ratio of the second number of identical feature values of that indicator to the number of users.
After the missing rate and the repetition rate of a data indicator are calculated, a first product of the missing rate and a preset first weight and a second product of the repetition rate and a preset second weight are calculated; the sum of the first product and the second product gives the saturation of the data indicator. The sum of the preset first weight and the preset second weight is 1, and the preset first weight is smaller than the preset second weight.
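As a non-limiting illustration, the saturation computation described above may be sketched in Python as follows. The weight values 0.3 and 0.7, the use of None as the preset (empty) feature value, and the counting of repeated values as "values that belong to a group of identical values" are assumptions made for illustration only, not values fixed by the method.

from collections import Counter

def indicator_saturation(values, preset_value=None, w_missing=0.3, w_repeat=0.7):
    # `values` holds one indicator's feature value for every user.
    n_users = len(values)
    # First number: feature values that match the preset (empty) value.
    first_count = sum(1 for v in values if v == preset_value)
    missing_rate = first_count / n_users
    # Second number: feature values that repeat an already-seen value (assumed reading).
    counts = Counter(v for v in values if v != preset_value)
    second_count = sum(c for c in counts.values() if c > 1)
    repetition_rate = second_count / n_users
    # Saturation as the weighted sum of the two rates, with w_missing + w_repeat == 1
    # and w_missing < w_repeat, as described above.
    return w_missing * missing_rate + w_repeat * repetition_rate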
In an optional embodiment, calculating the correlation of each data indicator includes:
generating a feature value vector from the multiple feature values of each data indicator;
generating a level label vector from the user level labels of the multiple historical users;
calculating the Pearson coefficient between the feature value vector and the level label vector;
determining the Pearson coefficient as the correlation of the data indicator.
Calculating the Pearson coefficient between the feature value vector of a data indicator and the level label vector indicates whether the data indicator is associated with the first level, which makes it convenient to select the data indicators suitable for model entry according to the Pearson coefficient.
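A minimal sketch of this correlation computation is given below; encoding the first level as 1 and the second level as 0 is an illustrative assumption, not part of the method itself.

import numpy as np

def indicator_correlation(feature_values, level_labels):
    # Pearson coefficient between one indicator's feature-value vector and
    # the user level-label vector (labels encoded as 1 / 0 for illustration).
    x = np.asarray(feature_values, dtype=float)
    y = np.asarray(level_labels, dtype=float)
    # np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry
    # is the Pearson coefficient of the two vectors.
    return float(np.corrcoef(x, y)[0, 1])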
S12: Extract multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation.
In an optional embodiment, extracting the multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation includes:
obtaining, from the multiple data indicators, multiple first data indicators whose saturation is greater than a preset saturation threshold;
obtaining, from the multiple first data indicators, multiple second data indicators whose correlation is less than a preset correlation threshold;
deriving multiple higher-order data indicators from the multiple second data indicators;
using the multiple second data indicators and the multiple higher-order data indicators as the model-entry data indicators.
When the missing rate of a data indicator is too high, the neural network cannot learn the features of that indicator. Therefore, removing the data indicators whose saturation is less than or equal to the preset saturation threshold and retaining those whose saturation is greater than the preset saturation threshold not only reduces the number of data indicators entering the user level prediction model but also removes useless indicators and reduces noisy data, which helps to improve the learning effect of the user level prediction model.
If, for example, the feature values of the age data indicator are all 20, or the feature values of the gender data indicator are all female or all male, the age or gender indicator has no learning value for the user level prediction model. Therefore, removing the data indicators whose correlation is greater than or equal to the preset correlation threshold and retaining those whose correlation is less than the preset correlation threshold can further reduce the number of model-entry data indicators and further reduce noisy data, which helps to improve the learning effect of the user level prediction model.
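The two-stage screening can be sketched as follows. The threshold values, the dictionary layout of `indicators`, and the pairwise-product derivation of higher-order indicators are placeholders chosen for illustration; the method itself does not fix how higher-order indicators are derived.

def select_model_entry_indicators(indicators, sat_threshold=0.5, corr_threshold=0.9):
    # `indicators` maps each indicator name to {"saturation": ..., "correlation": ...}.
    # Stage 1: keep indicators whose saturation exceeds the preset saturation threshold.
    first = {k: v for k, v in indicators.items() if v["saturation"] > sat_threshold}
    # Stage 2: keep indicators whose correlation is below the preset correlation threshold.
    second = {k: v for k, v in first.items() if v["correlation"] < corr_threshold}
    # Derive higher-order indicators from the retained ones (here: name pairs,
    # purely as an illustrative placeholder derivation).
    high_order = [f"{a}*{b}" for a in second for b in second if a < b]
    # The retained indicators plus the derived ones form the model-entry set.
    return list(second) + high_order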
S13: Input the multiple model-entry data indicators of the multiple users and the user level labels to the first input layer of the preset neural network framework.
A convolutional neural network may be used as the preset neural network framework. The framework includes a first input layer, multiple fully connected layers, and a last output layer. The first input layer is connected to the first fully connected layer, the first fully connected layer is connected to the second fully connected layer, the second fully connected layer is connected to the third fully connected layer, and so on; the last fully connected layer is connected to the last output layer. The number of fully connected layers can be set according to the actual situation.
Taking the feature values of each user's model-entry data indicators and the corresponding user level label as a data pair, a data set can be constructed from the data pairs of the multiple users, and the data set is split into a training data set and a test data set according to the users' onboarding time. The training data set is input to the first input layer of the preset neural network framework for learning and training.
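A PyTorch sketch of such a framework is shown below for illustration only: the number of model-entry indicators, the layer widths, the number of fully connected layers, and the sigmoid output are assumed values, not values prescribed by the method.

import torch.nn as nn

n_inputs = 32                 # number of model-entry data indicators (assumed)
hidden_sizes = [64, 64, 32]   # widths of the fully connected layers (assumed)

layers, prev = [], n_inputs
for width in hidden_sizes:
    layers += [nn.Linear(prev, width), nn.ReLU()]   # one fully connected layer
    prev = width
layers += [nn.Linear(prev, 1), nn.Sigmoid()]        # last output layer: promotion probability
preset_framework = nn.Sequential(*layers)

In this sketch the output layer produces a single probability value, which matches the later use of a probability threshold to decide between the first level and the second level.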
S14: Obtain all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to the preset grouping rule and determine the target node in each group, and use the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer, until the fully connected training of the last fully connected layer is completed.
The number of nodes in one group can be chosen according to the computation speed that the user level prediction model is required to reach: the faster the required computation speed, the more nodes are placed in the same group; the slower the required computation speed, the fewer nodes are placed in the same group. A target node is combined with the preset weights and passed to the next fully connected layer; a non-target node is not combined with the preset weights and is not passed to the next fully connected layer.
However, to avoid excessive grouping, which would reduce the prediction accuracy of the level prediction model, the number of nodes in one group needs to be determined by stepwise trials, so as to improve the training efficiency of the user level prediction model while maintaining its prediction accuracy.
In an optional embodiment, obtaining all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to the preset grouping rule, and determining the target node in each group includes:
obtaining the first node values of all nodes in the current fully connected layer;
grouping all nodes in the current fully connected layer by a stepwise trial grouping method, and determining, in each trial, the node with the largest first node value in each group as the target node;
obtaining the second node values of all nodes in the next fully connected layer after the fully connected training;
calculating, for each trial grouping, the loss entropy between the first node values and the second node values;
determining the trial grouping with the smallest loss entropy as the target trial grouping, and grouping all nodes in the current fully connected layer using the target trial grouping.
The stepwise trial grouping can be performed according to the positions of all nodes in each fully connected layer: the first fully connected layer is grouped by the stepwise trial grouping method, then the second fully connected layer, and so on, until the last fully connected layer is grouped.
The process of grouping one fully connected layer by the stepwise trial grouping method is as follows:
in the first trial grouping, every two adjacent nodes are placed in the same group;
in the second trial grouping, every three adjacent nodes are placed in the same group;
...;
in the Nth trial grouping, every N+1 adjacent nodes are placed in the same group.
After each trial grouping, the loss entropy is obtained by calculating the distance between the first node values of the current fully connected layer and the second node values of the next fully connected layer. A larger loss entropy indicates that passing the grouped nodes to the next fully connected layer loses more features; a smaller loss entropy indicates that passing the grouped nodes to the next fully connected layer loses fewer features.
Selecting the trial grouping with the smallest loss entropy as the target trial grouping, and grouping all nodes of the current fully connected layer with it, effectively ensures that features are not lost after the grouped nodes are passed to the next fully connected layer. At the same time, the number of target nodes participating in the transfer is far smaller than the number of nodes before grouping, which reduces the amount of computation of each fully connected layer and improves the training speed and efficiency of the entire user level prediction model.
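A rough sketch of the stepwise trial grouping is given below. The description above does not fix how the distance between two value vectors of different lengths is measured, so the absolute difference of means used here as the "loss entropy", the upper bound on the trial group size, and the `forward_fn` helper (which stands for the forward pass producing the next layer's second node values from the selected target nodes) are all assumptions for illustration.

import numpy as np

def stepwise_trial_grouping(first_values, forward_fn, max_group_size=5):
    # `first_values`: first node values of the current fully connected layer.
    first_values = np.asarray(first_values, dtype=float)
    best_size, best_loss = None, None
    for size in range(2, max_group_size + 1):   # Nth trial groups N+1 adjacent nodes
        # Split adjacent nodes into groups of `size`; keep, per group, the node
        # with the largest first node value as the target node.
        groups = [first_values[i:i + size] for i in range(0, len(first_values), size)]
        targets = np.array([float(g.max()) for g in groups])
        # Second node values of the next layer produced by the target-only pass.
        second_values = np.asarray(forward_fn(targets), dtype=float)
        # Simplified stand-in for the loss entropy between the two value sets.
        loss_entropy = abs(float(first_values.mean()) - float(second_values.mean()))
        if best_loss is None or loss_entropy < best_loss:
            best_size, best_loss = size, loss_entropy
    return best_size   # group size of the trial with the smallest loss entropy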
S15: Obtain the predicted level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the predicted level label to obtain the user level prediction model.
A test pass rate is calculated from the predicted level labels and the user level labels. When the test pass rate is smaller than a preset pass rate threshold, the data set is re-split into a new training data set and a new test data set, the user level prediction model is retrained on the new training data set, and its test pass rate is re-evaluated on the new test data set; this is repeated until the test pass rate is greater than the preset pass rate threshold, at which point the training of the user level prediction model is considered complete.
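The retraining loop can be sketched as follows; the helper names `split_dataset`, `build_model` and `model.predict_label`, the pass-rate threshold, and the round limit are assumptions introduced purely for illustration.

def train_until_pass(build_model, split_dataset, pass_threshold=0.8, max_rounds=10):
    model = None
    for _ in range(max_rounds):
        train_set, test_set = split_dataset()       # re-split by onboarding time
        model = build_model(train_set)              # grouped fully connected training
        # Test pass rate: predicted level labels compared with user level labels.
        correct = sum(1 for x, y in test_set if model.predict_label(x) == y)
        pass_rate = correct / len(test_set)
        if pass_rate > pass_threshold:              # training is considered complete
            break
    return model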
S16: Use the user level prediction model to perform level prediction on the target user.
To keep the input requirements of the user level prediction model consistent between the training stage and the prediction stage, target data indicators corresponding to the model-entry data indicators are obtained from the target user's multiple data indicators for the current month, and the feature values of the target data indicators are input to the user level prediction model to obtain a probability value. The probability value indicates how likely the target user is to be promoted to the first level in the month following the current month.
In an optional embodiment, using the user level prediction model to perform level prediction on the target user includes:
calculating the recall rate of the user level prediction model and determining a target probability threshold according to the recall rate;
obtaining the prediction indicators of the target user and inputting the prediction indicators into the user level prediction model to obtain a predicted probability;
comparing the predicted probability with the target probability threshold;
when the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;
when the predicted probability is less than or equal to the target probability threshold, determining that the target user is at the second level.
The target probability threshold is calculated from the recall rate of the user level prediction model. In the prediction stage, the larger the predicted probability output by the user level prediction model, the more likely the target user is to be promoted to the first level (for example, the more likely the user is to be promoted to the diamond level); the smaller the predicted probability, the less likely the target user is to be promoted to the first level.
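A minimal sketch of this decision step follows; the helper name `model.predict_probability` is an assumed placeholder for the trained model's forward pass.

def predict_level(model, target_indicators, target_threshold):
    # Probability that the target user is promoted to the first level next month.
    probability = model.predict_probability(target_indicators)
    # Compare with the target probability threshold derived from the recall rate.
    return "first level" if probability > target_threshold else "second level"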
In an optional embodiment, calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate includes:
defining multiple candidate probability thresholds using a difference (fixed-step) method;
calculating, for each candidate probability threshold, the recall rate according to the predicted level labels output by the user level prediction model and the corresponding user level labels;
determining the candidate probability threshold corresponding to the maximum recall rate as the target probability threshold.
The recall rate is the proportion of samples correctly predicted as positive among all samples that are actually positive. It is calculated as Recall = TP / (TP + FN), where TP is the number of positive samples predicted as positive and FN is the number of positive samples predicted as negative. The higher the recall rate, the larger the proportion of positives correctly predicted; the lower the recall rate, the smaller that proportion.
For example, assuming that the defined candidate probability thresholds are 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8 and 0.9, the target probability threshold is determined as follows:
first, for the candidate probability threshold 0.2, the predicted probability values output by the user level prediction model in the training stage are compared with 0.2; a predicted probability value greater than or equal to 0.2 yields a predicted level label of the first level, and a predicted probability value smaller than 0.2 yields a predicted level label of the second level; a first recall rate is then calculated from the predicted level labels of the training stage and the corresponding user level labels;
next, for the candidate probability threshold 0.3, the predicted probability values output in the training stage are compared with 0.3 in the same way, and a second recall rate is calculated from the resulting predicted level labels and the corresponding user level labels;
and so on;
finally, for the candidate probability threshold 0.9, the predicted probability values output in the training stage are compared with 0.9 in the same way, and a ninth recall rate is calculated from the resulting predicted level labels and the corresponding user level labels.
The nine recall rates are then compared, and the candidate probability threshold corresponding to the largest recall rate is selected as the target probability threshold.
Determining the target probability threshold through the maximum recall rate of the user level prediction model makes it more likely that the user's level is predicted correctly.
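The threshold search described above may be sketched as follows; encoding the first level as 1 and the second level as 0 is an illustrative assumption.

def select_target_threshold(probabilities, labels,
                            candidates=(0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    # `probabilities`: training-stage outputs of the user level prediction model.
    # `labels`: the corresponding user level labels (1 = first level, 0 = second level).
    best_threshold, best_recall = None, -1.0
    for t in candidates:
        predicted = [1 if p >= t else 0 for p in probabilities]
        tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
        fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
        recall = tp / (tp + fn) if (tp + fn) else 0.0   # Recall = TP / (TP + FN)
        if recall > best_recall:
            best_threshold, best_recall = t, recall
    return best_threshold   # candidate threshold with the maximum recall rate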
It should be emphasized that, to further ensure the privacy and security of the user level prediction model, the user level prediction model can be stored in a node of a blockchain.
The artificial intelligence-based user level prediction method described in this application can be applied to smart government affairs to promote the development of smart cities. After obtaining multiple data indicators of multiple users and the user level labels of those users, the application calculates the saturation and correlation of each data indicator and extracts multiple model-entry data indicators from the multiple data indicators according to the saturation and correlation. Because the data volume of the model-entry data indicators is far smaller than that of the original data indicators, the efficiency of training the user level prediction model on the model-entry data indicators is improved. The multiple model-entry data indicators and the corresponding user level labels are input to the first input layer of the preset neural network framework; all nodes of the current fully connected layer are obtained, grouped according to the preset grouping rule, and the target node in each group is determined; the multiple target nodes of the current fully connected layer are used to perform the fully connected training of the next fully connected layer, until the fully connected training of the last fully connected layer is completed. The grouping reduces the number of nodes participating in the transfer in each fully connected layer, reduces the amount of computation of the neural network, and further improves the efficiency of training the user level prediction model. Finally, the predicted level label output by the last output layer of the preset neural network framework is obtained, and the preset neural network framework is iteratively trained according to the predicted level label to obtain the user level prediction model; using the user level prediction model to perform level prediction on the target user can improve the accuracy of user level prediction.
Fig. 2 is a structural diagram of the artificial intelligence-based user level prediction apparatus provided in Embodiment 2 of the present application.
In some embodiments, the artificial intelligence-based user level prediction apparatus 20 may include multiple functional modules composed of computer-readable instruction segments. The computer-readable instructions of the program segments in the artificial intelligence-based user level prediction apparatus 20 may be stored in the memory of the terminal and executed by at least one processor to perform the artificial intelligence-based user level prediction function (described in detail with reference to Fig. 1).
In this embodiment, the artificial intelligence-based user level prediction apparatus 20 can be divided into multiple functional modules according to the functions it performs. The functional modules may include a calculation module 201, an extraction module 202, an input module 203, a grouping module 204, a training module 205, and a prediction module 206. A module referred to in this application is a series of computer-readable instruction segments that can be executed by at least one processor and can complete a fixed function, and that are stored in a memory. The functions of the modules are described in detail in the following embodiments.
The calculation module 201 is configured to obtain multiple data indicators of multiple users and the user level labels of the multiple users, and to calculate the saturation of each data indicator and the correlation of each data indicator.
The multiple data indicators may include, but are not limited to, age, gender, income performance, and event-tracking (buried-point) behavior. Each user corresponds to multiple data indicators and a user level label.
The users may be insurance agents, company salespeople, or the like, and the user level labels may include a first level and a second level, where the first level is higher than the second level. For example, the first level is the diamond level and the second level is the non-diamond level.
Because each user may have thousands or even tens of thousands of data indicators, training the user level prediction model on all of them would make the training time long. Therefore, the saturation and correlation of each data indicator are calculated to screen out the data indicators suitable for model entry, which shortens the training time of the user level prediction model, improves its training efficiency, and thus improves the efficiency of user level prediction.
In an optional embodiment, the calculation module 201 calculating the saturation of each data indicator includes:
traversing the multiple feature values of each data indicator;
calculating a first number of feature values, among the multiple feature values, that match a preset feature value, and calculating a missing rate of the data indicator according to the first number;
calculating a second number of feature values, among the multiple feature values, that share the same feature value, and calculating a repetition rate of the data indicator according to the second number;
calculating the saturation of the data indicator according to the missing rate and the repetition rate.
The feature value of the same data indicator may differ between users or may be the same. For the gender data indicator, for example, the feature value is female for some users and male for others. For the age data indicator, the users' feature values may be distributed between 18 and 60.
In principle, the number of feature values of one data indicator should equal the number of users. In practice, however, values may be missing or omitted when user data is collected, so the feature values of some data indicators are empty for some users. The missing rate of a data indicator is the ratio of the first number of empty feature values of that indicator to the number of users. The repetition rate of a data indicator is the ratio of the second number of identical feature values of that indicator to the number of users.
After the missing rate and the repetition rate of a data indicator are calculated, a first product of the missing rate and a preset first weight and a second product of the repetition rate and a preset second weight are calculated; the sum of the first product and the second product gives the saturation of the data indicator. The sum of the preset first weight and the preset second weight is 1, and the preset first weight is smaller than the preset second weight.
In an optional embodiment, the calculation module 201 calculating the correlation of each data indicator includes:
generating a feature value vector from the multiple feature values of each data indicator;
generating a level label vector from the user level labels of the multiple historical users;
calculating the Pearson coefficient between the feature value vector and the level label vector;
determining the Pearson coefficient as the correlation of the data indicator.
Calculating the Pearson coefficient between the feature value vector of a data indicator and the level label vector indicates whether the data indicator is associated with the first level, which makes it convenient to select the data indicators suitable for model entry according to the Pearson coefficient.
The extraction module 202 is configured to extract multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation.
In an optional embodiment, the extraction module 202 extracting the multiple model-entry data indicators from the multiple data indicators of each user according to the saturation and the correlation includes:
obtaining, from the multiple data indicators, multiple first data indicators whose saturation is greater than a preset saturation threshold;
obtaining, from the multiple first data indicators, multiple second data indicators whose correlation is less than a preset correlation threshold;
deriving multiple higher-order data indicators from the multiple second data indicators;
using the multiple second data indicators and the multiple higher-order data indicators as the model-entry data indicators.
When the missing rate of a data indicator is too high, the neural network cannot learn the features of that indicator. Therefore, removing the data indicators whose saturation is less than or equal to the preset saturation threshold and retaining those whose saturation is greater than the preset saturation threshold not only reduces the number of data indicators entering the user level prediction model but also removes useless indicators and reduces noisy data, which helps to improve the learning effect of the user level prediction model.
If, for example, the feature values of the age data indicator are all 20, or the feature values of the gender data indicator are all female or all male, the age or gender indicator has no learning value for the user level prediction model. Therefore, removing the data indicators whose correlation is greater than or equal to the preset correlation threshold and retaining those whose correlation is less than the preset correlation threshold can further reduce the number of model-entry data indicators and further reduce noisy data, which helps to improve the learning effect of the user level prediction model.
The input module 203 is configured to input the multiple model-entry data indicators of the multiple users and the user level labels to the first input layer of the preset neural network framework.
A convolutional neural network may be used as the preset neural network framework. The framework includes a first input layer, multiple fully connected layers, and a last output layer. The first input layer is connected to the first fully connected layer, the first fully connected layer is connected to the second fully connected layer, the second fully connected layer is connected to the third fully connected layer, and so on; the last fully connected layer is connected to the last output layer. The number of fully connected layers can be set according to the actual situation.
Taking the feature values of each user's model-entry data indicators and the corresponding user level label as a data pair, a data set can be constructed from the data pairs of the multiple users, and the data set is split into a training data set and a test data set according to the users' onboarding time. The training data set is input to the first input layer of the preset neural network framework for learning and training.
The grouping module 204 is configured to obtain all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to the preset grouping rule and determine the target node in each group, and use the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer, until the fully connected training of the last fully connected layer is completed.
The number of nodes in one group can be chosen according to the computation speed that the user level prediction model is required to reach: the faster the required computation speed, the more nodes are placed in the same group; the slower the required computation speed, the fewer nodes are placed in the same group. A target node is combined with the preset weights and passed to the next fully connected layer; a non-target node is not combined with the preset weights and is not passed to the next fully connected layer.
However, to avoid excessive grouping, which would reduce the prediction accuracy of the level prediction model, the number of nodes in one group needs to be determined by stepwise trials, so as to improve the training efficiency of the user level prediction model while maintaining its prediction accuracy.
In an optional embodiment, the grouping module 204 obtaining all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to the preset grouping rule, and determining the target node in each group includes:
obtaining the first node values of all nodes in the current fully connected layer;
grouping all nodes in the current fully connected layer by a stepwise trial grouping method, and determining, in each trial, the node with the largest first node value in each group as the target node;
obtaining the second node values of all nodes in the next fully connected layer after the fully connected training;
calculating, for each trial grouping, the loss entropy between the first node values and the second node values;
determining the trial grouping with the smallest loss entropy as the target trial grouping, and grouping all nodes in the current fully connected layer using the target trial grouping.
The stepwise trial grouping can be performed according to the positions of all nodes in each fully connected layer: the first fully connected layer is grouped by the stepwise trial grouping method, then the second fully connected layer, and so on, until the last fully connected layer is grouped.
The process of grouping one fully connected layer by the stepwise trial grouping method is as follows:
in the first trial grouping, every two adjacent nodes are placed in the same group;
in the second trial grouping, every three adjacent nodes are placed in the same group;
...;
in the Nth trial grouping, every N+1 adjacent nodes are placed in the same group.
After each trial grouping, the loss entropy is obtained by calculating the distance between the first node values of the current fully connected layer and the second node values of the next fully connected layer. A larger loss entropy indicates that passing the grouped nodes to the next fully connected layer loses more features; a smaller loss entropy indicates that passing the grouped nodes to the next fully connected layer loses fewer features.
Selecting the trial grouping with the smallest loss entropy as the target trial grouping, and grouping all nodes of the current fully connected layer with it, effectively ensures that features are not lost after the grouped nodes are passed to the next fully connected layer. At the same time, the number of target nodes participating in the transfer is far smaller than the number of nodes before grouping, which reduces the amount of computation of each fully connected layer and improves the training speed and efficiency of the entire user level prediction model.
The training module 205 is configured to obtain the predicted level label output by the last output layer of the preset neural network framework, and to iteratively train the preset neural network framework according to the predicted level label to obtain the user level prediction model.
A test pass rate is calculated from the predicted level labels and the user level labels. When the test pass rate is smaller than a preset pass rate threshold, the data set is re-split into a new training data set and a new test data set, the user level prediction model is retrained on the new training data set, and its test pass rate is re-evaluated on the new test data set; this is repeated until the test pass rate is greater than the preset pass rate threshold, at which point the training of the user level prediction model is considered complete.
The prediction module 206 is configured to use the user level prediction model to perform level prediction on the target user.
To keep the input requirements of the user level prediction model consistent between the training stage and the prediction stage, target data indicators corresponding to the model-entry data indicators are obtained from the target user's multiple data indicators for the current month, and the feature values of the target data indicators are input to the user level prediction model to obtain a probability value. The probability value indicates how likely the target user is to be promoted to the first level in the month following the current month.
In an optional embodiment, the prediction module 206 using the user level prediction model to perform level prediction on the target user includes:
calculating the recall rate of the user level prediction model and determining a target probability threshold according to the recall rate;
obtaining the prediction indicators of the target user and inputting the prediction indicators into the user level prediction model to obtain a predicted probability;
comparing the predicted probability with the target probability threshold;
when the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;
when the predicted probability is less than or equal to the target probability threshold, determining that the target user is at the second level.
The target probability threshold is calculated from the recall rate of the user level prediction model. In the prediction stage, the larger the predicted probability output by the user level prediction model, the more likely the target user is to be promoted to the first level (for example, the more likely the user is to be promoted to the diamond level); the smaller the predicted probability, the less likely the target user is to be promoted to the first level.
In an optional embodiment, calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate includes:
defining multiple candidate probability thresholds using a difference (fixed-step) method;
calculating, for each candidate probability threshold, the recall rate according to the predicted level labels output by the user level prediction model and the corresponding user level labels;
determining the candidate probability threshold corresponding to the maximum recall rate as the target probability threshold.
The recall rate is the proportion of samples correctly predicted as positive among all samples that are actually positive. It is calculated as Recall = TP / (TP + FN), where TP is the number of positive samples predicted as positive and FN is the number of positive samples predicted as negative. The higher the recall rate, the larger the proportion of positives correctly predicted; the lower the recall rate, the smaller that proportion.
示例性地，假设定义的多个候选概率阈值为0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9，则确定目标概率阈值的过程如下：Exemplarily, assuming that the defined candidate probability thresholds are 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9, the process of determining the target probability threshold is as follows:
首先，针对候选概率阈值0.2，将用户等级预测模型在训练阶段输出的预测概率值与候选概率阈值0.2进行比较：若训练阶段输出的预测概率值大于或者等于候选概率阈值0.2，则得到预测等级标签为第一等级；若训练阶段输出的预测概率值小于候选概率阈值0.2，则得到预测等级标签为第二等级。根据训练阶段的预测等级标签及对应的用户等级标签计算得到第一召回率；First, for the candidate probability threshold 0.2, the predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.2: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.2, the predicted level label is the first level; if it is less than the candidate probability threshold 0.2, the predicted level label is the second level. The first recall rate is then calculated from the predicted level labels of the training phase and the corresponding user level labels;
接着针对候选概率阈值0.3，将用户等级预测模型在训练阶段输出的预测概率值与候选概率阈值0.3进行比较：若训练阶段输出的预测概率值大于或者等于候选概率阈值0.3，则得到预测等级标签为第一等级；若训练阶段输出的预测概率值小于候选概率阈值0.3，则得到预测等级标签为第二等级。根据训练阶段的预测等级标签及对应的用户等级标签计算得到第二召回率；Then, for the candidate probability threshold 0.3, the predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.3: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.3, the predicted level label is the first level; if it is less than the candidate probability threshold 0.3, the predicted level label is the second level. The second recall rate is then calculated from the predicted level labels of the training phase and the corresponding user level labels;
以此类推;And so on;
接着针对候选概率阈值0.9，将用户等级预测模型在训练阶段输出的预测概率值与候选概率阈值0.9进行比较：若训练阶段输出的预测概率值大于或者等于候选概率阈值0.9，则得到预测等级标签为第一等级；若训练阶段输出的预测概率值小于候选概率阈值0.9，则得到预测等级标签为第二等级。根据训练阶段的预测等级标签及对应的用户等级标签计算得到第八召回率。Then, for the candidate probability threshold 0.9, the predicted probability values output by the user level prediction model in the training phase are compared with the candidate probability threshold 0.9: if a predicted probability value output in the training phase is greater than or equal to the candidate probability threshold 0.9, the predicted level label is the first level; if it is less than the candidate probability threshold 0.9, the predicted level label is the second level. The eighth recall rate is then calculated from the predicted level labels of the training phase and the corresponding user level labels.
最后将这八个召回率进行比较，选取最大的召回率对应的候选概率阈值为目标概率阈值。Finally, the eight recall rates are compared, and the candidate probability threshold corresponding to the largest recall rate is selected as the target probability threshold.
通过用户等级预测模型的最大召回率来确定目标概率阈值，更加能够正确地预测出用户的等级。Determining the target probability threshold from the maximum recall rate of the user level prediction model makes it possible to predict the user's level more accurately.
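Bringing the example above together, a hedged sketch of sweeping the candidate thresholds and keeping the one with the highest recall; the candidate values and variable names are illustrative, and recall() is the helper sketched earlier:

    def select_target_threshold(train_probabilities, train_labels,
                                candidates=(0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
        """Binarize the training-phase probabilities at each candidate threshold,
        compute recall against the user level labels, and keep the best threshold."""
        best_threshold, best_recall = None, -1.0
        for threshold in candidates:
            predicted = [1 if p >= threshold else 0 for p in train_probabilities]
            r = recall(train_labels, predicted)
            if r > best_recall:
                best_threshold, best_recall = threshold, r
        return best_threshold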
需要强调的是,为进一步保证上述用户等级预测模型的私密性和安全性,上述用户等级预测模型可存储于区块链的节点中。It should be emphasized that, in order to further ensure the privacy and security of the above-mentioned user-level prediction model, the above-mentioned user-level prediction model can be stored in a node of the blockchain.
本申请所述的基于人工智能的用户等级预测装置，可应用于智慧政务中，推动智慧城市的发展。本申请在获取多个用户的多个数据指标及获取所述多个用户的用户等级标签之后，计算每个数据指标的饱和度及相关度，并根据饱和度和相关度从多个数据指标中提取出多个入模数据指标，由于多个入模数据指标的数据量远小于多个数据指标，使得基于多个入模数据指标训练用户等级预测模型的效率得以提高；通过将多个入模数据指标及对应的用户等级标签输入至预设神经网络框架中的第一层输入层，获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点，使用所述当前层全连接层的多个所述目标节点对所述当前层的下一层全连接层进行全连接训练，直至完成对最后一层全连接层的全连接训练，通过分组减少了每层全连接层中参与传递的节点的数量，减少了神经网络的计算量，进一步提高了训练用户等级预测模型的效率；最后获取预设神经网络框架的最后一层输出层输出的预测等级标签，根据预测等级标签迭代训练预设神经网络框架得到用户等级预测模型；使用用户等级预测模型对目标用户进行等级预测，能够提高用户等级预测的准确率。The artificial intelligence-based user level prediction apparatus described in this application can be applied to smart government affairs to promote the development of smart cities. After obtaining multiple data indicators of multiple users and the user level labels of those users, this application calculates the saturation and the correlation of each data indicator, and extracts multiple in-model data indicators from the multiple data indicators according to the saturation and the correlation; since the data volume of the in-model data indicators is far smaller than that of the original data indicators, the efficiency of training the user level prediction model on the in-model data indicators is improved. The in-model data indicators and the corresponding user level labels are input into the first input layer of the preset neural network framework; all nodes of the current fully connected layer are obtained, grouped according to the preset grouping rule, and the target node of each group is determined; the target nodes of the current fully connected layer are then used to perform fully connected training on the next fully connected layer, until the fully connected training of the last fully connected layer is completed. Grouping reduces the number of nodes participating in forward propagation in each fully connected layer, reduces the computation of the neural network, and further improves the efficiency of training the user level prediction model. Finally, the predicted level labels output by the last output layer of the preset neural network framework are obtained, and the preset neural network framework is iteratively trained according to the predicted level labels to obtain the user level prediction model; using the user level prediction model to predict the level of a target user improves the accuracy of user level prediction.
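To make the node-grouping idea in the preceding summary concrete, the following numpy sketch shows one grouped fully connected step; the fixed group size and the max-activation rule are assumptions standing in for the application's preset grouping rule, not its definitive implementation:

    import numpy as np

    def grouped_fc_step(activations, weights, bias, group_size=4):
        """Keep only the highest-valued node in each group of the current fully
        connected layer, then propagate just those target nodes to the next layer,
        reducing the number of nodes taking part in the forward pass.

        activations: 1-D numpy array of current-layer node values
        weights:     (current_layer_size, next_layer_size) weight matrix
        bias:        1-D numpy array of next-layer biases
        """
        targets = []
        for start in range(0, len(activations), group_size):
            group = activations[start:start + group_size]
            targets.append(start + int(np.argmax(group)))          # target node per group
        reduced = activations[targets]                              # only target nodes remain
        return np.maximum(reduced @ weights[targets, :] + bias, 0)  # ReLU into the next layer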
参阅图3所示,为本申请实施例三提供的终端的结构示意图。在本申请较佳实施例中,所述终端3包括存储器31、至少一个处理器32、至少一条通信总线33及收发器34。Refer to FIG. 3, which is a schematic structural diagram of a terminal provided in Embodiment 3 of this application. In a preferred embodiment of the present application, the terminal 3 includes a memory 31, at least one processor 32, at least one communication bus 33, and a transceiver 34.
本领域技术人员应该了解，图3示出的终端的结构并不构成本申请实施例的限定，既可以是总线型结构，也可以是星形结构，所述终端3还可以包括比图示更多或更少的其他硬件或者软件，或者不同的部件布置。Those skilled in the art should understand that the structure of the terminal shown in FIG. 3 does not constitute a limitation of the embodiments of the present application; it may be a bus-type structure or a star-type structure, and the terminal 3 may further include more or less hardware or software than illustrated, or a different arrangement of components.
在一些实施例中，所述终端3是一种能够按照事先设定或存储的指令，自动进行数值计算和/或信息处理的终端，其硬件包括但不限于微处理器、专用集成电路、可编程门阵列、数字处理器及嵌入式设备等。所述终端3还可包括客户设备，所述客户设备包括但不限于任何一种可与客户通过键盘、鼠标、遥控器、触摸板或声控设备等方式进行人机交互的电子产品，例如个人计算机、平板电脑、智能手机、数码相机等。In some embodiments, the terminal 3 is a terminal capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits, programmable gate arrays, digital processors, and embedded devices. The terminal 3 may further include client equipment, which includes, but is not limited to, any electronic product that can perform human-computer interaction with a client through a keyboard, a mouse, a remote control, a touch panel, or a voice control device, for example, a personal computer, a tablet computer, a smart phone, or a digital camera.
需要说明的是,所述终端3仅为举例,其他现有的或今后可能出现的电子产品如可适应于本申请,也应包含在本申请的保护范围以内,并以引用方式包含于此。It should be noted that the terminal 3 is only an example. If other existing or future electronic products can be adapted to this application, they should also be included in the protection scope of this application and included here by reference.
在一些实施例中，所述存储器31中存储有计算机可读指令，所述计算机可读指令被所述至少一个处理器32执行时实现如所述的基于人工智能的用户等级预测方法中的全部或者部分步骤。所述存储器31包括易失性和非易失性存储器，例如随机存取存储器(Random Access Memory，RAM)、只读存储器(Read-Only Memory，ROM)、可编程只读存储器(Programmable Read-Only Memory，PROM)、可擦除可编程只读存储器(Erasable Programmable Read-Only Memory，EPROM)、一次可编程只读存储器(One-time Programmable Read-Only Memory，OTPROM)、电子擦除式可复写只读存储器(Electrically-Erasable Programmable Read-Only Memory，EEPROM)、只读光盘(Compact Disc Read-Only Memory，CD-ROM)或其他光盘存储器、磁盘存储器、磁带存储器、或者能够用于携带或存储数据的计算机可读的存储介质。所述计算机可读存储介质可以是非易失性的，也可以是易失性的。In some embodiments, the memory 31 stores computer-readable instructions which, when executed by the at least one processor 32, implement all or part of the steps of the artificial intelligence-based user level prediction method described herein. The memory 31 includes volatile and non-volatile memory, such as random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), one-time programmable read-only memory (OTPROM), electrically-erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM) or other optical disk storage, magnetic disk storage, magnetic tape storage, or any other computer-readable storage medium that can be used to carry or store data. The computer-readable storage medium may be non-volatile or volatile.
进一步地，所述计算机可读存储介质可主要包括存储程序区和存储数据区，其中，存储程序区可存储操作系统、至少一个功能所需的应用程序等；存储数据区可存储根据区块链节点的使用所创建的数据等。Further, the computer-readable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function, and the like; the data storage area may store data created according to the use of blockchain nodes, and the like.
本申请所指区块链是分布式数据存储、点对点传输、共识机制、加密算法等计算机技术的新型应用模式。区块链（Blockchain），本质上是一个去中心化的数据库，是一串使用密码学方法相关联产生的数据块，每一个数据块中包含了一批次网络交易的信息，用于验证其信息的有效性（防伪）和生成下一个区块。区块链可以包括区块链底层平台、平台产品服务层以及应用服务层等。The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database, a chain of data blocks generated in association with one another using cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include an underlying blockchain platform, a platform product service layer, an application service layer, and the like.
在一些实施例中，所述至少一个处理器32是所述终端3的控制核心(Control Unit)，利用各种接口和线路连接整个终端3的各个部件，通过运行或执行存储在所述存储器31内的程序或者模块，以及调用存储在所述存储器31内的数据，以执行终端3的各种功能和处理数据。例如，所述至少一个处理器32执行所述存储器中存储的计算机可读指令时实现本申请实施例中所述的基于人工智能的用户等级预测方法的全部或者部分步骤；或者实现基于人工智能的用户等级预测装置的全部或者部分功能。所述至少一个处理器32可以由集成电路组成，例如可以由单个封装的集成电路所组成，也可以是由多个相同功能或不同功能封装的集成电路所组成，包括一个或者多个中央处理器(Central Processing Unit，CPU)、微处理器、数字处理芯片、图形处理器及各种控制芯片的组合等。In some embodiments, the at least one processor 32 is the control unit of the terminal 3. It connects the various components of the entire terminal 3 through various interfaces and lines, and executes the various functions of the terminal 3 and processes data by running or executing the programs or modules stored in the memory 31 and calling the data stored in the memory 31. For example, when the at least one processor 32 executes the computer-readable instructions stored in the memory, it implements all or part of the steps of the artificial intelligence-based user level prediction method described in the embodiments of this application, or all or part of the functions of the artificial intelligence-based user level prediction apparatus. The at least one processor 32 may be composed of integrated circuits, for example a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPU), microprocessors, digital processing chips, graphics processors, and combinations of various control chips.
在一些实施例中,所述至少一条通信总线33被设置为实现所述存储器31以及所述至少一个处理器32等之间的连接通信。In some embodiments, the at least one communication bus 33 is configured to implement connection and communication between the memory 31 and the at least one processor 32 and the like.
尽管未示出，所述终端3还可以包括给各个部件供电的电源（比如电池），优选的，电源可以通过电源管理装置与所述至少一个处理器32逻辑相连，从而通过电源管理装置实现管理充电、放电、以及功耗管理等功能。电源还可以包括一个或一个以上的直流或交流电源、再充电装置、电源故障检测电路、电源转换器或者逆变器、电源状态指示器等任意组件。所述终端3还可以包括多种传感器、蓝牙模块、Wi-Fi模块等，在此不再赘述。Although not shown, the terminal 3 may further include a power supply (such as a battery) for supplying power to the various components. Preferably, the power supply may be logically connected to the at least one processor 32 through a power management device, so that functions such as charging, discharging, and power consumption management are realized through the power management device. The power supply may also include one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, power status indicators, and any other components. The terminal 3 may further include various sensors, a Bluetooth module, a Wi-Fi module, and the like, which will not be described in detail here.
上述以软件功能模块的形式实现的集成的单元,可以存储在一个计算机可读取存储介质中。上述软件功能模块存储在一个存储介质中,包括若干指令用以使得一台终端(可以是个人计算机,终端,或者网络设备等)或处理器(processor)执行本申请各个实施例所述方法的部分。The above-mentioned integrated unit implemented in the form of a software function module may be stored in a computer readable storage medium. The above-mentioned software function module is stored in a storage medium and includes several instructions to make a terminal (which may be a personal computer, a terminal, or a network device, etc.) or a processor execute part of the method described in each embodiment of the present application .
在本申请所提供的几个实施例中,应该理解到,所揭露的装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述模块的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式。In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. For example, the division of the modules is only a logical function division, and there may be other division methods in actual implementation.
所述作为分离部件说明的模块可以是或者也可以不是物理上分开的,作为模块显示的部件可以是或者也可以不是物理单元,既可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, and may be located in one place or distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
另外,在本申请各个实施例中的各功能模块可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能模块的形式实现。In addition, the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.
对于本领域技术人员而言，显然本申请不限于上述示范性实施例的细节，而且在不背离本申请的精神或基本特征的情况下，能够以其他的具体形式实现本申请。因此，无论从哪一点来看，均应将实施例看作是示范性的，而且是非限制性的，本申请的范围由所附权利要求而不是上述说明限定，因此旨在将落在权利要求的等同要件的含义和范围内的所有变化涵括在本申请内。不应将权利要求中的任何附图标记视为限制所涉及的权利要求。此外，显然"包括"一词不排除其他单元或步骤，单数不排除复数。本发明中陈述的多个单元或装置也可以由一个单元或装置通过软件或者硬件来实现。第一，第二等词语用来表示名称，而并不表示任何特定的顺序。For those skilled in the art, it is obvious that this application is not limited to the details of the above exemplary embodiments, and that this application can be implemented in other specific forms without departing from the spirit or essential characteristics of the application. Therefore, from whatever point of view, the embodiments should be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than the above description, and it is therefore intended that all changes falling within the meaning and scope of the equivalent elements of the claims be included in this application. Any reference sign in the claims should not be regarded as limiting the claim concerned. In addition, it is obvious that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in the present invention may also be implemented by one unit or device through software or hardware. Terms such as first and second are used to denote names and do not denote any particular order.
最后应说明的是，以上实施例仅用以说明本申请的技术方案而非限制，尽管参照较佳实施例对本申请进行了详细说明，本领域的普通技术人员应当理解，可以对本申请的技术方案进行修改或等同替换，而不脱离本申请技术方案的精神和范围。Finally, it should be noted that the above embodiments are only used to illustrate, and not to limit, the technical solutions of this application. Although this application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of this application may be modified or equivalently replaced without departing from the spirit and scope of the technical solutions of this application.

Claims (20)

  1. 一种基于人工智能的用户等级预测方法,其中,所述方法包括:An artificial intelligence-based user level prediction method, wherein the method includes:
    获取多个用户的多个数据指标及获取所述多个用户的用户等级标签,并计算每个数据指标的饱和度及计算每个数据指标的相关度;Acquiring multiple data indicators of multiple users and acquiring user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;
    根据所述饱和度和所述相关度从每个用户的所述多个数据指标中提取出多个入模数据指标;Extracting multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;
    输入所述多个用户的多个入模数据指标及所述用户等级标签至预设神经网络框架中的第一层输入层，其中，所述预设神经网络框架还包括多层全连接层及最后一层输出层；Inputting the multiple in-model data indicators of the multiple users and the user level labels into the first input layer of a preset neural network framework, where the preset neural network framework further includes multiple fully connected layers and a final output layer;
    获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点，使用所述当前层全连接层的多个所述目标节点对所述当前层的下一层全连接层进行全连接训练，直至完成对最后一层全连接层的全连接训练；Acquiring all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to a preset grouping rule and determining the target node in each group, and using the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
    获取所述预设神经网络框架的最后一层输出层输出的预测等级标签,根据所述预测等级标签迭代训练所述预设神经网络框架得到用户等级预测模型;Acquiring a prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;
    使用所述用户等级预测模型对目标用户进行等级预测。The user level prediction model is used to predict the level of the target user.
  2. 如权利要求1所述的基于人工智能的用户等级预测方法，其中，所述获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点包括：The artificial intelligence-based user level prediction method according to claim 1, wherein the acquiring all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to the preset grouping rule, and determining the target node in each group includes:
    获取当前层全连接层中的所有节点的第一节点值;Get the first node value of all nodes in the current fully connected layer;
    采用逐步试探分组方法对所述当前层全连接层中的所有节点进行分组,并将每次试探的过程中每个分组中的最大第一节点值对应的节点确定为目标节点;Grouping all nodes in the current fully connected layer by using a stepwise trial grouping method, and determining the node corresponding to the largest first node value in each grouping during each trial process as the target node;
    使用所述目标节点对所述当前层的下一层全连接层进行全连接训练;Use the target node to perform fully connected training on the next fully connected layer of the current layer;
    获取经过全连接训练的下一层全连接层中的所有节点的第二节点值;Obtain the second node value of all nodes in the next fully-connected layer after fully-connected training;
    针对每次试探分组,计算所述第一节点值与所述第二节点值之间的损失熵;For each trial group, calculate the loss entropy between the first node value and the second node value;
    确定最小的损失熵对应的试探分组方法为目标试探分组方法,并使用所述目标试探分组方法对所述当前层全连接层中的所有节点进行分组。The heuristic grouping method corresponding to the smallest loss entropy is determined to be the target heuristic grouping method, and the target heuristic grouping method is used to group all nodes in the current fully connected layer.
  3. 如权利要求1所述的基于人工智能的用户等级预测方法,其中,所述使用所述用户等级预测模型对目标用户进行等级预测包括:The method for predicting a user level based on artificial intelligence according to claim 1, wherein said using the user level prediction model to predict the level of a target user comprises:
    计算所述用户等级预测模型的召回率并根据所述召回率确定目标概率阈值;Calculating a recall rate of the user level prediction model, and determining a target probability threshold according to the recall rate;
    获取所述目标用户的预测指标并输入所述预测指标至所述用户等级预测模型中进行预测得到预测概率;Acquiring the predictive index of the target user and inputting the predictive index into the user level prediction model for prediction to obtain a prediction probability;
    比较所述预测概率与所述目标概率阈值;Comparing the predicted probability with the target probability threshold;
    当所述预测概率大于所述目标概率阈值时,确定所述目标用户为第一等级;When the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;
    当所述预测概率小于或者等于所述目标概率阈值时,确定所述目标用户为第二等级。When the predicted probability is less than or equal to the target probability threshold, it is determined that the target user is at the second level.
  4. 如权利要求1所述的基于人工智能的用户等级预测方法,其中,所述计算所述用户等级预测模型的召回率并根据所述召回率确定目标概率阈值包括:The artificial intelligence-based user level prediction method according to claim 1, wherein the calculating the recall rate of the user level prediction model and determining the target probability threshold according to the recall rate comprises:
    采用差分法定义多个候选概率阈值;Use the difference method to define multiple candidate probability thresholds;
    针对每个候选概率阈值,根据所述用户等级预测模型输出的预测等级标签及对应的用户等级标签计算召回率;For each candidate probability threshold, the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;
    将最大召回率对应的候选概率阈值确定为目标概率阈值。The candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
  5. 如权利要求1至4中任意一项所述的基于人工智能的用户等级预测方法,其中,所述计算每个数据指标的饱和度包括:The artificial intelligence-based user level prediction method according to any one of claims 1 to 4, wherein said calculating the saturation of each data indicator comprises:
    遍历每个数据指标的多个特征值;Traverse multiple characteristic values of each data indicator;
    计算所述多个特征值中与预设特征值匹配的特征值的第一数量,根据所述第一数量计算所述数据指标的缺失率;Calculating a first number of feature values matching a preset feature value among the plurality of feature values, and calculating a missing rate of the data indicator according to the first number;
    计算所述多个特征值中具有相同特征值的第二数量,根据所述第二数量计算所述数据指标的重复率;Calculating a second number of the multiple characteristic values that have the same characteristic value, and calculating a repetition rate of the data indicator according to the second number;
    根据所述缺失率和所述重复率计算所述数据指标的饱和度。The saturation of the data index is calculated according to the missing rate and the repetition rate.
  6. 如权利要求5所述的基于人工智能的用户等级预测方法,其中,所述计算每个数据指标的相关度包括:The method for predicting user level based on artificial intelligence according to claim 5, wherein said calculating the relevance of each data indicator comprises:
    根据每个数据指标的多个特征值生成特征值向量;Generate eigenvalue vectors based on multiple eigenvalues of each data indicator;
    根据所述多个历史用户的用户等级标签生成等级标签向量;Generating a level label vector according to the user level labels of the multiple historical users;
    计算所述特征值向量及所述等级标签向量之间的皮尔逊系数;Calculating the Pearson coefficient between the eigenvalue vector and the rank label vector;
    确定所述皮尔逊系数为所述数据指标的相关度。The Pearson coefficient is determined as the correlation degree of the data index.
  7. 如权利要求6所述的基于人工智能的用户等级预测方法，其中，所述根据所述饱和度和所述相关度从每个用户的所述多个数据指标中提取出多个入模数据指标包括：The artificial intelligence-based user level prediction method according to claim 6, wherein the extracting multiple in-model data indicators from the multiple data indicators of each user according to the saturation and the correlation includes:
    从所述多个数据指标中获取大于预设饱和度阈值的饱和度对应的多个第一数据指标;Acquiring, from the multiple data indicators, multiple first data indicators corresponding to saturations greater than a preset saturation threshold;
    从所述多个第一数据指标中获取小于预设相关度阈值的相关度对应的多个第二数据指标;Acquiring, from the plurality of first data indicators, a plurality of second data indicators corresponding to a correlation degree less than a preset correlation degree threshold;
    根据所述多个第二数据指标衍生出多个高阶数据指标;Derive multiple high-level data indicators according to the multiple second data indicators;
    将所述多个第二数据指标及所述多个高阶数据指标作为所述入模数据指标。The multiple second data indicators and the multiple high-level data indicators are used as the model entry data indicators.
  8. 一种基于人工智能的用户等级预测装置,其中,所述装置包括:An artificial intelligence-based user level prediction device, wherein the device includes:
    计算模块,用于获取多个用户的多个数据指标及获取所述多个用户的用户等级标签,并计算每个数据指标的饱和度及计算每个数据指标的相关度;A calculation module for obtaining multiple data indicators of multiple users and obtaining user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;
    提取模块,用于根据所述饱和度和所述相关度从每个用户的所述多个数据指标中提取出多个入模数据指标;An extracting module, configured to extract multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;
    输入模块，用于输入所述多个用户的多个入模数据指标及所述用户等级标签至预设神经网络框架中的第一层输入层，其中，所述预设神经网络框架还包括多层全连接层及最后一层输出层；An input module, configured to input the multiple in-model data indicators of the multiple users and the user level labels into the first input layer of a preset neural network framework, where the preset neural network framework further includes multiple fully connected layers and a final output layer;
    分组模块，用于获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点，使用所述当前层全连接层的多个所述目标节点对所述当前层的下一层全连接层进行全连接训练，直至完成对最后一层全连接层的全连接训练；A grouping module, configured to acquire all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to a preset grouping rule and determine the target node in each group, and use the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
    训练模块,用于获取所述预设神经网络框架的最后一层输出层输出的预测等级标签,根据所述预测等级标签迭代训练所述预设神经网络框架得到用户等级预测模型;A training module, configured to obtain a prediction level label output by the last output layer of the preset neural network framework, and iteratively train the preset neural network framework according to the prediction level label to obtain a user level prediction model;
    预测模块,用于使用所述用户等级预测模型对目标用户进行等级预测。The prediction module is configured to use the user level prediction model to predict the level of the target user.
  9. 一种终端,其中,所述终端包括处理器,所述处理器用于执行存储器中存储的计算机可读指令时实现以下步骤:A terminal, wherein the terminal includes a processor, and the processor is configured to implement the following steps when executing computer-readable instructions stored in a memory:
    获取多个用户的多个数据指标及获取所述多个用户的用户等级标签,并计算每个数据指标的饱和度及计算每个数据指标的相关度;Acquiring multiple data indicators of multiple users and acquiring user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;
    根据所述饱和度和所述相关度从每个用户的所述多个数据指标中提取出多个入模数据指标;Extracting multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;
    输入所述多个用户的多个入模数据指标及所述用户等级标签至预设神经网络框架中的第一层输入层，其中，所述预设神经网络框架还包括多层全连接层及最后一层输出层；Inputting the multiple in-model data indicators of the multiple users and the user level labels into the first input layer of a preset neural network framework, where the preset neural network framework further includes multiple fully connected layers and a final output layer;
    获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点，使用所述当前层全连接层的多个所述目标节点对所述当前层的下一层全连接层进行全连接训练，直至完成对最后一层全连接层的全连接训练；Acquiring all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to a preset grouping rule and determining the target node in each group, and using the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
    获取所述预设神经网络框架的最后一层输出层输出的预测等级标签,根据所述预测等级标签迭代训练所述预设神经网络框架得到用户等级预测模型;Acquiring a prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;
    使用所述用户等级预测模型对目标用户进行等级预测。The user level prediction model is used to predict the level of the target user.
  10. 如权利要求9所述的终端，其中，所述处理器执行所述计算机可读指令以实现获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点时，具体包括：The terminal according to claim 9, wherein when the processor executes the computer-readable instructions to acquire all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to the preset grouping rule, and determine the target node in each group, the steps specifically include:
    获取当前层全连接层中的所有节点的第一节点值;Get the first node value of all nodes in the current fully connected layer;
    采用逐步试探分组方法对所述当前层全连接层中的所有节点进行分组,并将每次试探的过程中每个分组中的最大第一节点值对应的节点确定为目标节点;Grouping all nodes in the current fully connected layer by using a stepwise trial grouping method, and determining the node corresponding to the largest first node value in each grouping during each trial process as the target node;
    使用所述目标节点对所述当前层的下一层全连接层进行全连接训练;Use the target node to perform fully connected training on the next fully connected layer of the current layer;
    获取经过全连接训练的下一层全连接层中的所有节点的第二节点值;Obtain the second node value of all nodes in the next fully-connected layer after fully-connected training;
    针对每次试探分组,计算所述第一节点值与所述第二节点值之间的损失熵;For each trial group, calculate the loss entropy between the first node value and the second node value;
    确定最小的损失熵对应的试探分组方法为目标试探分组方法,并使用所述目标试探分组方法对所述当前层全连接层中的所有节点进行分组。The heuristic grouping method corresponding to the smallest loss entropy is determined to be the target heuristic grouping method, and the target heuristic grouping method is used to group all nodes in the current fully connected layer.
  11. 如权利要求9所述的终端，其中，所述处理器执行所述计算机可读指令以实现使用所述用户等级预测模型对目标用户进行等级预测时，具体包括：The terminal according to claim 9, wherein when the processor executes the computer-readable instructions to implement the level prediction of the target user using the user level prediction model, it specifically includes:
    计算所述用户等级预测模型的召回率并根据所述召回率确定目标概率阈值;Calculating a recall rate of the user level prediction model, and determining a target probability threshold according to the recall rate;
    获取所述目标用户的预测指标并输入所述预测指标至所述用户等级预测模型中进行预测得到预测概率;Acquiring the predictive index of the target user and inputting the predictive index into the user level prediction model for prediction to obtain a prediction probability;
    比较所述预测概率与所述目标概率阈值;Comparing the predicted probability with the target probability threshold;
    当所述预测概率大于所述目标概率阈值时,确定所述目标用户为第一等级;When the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;
    当所述预测概率小于或者等于所述目标概率阈值时,确定所述目标用户为第二等级。When the predicted probability is less than or equal to the target probability threshold, it is determined that the target user is at the second level.
  12. 如权利要求9所述的终端,其中,所述处理器执行所述计算机可读指令以实现计算所述用户等级预测模型的召回率并根据所述召回率确定目标概率阈值时,具体包括:The terminal according to claim 9, wherein when the processor executes the computer-readable instructions to calculate the recall rate of the user level prediction model and determines the target probability threshold according to the recall rate, the specific steps include:
    采用差分法定义多个候选概率阈值;Use the difference method to define multiple candidate probability thresholds;
    针对每个候选概率阈值,根据所述用户等级预测模型输出的预测等级标签及对应的用户等级标签计算召回率;For each candidate probability threshold, the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;
    将最大召回率对应的候选概率阈值确定为目标概率阈值。The candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
  13. 如权利要求9至12中任意一项所述的终端,其中,所述处理器执行所述计算机可读指令以实现计算每个数据指标的饱和度时,具体包括:The terminal according to any one of claims 9 to 12, wherein when the processor executes the computer-readable instruction to calculate the saturation of each data indicator, it specifically includes:
    遍历每个数据指标的多个特征值;Traverse multiple characteristic values of each data indicator;
    计算所述多个特征值中与预设特征值匹配的特征值的第一数量,根据所述第一数量计算所述数据指标的缺失率;Calculating a first number of feature values matching a preset feature value among the plurality of feature values, and calculating a missing rate of the data indicator according to the first number;
    计算所述多个特征值中具有相同特征值的第二数量,根据所述第二数量计算所述数据指标的重复率;Calculating a second number of the multiple characteristic values that have the same characteristic value, and calculating a repetition rate of the data indicator according to the second number;
    根据所述缺失率和所述重复率计算所述数据指标的饱和度。The saturation of the data index is calculated according to the missing rate and the repetition rate.
  14. 如权利要求13所述的终端,其中,所述处理器执行所述计算机可读指令以实现计算每个数据指标的相关度时,具体包括:The terminal according to claim 13, wherein, when the processor executes the computer-readable instruction to calculate the correlation degree of each data indicator, it specifically includes:
    根据每个数据指标的多个特征值生成特征值向量;Generate eigenvalue vectors based on multiple eigenvalues of each data indicator;
    根据所述多个历史用户的用户等级标签生成等级标签向量;Generating a level label vector according to the user level labels of the multiple historical users;
    计算所述特征值向量及所述等级标签向量之间的皮尔逊系数;Calculating the Pearson coefficient between the eigenvalue vector and the rank label vector;
    确定所述皮尔逊系数为所述数据指标的相关度。The Pearson coefficient is determined as the correlation degree of the data index.
  15. 如权利要求14所述的终端，其中，所述处理器执行所述计算机可读指令以实现根据所述饱和度和所述相关度从每个用户的所述多个数据指标中提取出多个入模数据指标时，具体包括：The terminal according to claim 14, wherein when the processor executes the computer-readable instructions to extract multiple in-model data indicators from the multiple data indicators of each user according to the saturation and the correlation, the steps specifically include:
    从所述多个数据指标中获取大于预设饱和度阈值的饱和度对应的多个第一数据指标;Acquiring, from the multiple data indicators, multiple first data indicators corresponding to saturations greater than a preset saturation threshold;
    从所述多个第一数据指标中获取小于预设相关度阈值的相关度对应的多个第二数据指标;Acquiring, from the plurality of first data indicators, a plurality of second data indicators corresponding to a correlation degree less than a preset correlation degree threshold;
    根据所述多个第二数据指标衍生出多个高阶数据指标;Derive multiple high-level data indicators according to the multiple second data indicators;
    将所述多个第二数据指标及所述多个高阶数据指标作为所述入模数据指标。The multiple second data indicators and the multiple high-level data indicators are used as the model entry data indicators.
  16. 一种计算机可读存储介质,所述计算机可读存储介质上存储有计算机可读指令,其中,所述计算机可读指令被处理器执行时实现以下步骤:A computer-readable storage medium having computer-readable instructions stored thereon, wherein the computer-readable instructions implement the following steps when executed by a processor:
    获取多个用户的多个数据指标及获取所述多个用户的用户等级标签,并计算每个数据指标的饱和度及计算每个数据指标的相关度;Acquiring multiple data indicators of multiple users and acquiring user level labels of the multiple users, and calculating the saturation of each data indicator and calculating the correlation degree of each data indicator;
    根据所述饱和度和所述相关度从每个用户的所述多个数据指标中提取出多个入模数据指标;Extracting multiple data indicators into the model from the multiple data indicators of each user according to the saturation and the correlation;
    输入所述多个用户的多个入模数据指标及所述用户等级标签至预设神经网络框架中的第一层输入层，其中，所述预设神经网络框架还包括多层全连接层及最后一层输出层；Inputting the multiple in-model data indicators of the multiple users and the user level labels into the first input layer of a preset neural network framework, where the preset neural network framework further includes multiple fully connected layers and a final output layer;
    获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点，使用所述当前层全连接层的多个所述目标节点对所述当前层的下一层全连接层进行全连接训练，直至完成对最后一层全连接层的全连接训练；Acquiring all nodes of the current fully connected layer, grouping all nodes of the current fully connected layer according to a preset grouping rule and determining the target node in each group, and using the multiple target nodes of the current fully connected layer to perform fully connected training on the next fully connected layer of the current layer, until the fully connected training of the last fully connected layer is completed;
    获取所述预设神经网络框架的最后一层输出层输出的预测等级标签,根据所述预测等级标签迭代训练所述预设神经网络框架得到用户等级预测模型;Acquiring a prediction level label output by the last output layer of the preset neural network framework, and iteratively training the preset neural network framework according to the prediction level label to obtain a user level prediction model;
    使用所述用户等级预测模型对目标用户进行等级预测。The user level prediction model is used to predict the level of the target user.
  17. 如权利要求16所述的计算机可读存储介质，其中，所述计算机可读指令被所述处理器执行以实现获取当前层全连接层的所有节点，按照预设分组规则对所述当前层全连接层的所有节点进行分组并确定每个分组中的目标节点时，具体包括：The computer-readable storage medium according to claim 16, wherein when the computer-readable instructions are executed by the processor to acquire all nodes of the current fully connected layer, group all nodes of the current fully connected layer according to the preset grouping rule, and determine the target node in each group, the steps specifically include:
    获取当前层全连接层中的所有节点的第一节点值;Get the first node value of all nodes in the current fully connected layer;
    采用逐步试探分组方法对所述当前层全连接层中的所有节点进行分组,并将每次试探的过程中每个分组中的最大第一节点值对应的节点确定为目标节点;Grouping all nodes in the current fully connected layer by using a stepwise trial grouping method, and determining the node corresponding to the largest first node value in each grouping during each trial process as the target node;
    使用所述目标节点对所述当前层的下一层全连接层进行全连接训练;Use the target node to perform fully connected training on the next fully connected layer of the current layer;
    获取经过全连接训练的下一层全连接层中的所有节点的第二节点值;Obtain the second node value of all nodes in the next fully-connected layer after fully-connected training;
    针对每次试探分组,计算所述第一节点值与所述第二节点值之间的损失熵;For each trial group, calculate the loss entropy between the first node value and the second node value;
    确定最小的损失熵对应的试探分组方法为目标试探分组方法,并使用所述目标试探分组方法对所述当前层全连接层中的所有节点进行分组。The heuristic grouping method corresponding to the smallest loss entropy is determined to be the target heuristic grouping method, and the target heuristic grouping method is used to group all nodes in the current fully connected layer.
  18. 如权利要求16所述的计算机可读存储介质，其中，所述计算机可读指令被所述处理器执行以实现使用所述用户等级预测模型对目标用户进行等级预测时，具体包括：The computer-readable storage medium according to claim 16, wherein when the computer-readable instructions are executed by the processor to implement the level prediction of the target user using the user level prediction model, it specifically includes:
    计算所述用户等级预测模型的召回率并根据所述召回率确定目标概率阈值;Calculating a recall rate of the user level prediction model, and determining a target probability threshold according to the recall rate;
    获取所述目标用户的预测指标并输入所述预测指标至所述用户等级预测模型中进行预测得到预测概率;Acquiring the predictive index of the target user and inputting the predictive index into the user level prediction model for prediction to obtain a prediction probability;
    比较所述预测概率与所述目标概率阈值;Comparing the predicted probability with the target probability threshold;
    当所述预测概率大于所述目标概率阈值时,确定所述目标用户为第一等级;When the predicted probability is greater than the target probability threshold, determining that the target user is at the first level;
    当所述预测概率小于或者等于所述目标概率阈值时,确定所述目标用户为第二等级。When the predicted probability is less than or equal to the target probability threshold, it is determined that the target user is at the second level.
  19. 如权利要求16所述的计算机可读存储介质，其中，所述计算机可读指令被所述处理器执行以实现计算所述用户等级预测模型的召回率并根据所述召回率确定目标概率阈值时，具体包括：The computer-readable storage medium according to claim 16, wherein when the computer-readable instructions are executed by the processor to calculate the recall rate of the user level prediction model and determine the target probability threshold according to the recall rate, the steps specifically include:
    采用差分法定义多个候选概率阈值;Use the difference method to define multiple candidate probability thresholds;
    针对每个候选概率阈值,根据所述用户等级预测模型输出的预测等级标签及对应的用户等级标签计算召回率;For each candidate probability threshold, the recall rate is calculated according to the predicted level label output by the user level prediction model and the corresponding user level label;
    将最大召回率对应的候选概率阈值确定为目标概率阈值。The candidate probability threshold corresponding to the maximum recall rate is determined as the target probability threshold.
  20. 如权利要求16至19中任意一项所述的计算机可读存储介质,其中,所述计算机可读指令被所述处理器执行以实现计算每个数据指标的饱和度时,具体包括:The computer-readable storage medium according to any one of claims 16 to 19, wherein, when the computer-readable instruction is executed by the processor to calculate the saturation of each data indicator, it specifically includes:
    遍历每个数据指标的多个特征值;Traverse multiple characteristic values of each data indicator;
    计算所述多个特征值中与预设特征值匹配的特征值的第一数量,根据所述第一数量计算所述数据指标的缺失率;Calculating a first number of feature values matching a preset feature value among the plurality of feature values, and calculating a missing rate of the data indicator according to the first number;
    计算所述多个特征值中具有相同特征值的第二数量,根据所述第二数量计算所述数据指标的重复率;Calculating a second number of the multiple characteristic values that have the same characteristic value, and calculating a repetition rate of the data indicator according to the second number;
    根据所述缺失率和所述重复率计算所述数据指标的饱和度。The saturation of the data index is calculated according to the missing rate and the repetition rate.
PCT/CN2020/131955 2020-10-13 2020-11-26 Artificial intelligence-based user rating prediction method and apparatus, terminal, and medium WO2021139432A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011092932.5A CN112102011A (en) 2020-10-13 2020-10-13 User grade prediction method, device, terminal and medium based on artificial intelligence
CN202011092932.5 2020-10-13

Publications (2)

Publication Number Publication Date
WO2021139432A1 true WO2021139432A1 (en) 2021-07-15
WO2021139432A9 WO2021139432A9 (en) 2021-09-23

Family

ID=73783614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/131955 WO2021139432A1 (en) 2020-10-13 2020-11-26 Artificial intelligence-based user rating prediction method and apparatus, terminal, and medium

Country Status (2)

Country Link
CN (1) CN112102011A (en)
WO (1) WO2021139432A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112818028B (en) * 2021-01-12 2021-09-17 平安科技(深圳)有限公司 Data index screening method and device, computer equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110874758A (en) * 2018-09-03 2020-03-10 北京京东金融科技控股有限公司 Potential customer prediction method, device, system, electronic equipment and storage medium
CN109711860A (en) * 2018-11-12 2019-05-03 平安科技(深圳)有限公司 Prediction technique and device, storage medium, the computer equipment of user behavior
CN110674716A (en) * 2019-09-16 2020-01-10 腾讯云计算(北京)有限责任公司 Image recognition method, device and storage medium
CN110852785A (en) * 2019-10-12 2020-02-28 中国平安人寿保险股份有限公司 User grading method, device and computer readable storage medium

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113723524A (en) * 2021-08-31 2021-11-30 平安国际智慧城市科技股份有限公司 Data processing method based on prediction model, related equipment and medium
CN113723524B (en) * 2021-08-31 2024-05-17 深圳平安智慧医健科技有限公司 Data processing method based on prediction model, related equipment and medium
CN117112574A (en) * 2023-10-20 2023-11-24 美云智数科技有限公司 Tree service data construction method, device, computer equipment and storage medium
CN117112574B (en) * 2023-10-20 2024-02-23 美云智数科技有限公司 Tree service data construction method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2021139432A9 (en) 2021-09-23
CN112102011A (en) 2020-12-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20912292

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20912292

Country of ref document: EP

Kind code of ref document: A1