WO2023236588A1 - User classification method and device based on smooth optimization of customer group deviations - Google Patents
- Publication number
- WO2023236588A1 (PCT/CN2023/077882, CN2023077882W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- customer group
- sample
- classification
- customer
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Definitions
- The present disclosure relates to the field of device data processing, and specifically to a user classification method, device, electronic device, and computer-readable medium based on smooth optimization of customer group deviations.
- An Internet service platform can manage a large amount of user device data in a classified manner. For example, the user device data is first grouped to obtain multiple different customer groups. A model is then trained on the user device data of each customer group, yielding multiple customer group models. When new user device data is obtained, it can be scored separately by the customer group models, and the customer group to which the user belongs is determined from the prediction results.
- The related technology has at least the following technical problem: the accuracy of the customer group predicted in this way is low, so some users are classified into customer groups that are inconsistent with their real situation, which degrades the user experience.
- The present disclosure provides a user classification method, device, electronic device, and computer-readable medium based on smooth optimization of customer group deviations, which determine the customer group to which a user belongs through both the customer group models and a user classification model. The classification results obtained in this way are more accurate and more consistent with the real situation of the user to be identified, thereby improving the user experience.
- A user classification method based on smooth optimization of customer group deviations includes: obtaining customer group models, each trained on the sample device data of the sample users belonging to one of several different customer groups; inputting each piece of sample device data into each of the customer group models to obtain the corresponding predicted safety scores; determining the real safety score of each customer group based on the sample device data of the sample users in that customer group; obtaining the classification weight of each sample user under different customer groups according to the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups; training a user classification model according to the classification weights of each sample user under different customer groups and the customer group to which each sample user belongs; and classifying a user to be identified according to the user classification model and each of the customer group models, to determine the customer group to which the user to be identified belongs.
- Obtaining the classification weight of each sample user under different customer groups includes: for each sample user, determining the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group, and using it as the classification weight of the sample user under that customer group; and obtaining in this way the classification weights of each sample user under different customer groups.
- Determining the relative deviation value between the predicted safety score of the sample user under different customer group models and the real safety score of the corresponding customer group includes: for each customer group, calculating the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group, as the relative deviation value; and obtaining in this way the relative deviation values of the sample user under different customer groups.
- Determining the real safety score of each customer group based on the sample device data of the sample users in each customer group includes: for each customer group, determining, from the sample device data of the sample users in the customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users, and using the ratio of the number of unsafe sample users to the total number of sample users as the real safety score of the customer group; and obtaining in this way the real safety score of each customer group.
- Training a user classification model based on the classification weights of each sample user under different customer groups and the customer group to which each sample user belongs includes: constructing an initial user classification model; constructing, for each sample user, a classification vector whose dimension is the number of customer groups, the elements of the classification vector corresponding one-to-one to the customer groups; setting, according to the customer group to which the sample user belongs, the value of the corresponding element in the classification vector to a preset minimum value and the values of the other elements to a preset maximum value; and training the initial user classification model with the classification weights of the sample user under different customer groups as input and the classification vector of the sample user as output, to obtain the trained user classification model.
- Classifying the user to be identified according to the user classification model and each of the customer group models, and determining the customer group to which the user to be identified belongs, includes: obtaining the device data of the user to be identified and inputting it into each of the customer group models to obtain the corresponding predicted safety scores; obtaining the classification weights of the user to be identified under different customer groups according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups; and inputting the classification weights of the user to be identified under different customer groups into the user classification model for classification, and determining the customer group to which the user to be identified belongs according to the classification result.
- A user classification device based on smooth optimization of customer group deviations includes: an acquisition module, used to obtain customer group models trained on the sample device data of sample users belonging to different customer groups; a predicted safety score acquisition module, used to input each piece of sample device data into each of the customer group models to obtain the corresponding predicted safety scores; a real safety score acquisition module, used to determine the real safety score of each customer group based on the sample device data of the sample users in each customer group; a classification weight acquisition module, used to obtain the classification weight of each sample user under different customer groups according to the predicted safety scores of each sample user under the different customer group models and the real safety scores of the corresponding customer groups; a training module, used to train a user classification model according to the classification weights of each sample user under different customer groups and the customer group to which each sample user belongs; and a classification module, used to classify users to be identified according to the user classification model and each of the customer group models, and to determine the customer group to which the user to be identified belongs.
- The classification weight acquisition module is configured to: for each sample user, determine the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group, and use it as the classification weight of the sample user under that customer group; and obtain in this way the classification weights of each sample user under different customer groups.
- Determining the relative deviation value between the predicted safety score of the sample user under different customer group models and the real safety score of the corresponding customer group includes: for each customer group, calculating the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group, as the relative deviation value; and obtaining in this way the relative deviation values of the sample user under different customer groups.
- The real safety score acquisition module is configured to: for each customer group, determine, from the sample device data of the sample users in the customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users, and use the ratio of the number of unsafe sample users to the total number of sample users as the real safety score of the customer group; and obtain in this way the real safety score of each customer group.
- The training module is configured to: construct an initial user classification model; construct, for each sample user, a classification vector whose dimension is the number of customer groups, the elements of the classification vector corresponding one-to-one to the customer groups; set, according to the customer group to which the sample user belongs, the value of the corresponding element in the classification vector to a preset minimum value and the values of the other elements to a preset maximum value; and train the initial user classification model with the classification weights of the sample user under different customer groups as input and the classification vector of the sample user as output, to obtain the trained user classification model.
- The classification module is configured to: obtain the device data of the user to be identified and input it into each of the customer group models to obtain the corresponding predicted safety scores; obtain the classification weights of the user to be identified under different customer groups according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups; and input the classification weights of the user to be identified under different customer groups into the user classification model for classification, and determine the customer group to which the user to be identified belongs based on the classification result.
- An electronic device includes: one or more processors; and a storage device for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described above.
- A computer-readable medium stores a computer program which, when executed by a processor, implements the method described above.
- Each piece of sample device data is input into each customer group model to obtain the corresponding predicted safety scores, and the real safety score of each customer group is determined. Based on the predicted safety scores of the sample users under the different customer group models and the real safety scores of the corresponding customer groups, the classification weight of each sample user under different customer groups is obtained. Based on these classification weights and the customer group to which each sample user belongs, the user classification model is trained. Finally, the user to be identified is classified and the customer group to which that user belongs is determined. Deviation smoothing optimization is thus performed on the customer group models to determine the customer group to which the user belongs and, in turn, the services provided to the user, which improves both the user experience and the security of the services the platform provides to the user.
- Figure 1 is a system block diagram of a user classification method and device based on smooth optimization of customer group deviations according to an exemplary embodiment.
- Figure 2 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to an exemplary embodiment.
- Figure 3 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- Figure 4 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- Figure 5 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- Figure 6 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- Figure 7 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- Figure 8 is a block diagram of a user classification device based on smooth optimization of customer group deviations according to an exemplary embodiment.
- Figure 9 is a block diagram of an electronic device according to an exemplary embodiment.
- Figure 10 is a block diagram of a computer-readable medium according to an exemplary embodiment.
- Figure 1 is a system block diagram of a user classification method and device based on smooth optimization of customer group deviations according to an exemplary embodiment.
- The system architecture 100 may include one or more of user devices 101, 102, 103, a network 104, and a server 105.
- Network 104 is the medium used to provide communication links between user devices 101, 102, 103 and server 105.
- Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
- The number of user devices, networks, and servers in Figure 1 is only illustrative. There can be any number of user devices, networks, and servers, depending on implementation needs.
- The server 105 may be a server cluster composed of multiple servers.
- The user devices 101, 102, and 103 may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, portable computers, and desktop computers.
- The user classification method based on smooth optimization of customer group deviation provided by the embodiment of the present invention is generally executed by the server 105.
- Accordingly, the device for user classification based on smooth optimization of customer group deviation is generally provided in the server 105.
- Some terminals may have functions similar to those of the server and can perform this method. Therefore, the user classification method based on smooth optimization of customer group deviation provided by the embodiment of the present invention is not limited to execution on the server side.
- Figure 2 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to an exemplary embodiment.
- The user classification method based on smooth optimization of customer group deviations includes steps S210 to S260.
- In step S210, customer group models obtained by training on the sample device data of sample users belonging to different customer groups are obtained.
- The different customer groups can be obtained by manually labeling the sample users according to the sample device data. Based on the sample device data of the sample users of each customer group, a customer group model corresponding to that customer group can be trained.
- Customer group labels can be set according to the actual business, for example blacklist and whitelist customer groups, or low-risk, medium-risk, and high-risk customer groups.
- The sample device data may be data disclosed on the Internet service platform by sample users who use the sample devices, for example public information such as user name, user age, user occupation, user income, user place of origin, and the last time the user logged in to the system with the user device, but is not limited to this.
- To protect user privacy, this solution can also be carried out using only user information that cannot identify the user, such as age, education, and household registration; alternatively, information that can identify the user can be deleted or anonymized.
- This processing may be performed by encrypting the data.
- In step S220, each piece of sample device data is input into each of the customer group models to obtain the corresponding predicted safety scores.
- Each customer group model outputs a predicted safety score for each piece of sample device data.
- Each customer group model can be learned through an existing neural network learning algorithm or decision tree learning algorithm.
- The predicted safety score may be predicted by a customer group model based on the user attribute characteristics in the sample device data.
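Step S220 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `GroupModel` type, the `predict_scores` helper, and the toy lambda models are all hypothetical, and predicted safety scores are assumed to be numbers in [0, 1].

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical type: a customer group model maps a feature vector to a
# predicted safety score in [0, 1]. The text does not fix the model family.
GroupModel = Callable[[Sequence[float]], float]


def predict_scores(
    samples: List[Sequence[float]],
    group_models: Dict[str, GroupModel],
) -> List[Dict[str, float]]:
    """Score every sample with every customer group model (step S220)."""
    return [
        {group: model(features) for group, model in group_models.items()}
        for features in samples
    ]


# Toy stand-ins for trained models (assumptions, not the patent's models).
group_models = {
    "low_risk": lambda x: min(1.0, 0.1 * sum(x)),
    "high_risk": lambda x: min(1.0, 0.3 * sum(x)),
}
samples = [[1.0, 1.0], [0.5, 0.5]]
scores = predict_scores(samples, group_models)
```

Each entry of `scores` holds one predicted safety score per customer group for one sample user, which is the shape the later steps consume.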
- In step S230, the real safety score of each customer group is determined based on the sample device data of the sample users in each customer group.
- The real safety score of each customer group may be determined based on the actual situation of each sample user in that customer group.
- The real safety score of each customer group is calculated from the real labels in the sample device data of that customer group.
- The real labels in the sample device data of each customer group can be safe user and unsafe user.
- In step S240, the classification weight of each sample user under different customer groups is obtained based on the predicted safety scores of each sample user under the different customer group models and the real safety scores of the corresponding customer groups.
- The classification weight of each sample user under different customer groups can be calculated by a preset method, which may include but is not limited to Euclidean distance and cosine distance.
- The classification weight of a sample user under a customer group represents the difference between the predicted safety score of the sample user under that customer group model and the real safety score of that customer group.
- The greater the difference, the further the predicted safety score obtained by the customer group model is from the actual situation of the sample user.
- The smaller the difference, the closer the predicted safety score obtained by the customer group model is to the actual situation of the sample user.
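The weight computation in step S240 can be sketched as below, assuming the Euclidean-distance option and scalar safety scores, in which case the distance reduces to an absolute difference. The function name `classification_weights` and the example numbers are hypothetical.

```python
from typing import Dict


def classification_weights(
    predicted: Dict[str, float],
    true_scores: Dict[str, float],
) -> Dict[str, float]:
    """Deviation of one user's predicted score from each group's real score.

    For scalar scores the Euclidean distance is the absolute difference.
    """
    return {
        group: abs(predicted[group] - true_scores[group])
        for group in true_scores
    }


# Illustrative values only: one user's predicted scores under two customer
# group models, and the real safety scores of those two customer groups.
predicted = {"A": 0.30, "B": 0.70}
true_scores = {"A": 0.25, "B": 1.00}
weights = classification_weights(predicted, true_scores)
```

Under this option a smaller weight means the user's predicted score sits closer to the group's real score, so in this toy case the user deviates least from group A.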
- In step S250, a user classification model is trained based on the classification weights of each sample user under different customer groups and the customer group to which each sample user belongs.
- The classification weights of each sample user under different customer groups and the customer group to which each sample user belongs are used as training data, and the user classification model is learned through an existing neural network learning algorithm or decision tree learning algorithm.
- In step S260, the user to be identified is classified according to the user classification model and each of the customer group models, and the customer group to which the user to be identified belongs is determined.
- The device data of the user to be identified is input into each customer group model to obtain the predicted safety scores of the user to be identified under the different customer group models. Then, based on these predicted safety scores and the real safety scores of the corresponding customer groups, the classification weights of the user to be identified under different customer groups are calculated. The classification weights are input into the user classification model to obtain the probability that the user to be identified belongs to each customer group. Finally, based on these probabilities, the customer group to which the user to be identified belongs is determined.
- In this way, the prediction results of the customer group models are optimized using the difference between the real safety score of each customer group and the predicted safety score of the user to be identified under the corresponding customer group model. This difference supplements or corrects the prediction results obtained under the different customer group models, thereby improving the accuracy of the classification results obtained through the user classification model.
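The inference flow of step S260 can be sketched end to end as follows. Everything named here is hypothetical: `classify_user`, the `UserClassifier` type, and in particular `toy_classifier`, which is a hand-written stand-in for the trained user classification model and is not the learning algorithm described in the text. Absolute difference is assumed as the deviation measure.

```python
from typing import Callable, Dict, List, Sequence

GroupModel = Callable[[Sequence[float]], float]
# Hypothetical: the user classification model maps a weight vector to one
# probability per customer group, in the same group order.
UserClassifier = Callable[[List[float]], List[float]]


def classify_user(
    features: Sequence[float],
    group_models: Dict[str, GroupModel],
    true_scores: Dict[str, float],
    user_classifier: UserClassifier,
) -> str:
    groups = list(group_models)
    # Predicted safety score of the user under each customer group model.
    predicted = [group_models[g](features) for g in groups]
    # Classification weights: deviation from each group's real safety score.
    weights = [abs(p - true_scores[g]) for g, p in zip(groups, predicted)]
    # Probability of belonging to each group; pick the most probable one.
    probabilities = user_classifier(weights)
    best = max(range(len(groups)), key=lambda i: probabilities[i])
    return groups[best]


def toy_classifier(weights: List[float]) -> List[float]:
    """Stand-in classifier: smaller deviation gives larger probability."""
    closeness = [1.0 - w for w in weights]
    total = sum(closeness)
    if total == 0:
        # All deviations are maximal: fall back to a uniform distribution.
        return [1.0 / len(weights)] * len(weights)
    return [c / total for c in closeness]


models = {"A": lambda x: 0.2, "B": lambda x: 0.9}
result = classify_user([0.0], models, {"A": 0.25, "B": 0.5}, toy_classifier)
```

In a real system `toy_classifier` would be replaced by the model trained in step S250; the sketch only shows how the customer group models, the real safety scores, and the classifier are chained.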
- In summary, customer group models trained on the sample device data of sample users belonging to different customer groups are obtained, and each piece of sample device data is input into each customer group model. Based on the predicted safety scores of the sample users under the different customer group models and the real safety scores of the corresponding customer groups, the classification weight of each sample user under different customer groups is obtained. Based on these classification weights and the customer group to which each sample user belongs, the user classification model is trained. Finally, based on the user classification model and each customer group model, the user to be identified is classified and the customer group to which that user belongs is determined.
- This solution uses the built customer group models to perform deviation smoothing optimization to determine the customer group to which the user belongs and, in turn, the services to be provided to the user, which improves both the user experience and the security of the services the platform provides to users.
- Figure 3 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- Step S240 may specifically include steps S310 to S320.
- In step S310, for each sample user, the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group is determined and used as the classification weight of the sample user under that customer group.
- The relative deviation value between the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group can be calculated by a preset method, which may include but is not limited to Euclidean distance and cosine distance.
- The relative deviation value represents the difference between the predicted safety score of a sample user under a customer group model and the real safety score of that customer group.
- The larger the difference, the further the predicted safety score obtained through the customer group model is from the real situation of the sample user.
- The smaller the difference, the closer the predicted safety score obtained through the customer group model is to the real situation of the sample user.
- In step S320, the classification weights of each sample user under different customer groups are obtained.
- When the classification weight is calculated using the Euclidean distance, the smaller the classification weight of a sample user under a customer group, the closer the predicted safety score of the sample user under that customer group model is to the real safety score of the customer group, that is, the more similar the user device data of the sample user is to the user device data of the sample users in that customer group. The greater the classification weight, the more obvious the difference between the predicted safety score and the real safety score of the customer group, that is, the more dissimilar the user device data of the sample user is to that of the sample users in the customer group.
- When the classification weight is calculated using the cosine distance, the greater the classification weight of a sample user under a customer group, the closer the predicted safety score of the sample user under that customer group model is to the real safety score of the customer group, that is, the more similar the user device data of the sample user is to the user device data of the sample users in that customer group. The smaller the classification weight, the more obvious the difference between the predicted safety score and the real safety score of the customer group, that is, the more dissimilar the user device data of the sample user is to that of the sample users in the customer group.
- Figure 4 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- Step S310 may specifically include steps S410 to S420.
- In step S410, for each customer group, the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group is calculated as the relative deviation value.
- The Euclidean distance formula is used to calculate the distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group.
- The larger the Euclidean distance, the more obvious the difference between the predicted safety score of the sample user under a customer group model and the real safety score of the customer group, that is, the more dissimilar the user device data of the sample user is to the user device data of the sample users in that customer group.
- The preset maximum value of the Euclidean distance is 1, and the preset minimum value is 0.
- A value of 0 means that the user device data of the sample user is most similar to the user device data of the sample users in the customer group, that is, this customer group is the one the sample user most likely belongs to.
- A value of 1 means that the user device data of the sample user is least similar to the user device data of the sample users in the customer group, that is, the sample user is least likely to belong to this customer group.
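As a worked form of the distance in step S410, assume (this notation is introduced here, not taken from the source) that p_{u,k} is the predicted safety score of sample user u under the model of customer group k and t_k is the real safety score of customer group k. Since t_k is a ratio it lies in [0, 1]; if the predicted score is also assumed to lie in [0, 1], the one-dimensional Euclidean distance stays within the stated bounds of 0 and 1:

```latex
d_{u,k} = \sqrt{(p_{u,k} - t_k)^2} = \lvert p_{u,k} - t_k \rvert,
\qquad p_{u,k},\, t_k \in [0, 1] \;\Rightarrow\; 0 \le d_{u,k} \le 1 .
```

For instance, with p_{u,k} = 0.30 and t_k = 0.25 the relative deviation value is 0.05.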
- In step S420, the relative deviation values of the sample user under different customer groups are obtained.
- The smaller the relative deviation value of the sample user under a customer group, the closer the predicted safety score of the sample user under that customer group model is to the real safety score of the customer group, that is, the more similar the user device data of the sample user is to the user device data of the sample users in that customer group.
- The larger the relative deviation value, the more obvious the difference between the predicted safety score of the sample user under a customer group model and the real safety score of the customer group, that is, the more dissimilar the user device data of the sample user is to the user device data of the sample users in that customer group.
- The cosine distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group can also be calculated as the relative deviation value.
- The preset maximum value of the cosine distance is 1, and the preset minimum value is 0. A value of 1 means that the user device data of the sample user is most similar to the user device data of the sample users in the customer group, that is, this customer group is the one the sample user most likely belongs to.
- A value of 0 means that the user device data of the sample user is least similar to the user device data of the sample users in the customer group, that is, the sample user is least likely to belong to this customer group.
- Figure 5 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- As shown in Figure 5, the above step S230 may specifically include steps S510 to S520.
- In step S510, for each customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users are determined from the sample device data of the sample users in the customer group, and the ratio of the number of unsafe sample users to the total number of sample users is taken as the real safety score of the customer group.
- In this step, the sample device data of the sample users in each customer group contains the real label annotated for each sample user, such as safe user or unsafe user.
- Based on the real labels in the sample device data, the number of sample users who are unsafe users in each customer group can be counted, and the real safety score of each customer group is then calculated from the total number of sample users and the number of unsafe sample users in that customer group.
- In step S520, the real safety scores of the customer groups are obtained respectively.
- In this step, the real safety scores of the customer groups can be used to optimize the above customer group models and thereby obtain the user classification model.
- Figure 6 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- As shown in Figure 6, the above step S250 may specifically include steps S610 to S640.
- In step S610, an initial user classification model is constructed.
- In step S620, a classification vector whose dimension is the number of customer groups is constructed for each sample user.
- In this step, the elements of the classification vector correspond one-to-one to the customer groups.
- For example, the number of customer groups is 3, and the customer groups are A, B, and C.
- The classification vector is then a three-dimensional vector (a, b, c), where a represents the probability that the sample user belongs to customer group A, b represents the probability that the sample user belongs to customer group B, and c represents the probability that the sample user belongs to customer group C.
- In step S630, according to the customer group to which the sample user belongs, the value of the corresponding element in the classification vector is set to a preset minimum value, and the values of the other elements are set to a preset maximum value.
- In this step, for the Euclidean distance, the preset minimum value indicates that the sample user belongs to the customer group corresponding to the element.
- The preset maximum value indicates that the sample user does not belong to the customer group corresponding to the element. Assume that the preset minimum value is 0 and the preset maximum value is 1. Then 0 means that the sample user belongs to the customer group corresponding to the element, and 1 means that the sample user does not belong to it.
- In this step, for the cosine distance, the preset minimum value indicates that the sample user does not belong to the customer group corresponding to the element.
- The preset maximum value indicates that the sample user belongs to the customer group corresponding to the element. Assume that the preset minimum value is 0 and the preset maximum value is 1. Then 0 means that the sample user does not belong to the customer group corresponding to the element, and 1 means that the sample user belongs to it.
- In this step, the preset minimum value can be set as low as 0 and the preset maximum value as high as 1. Of course, both can also be set manually according to the actual situation.
- In step S640, the initial user classification model is trained with the classification weights of the sample user under the different customer groups as the input of the user classification model and the classification vector corresponding to the sample user as the output, to obtain the trained user classification model.
- The user classification model obtained through the above training further optimizes the existing customer group models, so that the classification results obtained by classifying users to be identified with the user classification model are more accurate.
- Figure 7 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
- As shown in Figure 7, the above step S260 may specifically include steps S710 to S730.
- In step S710, the device data of the user to be identified is obtained, and the device data is input into each of the customer group models to obtain the corresponding predicted safety scores.
- In this step, each customer group model makes a prediction for the user device of the user to be identified, yielding the predicted safety scores of the user to be identified under the different customer group models.
- In step S720, the classification weights of the user to be identified under the different customer groups are obtained according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups.
- In this step, the classification weights of the user to be identified under the different customer groups can be calculated with the Euclidean distance or cosine distance formula from the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups.
- In step S730, the classification weights of the user to be identified under the different customer groups are input into the user classification model for classification, and the customer group to which the user to be identified belongs is determined according to the classification result.
- In this step, the classification weights of the user to be identified under the different customer groups are used as the input of the user classification model, and the user classification model outputs a classification vector. The customer group to which the user to be identified belongs is determined from the values of the elements in the classification vector. The classification result obtained in this way is more accurate and more consistent with the real situation of the user to be identified, which improves the user experience.
- FIG. 8 is a block diagram of a user classification device based on smooth optimization of customer group deviation according to another exemplary embodiment.
- As shown in FIG. 8, the user classification device 800 based on smooth optimization of customer group deviation includes: an acquisition module 810, a predicted safety score acquisition module 820, a real safety score acquisition module 830, a classification weight acquisition module 840, a training module 850, and a classification module 860.
- Specifically, the acquisition module 810 is used to obtain customer group models trained separately on the sample device data of sample users belonging to different customer groups.
- The predicted safety score acquisition module 820 is used to input each piece of sample device data into each of the customer group models to obtain the corresponding predicted safety scores.
- The real safety score acquisition module 830 is used to determine the real safety score of each customer group according to the sample device data of the sample users in each customer group.
- The classification weight acquisition module 840 is used to obtain the classification weight of each sample user under the different customer groups according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group.
- The training module 850 is used to train a user classification model according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong.
- The classification module 860 is used to classify a user to be identified according to the user classification model and the customer group models, and to determine the customer group to which the user to be identified belongs.
- The user classification device 800 based on smooth optimization of customer group deviation can input each piece of sample device data into each customer group model to obtain the corresponding predicted safety scores; determine the real safety score of each customer group according to the sample device data of the sample users in each customer group; obtain the classification weight of each sample user under the different customer groups according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group; train a user classification model according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong; and finally classify the user to be identified according to the user classification model and the customer group models, determining the customer group to which the user to be identified belongs. The classification result obtained in this way is more accurate and more consistent with the real situation of the user to be identified, which improves the user experience.
- According to an embodiment of the present invention, the user classification device 800 based on smooth optimization of customer group deviation can be used to implement the user classification method based on smooth optimization of customer group deviation described in the embodiment of FIG. 2.
- Optionally, the classification weight acquisition module 840 is configured to: for each sample user, determine the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group, as the classification weight of the sample user under that customer group; and thereby obtain the classification weights of each sample user under the different customer groups.
- Optionally, determining the relative deviation values between the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups includes: for each customer group, calculating the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group, as the relative deviation value; and thereby obtaining the relative deviation values of the sample user under the different customer groups.
- Optionally, the real safety score acquisition module 830 is configured to: for each customer group, determine, according to the sample device data of the sample users in the customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users, and take the ratio of the number of unsafe sample users to the total number of sample users as the real safety score of the customer group; and thereby obtain the real safety score of each customer group.
- Optionally, the training module 850 is configured to: construct an initial user classification model; construct, for each sample user, a classification vector whose dimension is the number of customer groups, the elements of the classification vector corresponding one-to-one to the customer groups; according to the customer group to which the sample user belongs, set the value of the corresponding element in the classification vector to the preset minimum value and the values of the other elements to the preset maximum value; and train the initial user classification model with the classification weights of the sample user under the different customer groups as the input of the user classification model and the classification vector corresponding to the sample user as the output, to obtain the trained user classification model.
- Optionally, the classification module 860 is configured to: obtain the device data of the user to be identified, and input the device data into each of the customer group models to obtain the corresponding predicted safety scores; obtain the classification weights of the user to be identified under the different customer groups according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups; and input the classification weights of the user to be identified under the different customer groups into the user classification model for classification, and determine the customer group to which the user to be identified belongs according to the classification result.
- FIG. 9 is a block diagram of an electronic device according to an exemplary embodiment.
- An electronic device 900 according to this embodiment of the present disclosure is described below with reference to FIG. 9.
- The electronic device 900 shown in FIG. 9 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
- electronic device 900 is embodied in the form of a general computing device.
- the components of the electronic device 900 may include, but are not limited to: at least one processing unit 910, at least one storage unit 920, a bus 930 connecting different system components (including the storage unit 920 and the processing unit 910), a display unit 940, and the like.
- the storage unit stores program code, and the program code can be executed by the processing unit 910, so that the processing unit 910 performs the steps in this specification according to various exemplary embodiments of the present disclosure.
- the processing unit 910 may perform the steps shown in FIGS. 2 to 7 .
- the storage unit 920 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 9201 and/or a cache storage unit 9202, and may further include a read-only storage unit (ROM) 9203.
- The storage unit 920 may also include a program/utility 9204 having a set of (at least one) program modules 9205, including but not limited to: an operating system, one or more applications, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
- Bus 930 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
- Electronic device 900 may also communicate with one or more external devices (e.g., keyboard, pointing device, Bluetooth device, etc.), with devices that enable a user to interact with the electronic device 900, and/or with any device (such as a router, modem, etc.) that enables the electronic device 900 to communicate with one or more other computing devices. This communication may occur through an input/output (I/O) interface 950.
- the electronic device 900 may also communicate with one or more networks (eg, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 960.
- Network adapter 960 may communicate with other modules of electronic device 900 via bus 930.
- It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
- the technical solution according to the embodiment of the present disclosure can be embodied in the form of a software product.
- the software product can be stored in a non-volatile storage medium (which can be a CD-ROM, U disk, mobile hard disk etc.) or on a network, including several instructions to cause a computing device (which may be a personal computer, a server, a network device, etc.) to execute the above method according to an embodiment of the present disclosure.
- the software product may take the form of any combination of one or more readable media.
- the readable medium may be a readable signal medium or a readable storage medium.
- The readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
- the computer-readable storage medium may include a data signal propagated in baseband or as part of a carrier wave carrying readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
- a readable storage medium may also be any readable medium other than a readable storage medium that can transmit, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device.
- Program code contained on a readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
- Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
- The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
- In situations involving a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
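The present disclosure relates to a user classification method based on smooth optimization of customer group deviation. In the method, each piece of sample device data is input into each customer group model to obtain the corresponding predicted safety scores; the real safety score of each customer group is determined; the classification weight of each sample user under the different customer groups is obtained from the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups; a user classification model is trained from the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong; and finally the user to be identified is classified according to the user classification model and the customer group models, determining the customer group to which that user belongs. The solution determines the user's customer group by applying deviation smoothing optimization to the already constructed customer group models, which makes it easier to decide which services to provide to the user, improving both the user experience and the security of the services the platform provides to users.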
Description
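The present disclosure relates to the field of device data processing, and in particular to a user classification method and device based on smooth optimization of customer group deviation, an electronic device, and a computer-readable medium.
With the rapid development of the Internet, Internet service platforms hold large amounts of user device data. Such a platform can manage this data by classification. For example, the user device data is first divided into several different customer groups. Models are then trained on the user device data of each customer group, yielding several customer group models. When new user device data is obtained, each customer group model makes a prediction for it, and the customer group to which the user belongs is determined from the prediction results.
However, in implementing the inventive concept of the present invention, the inventor found that the related art has at least the following technical problem: the customer group predicted in this way has low accuracy, so some users are assigned to customer groups that do not match their real situation, which degrades the user experience.
The above information disclosed in this Background section is only intended to enhance understanding of the background of the present disclosure, and it may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.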
Summary of the Invention
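In view of this, the present disclosure provides a user classification method and device based on smooth optimization of customer group deviation, an electronic device, and a computer-readable medium, which determine the customer group to which a user belongs by means of customer group models and a user classification model. The classification result obtained in this way is more accurate and more consistent with the real situation of the user to be identified, which improves the user experience.
Other features and advantages of the present disclosure will become apparent from the following detailed description, or will be learned in part through practice of the present disclosure.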
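According to one aspect of the present disclosure, a user classification method based on smooth optimization of customer group deviation is provided. The classification method includes: obtaining customer group models trained separately on the sample device data of sample users belonging to different customer groups; inputting each piece of sample device data into each of the customer group models to obtain the corresponding predicted safety scores; determining the real safety score of each customer group according to the sample device data of the sample users in each customer group; obtaining the classification weight of each sample user under the different customer groups according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group; training a user classification model according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong; and classifying a user to be identified according to the user classification model and the customer group models, to determine the customer group to which the user to be identified belongs.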
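Optionally, obtaining the classification weight of each sample user under the different customer groups according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group includes: for each sample user, determining the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group, as the classification weight of the sample user under that customer group; and thereby obtaining the classification weights of each sample user under the different customer groups.
Optionally, determining the relative deviation values between the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups includes: for each customer group, calculating the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group, as the relative deviation value; and thereby obtaining the relative deviation values of the sample user under the different customer groups.
Optionally, determining the real safety score of each customer group according to the sample device data of the sample users in each customer group includes: for each customer group, determining, according to the sample device data of the sample users in the customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users, and taking the ratio of the number of unsafe sample users to the total number of sample users as the real safety score of the customer group; and thereby obtaining the real safety score of each customer group.
Optionally, training the user classification model according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong includes: constructing an initial user classification model; constructing, for each sample user, a classification vector whose dimension is the number of customer groups, the elements of the classification vector corresponding one-to-one to the customer groups; according to the customer group to which the sample user belongs, setting the value of the corresponding element in the classification vector to a preset minimum value and the values of the other elements to a preset maximum value; and training the initial user classification model with the classification weights of the sample user under the different customer groups as the input of the user classification model and the classification vector corresponding to the sample user as the output, to obtain the trained user classification model.
Optionally, classifying the user to be identified according to the user classification model and the customer group models, to determine the customer group to which the user to be identified belongs, includes: obtaining the device data of the user to be identified, and inputting the device data into each of the customer group models to obtain the corresponding predicted safety scores; obtaining the classification weights of the user to be identified under the different customer groups according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups; and inputting the classification weights of the user to be identified under the different customer groups into the user classification model for classification, and determining the customer group to which the user to be identified belongs according to the classification result.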
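According to one aspect of the present disclosure, a user classification device based on smooth optimization of customer group deviation is provided. The classification device includes: an acquisition module, used to obtain customer group models trained separately on the sample device data of sample users belonging to different customer groups; a predicted safety score acquisition module, used to input each piece of sample device data into each of the customer group models to obtain the corresponding predicted safety scores; a real safety score acquisition module, used to determine the real safety score of each customer group according to the sample device data of the sample users in each customer group; a classification weight acquisition module, used to obtain the classification weight of each sample user under the different customer groups according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group; a training module, used to train a user classification model according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong; and a classification module, used to classify a user to be identified according to the user classification model and the customer group models, and to determine the customer group to which the user to be identified belongs.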
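Optionally, the classification weight acquisition module is configured to: for each sample user, determine the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group, as the classification weight of the sample user under that customer group; and thereby obtain the classification weights of each sample user under the different customer groups.
Optionally, determining the relative deviation values between the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups includes: for each customer group, calculating the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group, as the relative deviation value; and thereby obtaining the relative deviation values of the sample user under the different customer groups.
Optionally, the real safety score acquisition module is configured to: for each customer group, determine, according to the sample device data of the sample users in the customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users, and take the ratio of the number of unsafe sample users to the total number of sample users as the real safety score of the customer group; and thereby obtain the real safety score of each customer group.
Optionally, the training module is configured to: construct an initial user classification model; construct, for each sample user, a classification vector whose dimension is the number of customer groups, the elements of the classification vector corresponding one-to-one to the customer groups; according to the customer group to which the sample user belongs, set the value of the corresponding element in the classification vector to a preset minimum value and the values of the other elements to a preset maximum value; and train the initial user classification model with the classification weights of the sample user under the different customer groups as the input of the user classification model and the classification vector corresponding to the sample user as the output, to obtain the trained user classification model.
Optionally, the classification module is configured to: obtain the device data of the user to be identified, and input the device data into each of the customer group models to obtain the corresponding predicted safety scores; obtain the classification weights of the user to be identified under the different customer groups according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups; and input the classification weights of the user to be identified under the different customer groups into the user classification model for classification, and determine the customer group to which the user to be identified belongs according to the classification result.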
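According to one aspect of the present disclosure, an electronic device is provided. The electronic device includes: one or more processors; and a storage device for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the method described above.
According to one aspect of the present disclosure, a computer-readable medium is provided, on which a computer program is stored. When the program is executed by a processor, the method described above is implemented.
According to the user classification method and device based on smooth optimization of customer group deviation, the electronic device, and the computer-readable medium of the present disclosure, each piece of sample device data is input into each customer group model to obtain the corresponding predicted safety scores; the real safety score of each customer group is determined; the classification weight of each sample user under the different customer groups is obtained from the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups; a user classification model is trained from the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong; and finally the user to be identified is classified according to the user classification model and the customer group models, determining the customer group to which that user belongs. The solution determines the user's customer group by applying deviation smoothing optimization to the already constructed customer group models, which makes it easier to decide which services to provide to the user, improving both the user experience and the security of the services the platform provides to users.
It should be understood that the above general description and the following detailed description are only exemplary and do not limit the present disclosure.
The above and other objects, features, and advantages of the present disclosure will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings. The drawings described below are only some embodiments of the present disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.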
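Figure 1 is a system block diagram of a user classification method and device based on smooth optimization of customer group deviation according to an exemplary embodiment.
Figure 2 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to an exemplary embodiment.
Figure 3 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
Figure 4 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
Figure 5 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
Figure 6 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
Figure 7 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
Figure 8 is a block diagram of a user classification device based on smooth optimization of customer group deviation according to an exemplary embodiment.
Figure 9 is a block diagram of an electronic device according to an exemplary embodiment.
Figure 10 is a block diagram of a computer-readable medium according to an exemplary embodiment.
Example embodiments will now be described more fully with reference to the accompanying drawings.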
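Figure 1 is a system block diagram of a user classification method and device based on smooth optimization of customer group deviation according to an exemplary embodiment.
As shown in Figure 1, the system architecture 100 may include one or more of user devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the user devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
It should be understood that the numbers of user devices, networks, and servers in Figure 1 are only illustrative. There may be any number of user devices, networks, and servers, depending on implementation needs. For example, the server 105 may be a server cluster composed of multiple servers.
Users can use the user devices 101, 102, 103 to interact with the server 105 through the network 104, for example to receive or send messages. The user devices 101, 102, 103 may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, portable computers, and desktop computers.
In some embodiments, the user classification method based on smooth optimization of customer group deviation provided by the embodiments of the present invention is generally executed by the server 105, and accordingly the user classification device based on smooth optimization of customer group deviation is generally provided in the server 105. In other embodiments, some terminals may have functions similar to those of the server and thus execute the method. Therefore, the user classification method based on smooth optimization of customer group deviation provided by the embodiments of the present invention is not limited to execution on the server side.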
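Figure 2 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to an exemplary embodiment.
As shown in Figure 2, the user classification method based on smooth optimization of customer group deviation includes steps S210 to S260.
In step S210, customer group models trained separately on the sample device data of sample users belonging to different customer groups are obtained.
In this step, the different customer groups may be obtained by manually labeling the sample users into groups according to the sample device data. A customer group model corresponding to each customer group can be trained on the sample device data of the sample users in that customer group.
In this step, the customer group labels of the different customer groups can be set according to the actual business, for example a blacklist customer group and a whitelist customer group, or a low-risk customer group, a medium-risk customer group, and a high-risk customer group.
In this step, the sample device data may be data that the sample user using the sample device has made public on the Internet service platform, for example public information such as user name, user age, user occupation, user income, user place of origin, and the time the user last logged in to the system with the user device, but it is not limited to these. The data processing of this solution may also be performed using only user information from which the user's identity cannot be recognized, such as age, education level, and registered residence, in order to protect user privacy. User privacy can be protected by deleting or anonymizing the parts of the user information from which the user's identity could be recognized; the anonymization may be performed by processing the data with encryption.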
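As a hedged illustration of the privacy handling described above, the following Python sketch deletes some identifying fields and replaces others with a salted SHA-256 digest. The field names, the `salt` parameter, and the use of a one-way hash in place of the unspecified encryption are all assumptions made for illustration; the source does not prescribe a particular scheme.

```python
import hashlib
from typing import Dict, Iterable


def anonymize_record(
    record: Dict[str, str],
    drop_fields: Iterable[str],
    hash_fields: Iterable[str],
    salt: bytes,
) -> Dict[str, str]:
    """Return a copy of `record` with identifying fields removed or pseudonymized.

    All field names are hypothetical; none of them come from the source.
    """
    drop = set(drop_fields)
    to_hash = set(hash_fields)
    cleaned: Dict[str, str] = {}
    for key, value in record.items():
        if key in drop:
            continue  # delete the identifying field entirely
        if key in to_hash:
            # Replace the value with a salted one-way digest (an assumption,
            # standing in for the "encryption" mentioned in the text).
            cleaned[key] = hashlib.sha256(salt + value.encode("utf-8")).hexdigest()
        else:
            cleaned[key] = value
    return cleaned
```

The input record is left unmodified, so the raw data and the anonymized copy can be stored under different access controls.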
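In step S220, each piece of sample device data is input into each of the customer group models to obtain the corresponding predicted safety scores.
In this step, each piece of sample device data is used as input to each customer group model, which outputs the predicted safety score of each sample device. Each customer group model can be learned with an existing neural network learning algorithm or decision tree learning algorithm.
In this step, the predicted safety score may be predicted by the customer group model from the user attribute features in the sample device data.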
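In step S230, the real safety score of each customer group is determined according to the sample device data of the sample users in each customer group.
In this step, the real safety score of each customer group may be determined from the real situation of the sample users in that customer group. For example, the real safety score of each customer group is calculated from the real labels in the sample device data of that customer group.
In this step, the real labels in the sample device data of each customer group may be safe user and unsafe user.
In step S240, the classification weight of each sample user under the different customer groups is obtained according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group.
In this step, the classification weight of each sample user under the different customer groups can be calculated in a preset manner from the predicted safety score of the sample user under the different customer group models and the real safety score of the corresponding customer group. The preset manner may include, but is not limited to, the Euclidean distance and the cosine distance.
In this step, the classification weight of a sample user under a customer group characterizes the difference between the predicted safety score of the sample user under that customer group's model and the real safety score of the sample users in that customer group. The larger the difference, the further the predicted safety score obtained through the customer group model is from the real situation of the sample user. Conversely, the smaller the difference, the closer the predicted safety score obtained through the customer group model is to the real situation of the sample user.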
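Steps S220 to S240 can be sketched roughly as follows. This is a minimal illustration, assuming that each customer group model is a plain callable returning a scalar score and that the weight is the absolute difference between two scalar scores (the Euclidean distance in one dimension); the names `GroupModel` and `classification_weights` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Assumption: a customer group model is any callable that maps a feature
# vector to a predicted safety score.
GroupModel = Callable[[Sequence[float]], float]


def classification_weights(
    samples: List[Sequence[float]],
    group_models: Dict[str, GroupModel],
    real_scores: Dict[str, float],
) -> List[Dict[str, float]]:
    """Compute, for every sample, one classification weight per customer group.

    The weight is the absolute difference between the predicted score and the
    group's real score (the Euclidean distance between two scalars).
    """
    weights = []
    for features in samples:
        row = {}
        for group, model in group_models.items():
            predicted = model(features)  # step S220: score under this group's model
            row[group] = abs(predicted - real_scores[group])  # step S240
        weights.append(row)
    return weights
```

Every sample is scored by every group model, not only by the model of its own group, so the result has one row per sample and one column per customer group.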
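In step S250, a user classification model is trained according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong.
In this step, the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong are taken as input, and the user classification model is learned with an existing neural network learning algorithm or decision tree learning algorithm.
In step S260, the user to be identified is classified according to the user classification model and the customer group models, to determine the customer group to which the user to be identified belongs.
In this step, the device data of the user to be identified is input into each customer group model, yielding the predicted safety scores of the user to be identified under the different customer group models. The classification weights of the user to be identified under the different customer groups are then calculated from these predicted safety scores and the real safety scores of the corresponding customer groups. The classification weights are input into the user classification model, which gives the probability that the user to be identified belongs to each customer group. Finally, the customer group to which the user to be identified belongs is determined from these probabilities.
In this step, the difference between the real safety score of each customer group and the predicted safety score of the user to be identified under the corresponding customer group model is used to optimize the prediction results obtained through the customer group models. This difference can supplement or correct those prediction results, thereby improving the accuracy of the classification result obtained through the user classification model.
With the user classification method based on smooth optimization of customer group deviation provided by the present disclosure, customer group models trained separately on the sample device data of sample users belonging to different customer groups are obtained; each piece of sample device data is input into each customer group model to obtain the corresponding predicted safety scores; the real safety score of each customer group is determined; the classification weight of each sample user under the different customer groups is obtained from the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups; a user classification model is trained from the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong; and finally the user to be identified is classified according to the user classification model and the customer group models, determining the customer group to which that user belongs. The solution determines the user's customer group by applying deviation smoothing optimization to the already constructed customer group models, which makes it easier to decide which services to provide to the user, improving both the user experience and the security of the services the platform provides to users.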
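Figure 3 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
As shown in Figure 3, the above step S240 may specifically include steps S310 to S320.
In step S310, for each sample user, the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group is determined, as the classification weight of the sample user under that customer group.
In this step, the relative deviation value between the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group can be calculated in a preset manner. The preset manner may include, but is not limited to, the Euclidean distance and the cosine distance.
In this step, the relative deviation value characterizes the difference between the predicted safety score of a sample user under a customer group model and the real safety score of the sample users in that customer group. The larger the difference, the further the predicted safety score obtained through the customer group model is from the real situation of the sample user. Conversely, the smaller the difference, the closer the predicted safety score obtained through the customer group model is to the real situation of the sample user.
In step S320, the classification weights of each sample user under the different customer groups are obtained respectively.
In this step, suppose the classification weight of each sample user under the different customer groups is calculated using the Euclidean distance. Then the smaller the classification weight of a sample user under a customer group, the closer the predicted safety score of the sample user under that customer group's model is to the real safety score of the customer group, that is, the more similar the sample user's device data is to the device data of the sample users in the customer group. Conversely, the larger the classification weight, the more the predicted safety score of the sample user under that customer group's model differs from the real safety score of the customer group, that is, the less similar the sample user's device data is to the device data of the sample users in the customer group.
In this step, suppose instead that the classification weight of each sample user under the different customer groups is calculated using the cosine distance. Then the larger the classification weight of a sample user under a customer group, the closer the predicted safety score of the sample user under that customer group's model is to the real safety score of the customer group, that is, the more similar the sample user's device data is to the device data of the sample users in the customer group. Conversely, the smaller the classification weight, the more the predicted safety score of the sample user under that customer group's model differs from the real safety score of the customer group, that is, the less similar the sample user's device data is to the device data of the sample users in the customer group.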
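Figure 4 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
As shown in Figure 4, the above step S310 may specifically include steps S410 to S420.
In step S410, for each customer group, the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group is calculated as the relative deviation value.
In this step, the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group is calculated with the Euclidean distance formula. The smaller the Euclidean distance, the closer the predicted safety score of the sample user under that customer group's model is to the real safety score of the customer group, that is, the more similar the sample user's device data is to the device data of the sample users in the customer group. Conversely, the larger the Euclidean distance, the more the predicted safety score differs from the real safety score of the customer group, that is, the less similar the sample user's device data is to the device data of the sample users in the customer group. For example, the preset maximum value of the Euclidean distance is 1 and the preset minimum value is 0. 0 means that the user device data of the sample user is most similar to the user device data of the sample users in the customer group, that is, the sample user is most likely to belong to the customer group. 1 means that the user device data of the sample user is least similar to the user device data of the sample users in the customer group, that is, the sample user is least likely to belong to the customer group.
In step S420, the relative deviation values of the sample user under the different customer groups are obtained respectively.
In this step, the smaller the relative deviation value of the sample user under a customer group, the closer the predicted safety score of the sample user under that customer group's model is to the real safety score of the customer group, that is, the more similar the sample user's device data is to the device data of the sample users in the customer group. Conversely, the larger the relative deviation value, the more the predicted safety score differs from the real safety score of the customer group, that is, the less similar the sample user's device data is to the device data of the sample users in the customer group.
In some embodiments of the present invention, for each customer group, the cosine distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group can also be calculated as the relative deviation value. The larger the cosine distance, the closer the predicted safety score of the sample user under that customer group's model is to the real safety score of the customer group, that is, the more similar the sample user's device data is to the device data of the sample users in the customer group. Conversely, the smaller the cosine distance, the more the predicted safety score differs from the real safety score of the customer group, that is, the less similar the sample user's device data is to the device data of the sample users in the customer group. For example, the preset maximum value of the cosine distance is 1 and the preset minimum value is 0. 1 means that the user device data of the sample user is most similar to the user device data of the sample users in the customer group, that is, the sample user is most likely to belong to the customer group. 0 means that the user device data of the sample user is least similar to the user device data of the sample users in the customer group, that is, the sample user is least likely to belong to the customer group.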
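The source does not spell out either formula, so the following Python sketch uses the standard definitions as an assumption. Because the text treats a larger "cosine distance" as meaning more similar, the second function implements cosine similarity. Both functions take sequences: for a single scalar score the Euclidean distance reduces to an absolute difference, while the cosine of two positive scalars is always 1, so the cosine variant is only informative if the scores are represented as vectors, which is itself an assumption.

```python
import math
from typing import Sequence


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Standard Euclidean distance; smaller means more similar."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cosine_similarity(p: Sequence[float], q: Sequence[float]) -> float:
    """Standard cosine similarity; larger means more similar.

    Returns 0.0 when either vector has zero length (a convention chosen
    here for illustration, not taken from the source).
    """
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)
```

With non-negative components the cosine value stays within the interval from 0 to 1, which matches the preset minimum and maximum values given in the example.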
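Figure 5 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
As shown in Figure 5, the above step S230 may specifically include steps S510 to S520.
In step S510, for each customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users are determined from the sample device data of the sample users in the customer group, and the ratio of the number of unsafe sample users to the total number of sample users is taken as the real safety score of the customer group.
In this step, the sample device data of the sample users in each customer group contains the real label annotated for each sample user, such as safe user or unsafe user. Based on the real labels in the sample device data, the number of sample users who are unsafe users in each customer group can be counted, and the real safety score of each customer group is then calculated from the total number of sample users and the number of unsafe sample users in that customer group.
In step S520, the real safety scores of the customer groups are obtained respectively.
In this step, the real safety scores of the customer groups can be used to optimize the above customer group models and thereby obtain the user classification model.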
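The ratio in step S510 can be sketched as follows. The input representation, a list of `(group, is_unsafe)` pairs, and the function name are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def real_safety_scores(samples: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return, per customer group, the share of sample users labeled unsafe.

    `samples` is a hypothetical list of (customer_group, is_unsafe) pairs.
    """
    totals: Counter = Counter()
    unsafe: Counter = Counter()
    for group, is_unsafe in samples:
        totals[group] += 1
        if is_unsafe:
            unsafe[group] += 1
    # Ratio of unsafe sample users to all sample users in the group.
    return {group: unsafe[group] / totals[group] for group in totals}
```

Note that, as defined in the text, this score is the unsafe-user ratio, so a higher value corresponds to a riskier customer group.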
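Figure 6 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
As shown in Figure 6, the above step S250 may specifically include steps S610 to S640.
In step S610, an initial user classification model is constructed.
In step S620, a classification vector whose dimension is the number of customer groups is constructed for each sample user.
In this step, the elements of the classification vector correspond one-to-one to the customer groups. For example, the number of customer groups is 3, and the customer groups are A, B, and C. The classification vector is then a three-dimensional vector (a, b, c), where a represents the probability that the sample user belongs to customer group A, b represents the probability that the sample user belongs to customer group B, and c represents the probability that the sample user belongs to customer group C.
In step S630, according to the customer group to which the sample user belongs, the value of the corresponding element in the classification vector is set to a preset minimum value, and the values of the other elements are set to a preset maximum value.
In this step, for the Euclidean distance, the preset minimum value indicates that the sample user belongs to the customer group corresponding to the element, and the preset maximum value indicates that the sample user does not belong to it. Assume that the preset minimum value is 0 and the preset maximum value is 1. Then 0 means that the sample user belongs to the customer group corresponding to the element, and 1 means that the sample user does not belong to it.
In this step, for the cosine distance, the preset minimum value indicates that the sample user does not belong to the customer group corresponding to the element, and the preset maximum value indicates that the sample user belongs to it. Assume that the preset minimum value is 0 and the preset maximum value is 1. Then 0 means that the sample user does not belong to the customer group corresponding to the element, and 1 means that the sample user belongs to it.
In this step, the preset minimum value can be set as low as 0 and the preset maximum value as high as 1. Of course, both can also be set manually according to the actual situation.
In step S640, the initial user classification model is trained with the classification weights of the sample user under the different customer groups as the input of the user classification model and the classification vector corresponding to the sample user as the output, to obtain the trained user classification model.
The user classification model obtained through the above training further optimizes the existing customer group models, so that the classification results obtained by classifying users to be identified with the user classification model are more accurate.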
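The target construction in steps S620 and S630 can be sketched as below. The function name, the `metric` switch, and the default values 0.0 and 1.0 are assumptions for illustration; the model training itself (step S640) is left out because the source only says that an existing neural network or decision tree algorithm is used.

```python
from typing import List, Sequence


def build_classification_vector(
    groups: Sequence[str],
    own_group: str,
    metric: str = "euclidean",
    preset_min: float = 0.0,
    preset_max: float = 1.0,
) -> List[float]:
    """Build the training target for one sample user.

    Euclidean convention: the user's own group gets the preset minimum.
    Cosine convention: the user's own group gets the preset maximum.
    """
    if own_group not in groups:
        raise ValueError(f"unknown customer group: {own_group!r}")
    if metric == "euclidean":
        own_value, other_value = preset_min, preset_max
    elif metric == "cosine":
        own_value, other_value = preset_max, preset_min
    else:
        raise ValueError(f"unsupported metric: {metric!r}")
    # One element per customer group, in the same order as `groups`.
    return [own_value if g == own_group else other_value for g in groups]
```

For a sample user in customer group B out of A, B, and C, the Euclidean convention yields (1, 0, 1) and the cosine convention yields (0, 1, 0), so the target vector follows the same "smaller is closer" or "larger is closer" reading as the input weights.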
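Figure 7 is a flow chart of a user classification method based on smooth optimization of customer group deviation according to another exemplary embodiment.
As shown in Figure 7, the above step S260 may specifically include steps S710 to S730.
In step S710, the device data of the user to be identified is obtained, and the device data is input into each of the customer group models to obtain the corresponding predicted safety scores.
In this step, each customer group model makes a prediction for the user device of the user to be identified, yielding the predicted safety scores of the user to be identified under the different customer group models.
In step S720, the classification weights of the user to be identified under the different customer groups are obtained according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups.
In this step, the classification weights of the user to be identified under the different customer groups can be calculated with the Euclidean distance or cosine distance formula from the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups.
In step S730, the classification weights of the user to be identified under the different customer groups are input into the user classification model for classification, and the customer group to which the user to be identified belongs is determined according to the classification result.
In this step, the classification weights of the user to be identified under the different customer groups are used as the input of the user classification model, and the user classification model outputs a classification vector. The customer group to which the user to be identified belongs is determined from the values of the elements in the classification vector. The classification result obtained in this way is more accurate and more consistent with the real situation of the user to be identified, which improves the user experience.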
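Steps S710 to S730 can be sketched end to end as follows. The source only says that the group is determined from the element values of the output vector; picking the smallest element under the Euclidean convention and the largest under the cosine convention is an assumption, as are all names and the representation of models as plain callables. For brevity the weight here is always the absolute score difference.

```python
from typing import Callable, Dict, List, Sequence

GroupModel = Callable[[Sequence[float]], float]       # features -> predicted score
Classifier = Callable[[List[float]], List[float]]     # weights -> classification vector


def decode_group(groups: Sequence[str], vector: Sequence[float], metric: str) -> str:
    """Pick the group whose element marks membership under the given convention."""
    if len(groups) != len(vector):
        raise ValueError("vector length must equal the number of customer groups")
    pick = min if metric == "euclidean" else max
    index = pick(range(len(vector)), key=lambda i: vector[i])
    return groups[index]


def classify_user(
    features: Sequence[float],
    group_models: Dict[str, GroupModel],
    real_scores: Dict[str, float],
    classifier: Classifier,
    metric: str = "euclidean",
) -> str:
    groups = list(group_models)
    # S710 and S720: score the user under every group model, then take the
    # deviation from that group's real score as the classification weight.
    weights = [abs(group_models[g](features) - real_scores[g]) for g in groups]
    # S730: the trained user classification model maps weights to a vector.
    vector = classifier(weights)
    return decode_group(groups, vector, metric)
```

Any trained model can be plugged in as `classifier` as long as it returns one value per customer group in the same order as `group_models`.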
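The following are device embodiments of the present disclosure, which can be used to execute the method embodiments of the present disclosure. For details not disclosed in the device embodiments, refer to the method embodiments of the present disclosure.
Figure 8 is a block diagram of a user classification device based on smooth optimization of customer group deviation according to another exemplary embodiment.
As shown in Figure 8, the user classification device 800 based on smooth optimization of customer group deviation includes: an acquisition module 810, a predicted safety score acquisition module 820, a real safety score acquisition module 830, a classification weight acquisition module 840, a training module 850, and a classification module 860.
Specifically, the acquisition module 810 is used to obtain customer group models trained separately on the sample device data of sample users belonging to different customer groups.
The predicted safety score acquisition module 820 is used to input each piece of sample device data into each of the customer group models to obtain the corresponding predicted safety scores.
The real safety score acquisition module 830 is used to determine the real safety score of each customer group according to the sample device data of the sample users in each customer group.
The classification weight acquisition module 840 is used to obtain the classification weight of each sample user under the different customer groups according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group.
The training module 850 is used to train a user classification model according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong.
The classification module 860 is used to classify a user to be identified according to the user classification model and the customer group models, and to determine the customer group to which the user to be identified belongs.
The user classification device 800 based on smooth optimization of customer group deviation can input each piece of sample device data into each customer group model to obtain the corresponding predicted safety scores; determine the real safety score of each customer group according to the sample device data of the sample users in each customer group; obtain the classification weight of each sample user under the different customer groups according to the predicted safety score of each sample user under the different customer group models and the real safety score of the corresponding customer group; train a user classification model according to the classification weights of the sample users under the different customer groups and the customer groups to which the sample users belong; and finally classify the user to be identified according to the user classification model and the customer group models, determining the customer group to which the user to be identified belongs. The classification result obtained in this way is more accurate and more consistent with the real situation of the user to be identified, which improves the user experience.
According to an embodiment of the present invention, the user classification device 800 based on smooth optimization of customer group deviation can be used to implement the user classification method based on smooth optimization of customer group deviation described in the embodiment of Figure 2.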
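Optionally, the classification weight acquisition module 840 is configured to: for each sample user, determine the relative deviation value between the predicted safety score of the sample user under each customer group model and the real safety score of the corresponding customer group, as the classification weight of the sample user under that customer group; and thereby obtain the classification weights of each sample user under the different customer groups.
Optionally, determining the relative deviation values between the predicted safety scores of the sample user under the different customer group models and the real safety scores of the corresponding customer groups includes: for each customer group, calculating the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to the customer group and the real safety score of the customer group, as the relative deviation value; and thereby obtaining the relative deviation values of the sample user under the different customer groups.
Optionally, the real safety score acquisition module 830 is configured to: for each customer group, determine, according to the sample device data of the sample users in the customer group, the total number of sample users in the customer group and the number of sample users who are unsafe users, and take the ratio of the number of unsafe sample users to the total number of sample users as the real safety score of the customer group; and thereby obtain the real safety score of each customer group.
Optionally, the training module 850 is configured to: construct an initial user classification model; construct, for each sample user, a classification vector whose dimension is the number of customer groups, the elements of the classification vector corresponding one-to-one to the customer groups; according to the customer group to which the sample user belongs, set the value of the corresponding element in the classification vector to a preset minimum value and the values of the other elements to a preset maximum value; and train the initial user classification model with the classification weights of the sample user under the different customer groups as the input of the user classification model and the classification vector corresponding to the sample user as the output, to obtain the trained user classification model.
Optionally, the classification module 860 is configured to: obtain the device data of the user to be identified, and input the device data into each of the customer group models to obtain the corresponding predicted safety scores; obtain the classification weights of the user to be identified under the different customer groups according to the predicted safety scores of the user to be identified under the different customer group models and the real safety scores of the corresponding customer groups; and input the classification weights of the user to be identified under the different customer groups into the user classification model for classification, and determine the customer group to which the user to be identified belongs according to the classification result.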
图9是根据一示例性实施例示出的一种电子设备的框图。
下面参照图9来描述根据本公开的这种实施方式的电子设备900。图9显示的电子设备9仅仅是一个示例,不应对本公开实施例的功能和使用范围带来任何限制。
如图9所示,电子设备900以通用计算设备的形式表现。电子设备900的组件可以包括但不限于:至少一个处理单元910、至少一个存储单元920、连接不同系统组件(包括存储单元920和处理单元910)的总线930、显示单元940等。
其中,所述存储单元存储有程序代码,所述程序代码可以被所述处理单元910执行,使得所述处理单元910执行本说明书中的根据本公开各种示例性实施方式的步骤。例如,所述处理单元910可以执行如图2~图7中所示的步骤。
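The storage unit 920 may include readable media in the form of volatile storage units, such as a random access memory (RAM) unit 9201 and/or a cache storage unit 9202, and may further include a read-only memory (ROM) unit 9203.
The storage unit 920 may also include a program/utility 9204 having a set of (at least one) program modules 9205. Such program modules 9205 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 930 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.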
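The electronic device 900 may also communicate with one or more external devices 900 (for example, a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 900, and/or with any device (for example, a router or a modem) that enables the electronic device 900 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 950. The electronic device 900 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 960. The network adapter 960 may communicate with the other modules of the electronic device 900 through the bus 930. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 900, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.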
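From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here may be implemented in software, or in software combined with the necessary hardware. Therefore, as shown in FIG. 10, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product. The software product may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, or the like) or on a network, and includes a number of instructions that cause a computing device (which may be a personal computer, a server, a network device, or the like) to perform the above method according to the embodiments of the present disclosure.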
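The software product may use any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.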
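A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. Program code contained on a readable medium may be transmitted over any appropriate medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.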
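Program code for carrying out the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).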
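Exemplary embodiments of the present disclosure have been specifically shown and described above. It should be understood that the present disclosure is not limited to the detailed structures, arrangements, or implementation methods described here; rather, the present disclosure is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.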
Claims (11)
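- A user classification method based on customer-group deviation smoothing optimization, characterized in that the classification method comprises: acquiring customer group models, each obtained by training on the sample device data of sample users belonging to a different customer group; inputting each item of sample device data into each of the customer group models to obtain corresponding predicted safety scores; determining the true safety score of each customer group according to the sample device data of the sample users in that customer group; obtaining the classification weight of each sample user under each customer group according to the predicted safety score of that sample user under each customer group model and the true safety score of the corresponding customer group; training a user classification model according to the classification weights of the sample users under the different customer groups and the customer group to which each sample user belongs; and classifying a user to be identified according to the user classification model and the customer group models, to determine the customer group to which the user to be identified belongs.
- The user classification method according to claim 1, characterized in that obtaining the classification weight of each sample user under each customer group according to the predicted safety score of that sample user under each customer group model and the true safety score of the corresponding customer group comprises: for each sample user, determining the relative deviation value between the predicted safety score of the sample user under each customer group model and the true safety score of the corresponding customer group, and using it as the classification weight of the sample user under that customer group; and thereby obtaining the classification weight of each sample user under each customer group.
- The user classification method according to claim 2, characterized in that determining the relative deviation value between the predicted safety score of the sample user under each customer group model and the true safety score of the corresponding customer group comprises: for each customer group, calculating the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to that customer group and the true safety score of that customer group, and using it as the relative deviation value; and thereby obtaining the relative deviation value of the sample user under each customer group.
- The user classification method according to claim 2, characterized in that determining the true safety score of each customer group according to the sample device data of the sample users in that customer group comprises: for each customer group, determining, according to the sample device data of the sample users in the customer group, the total number of sample users in the customer group and the number of those sample users who are unsafe users, and using the ratio of the number of unsafe sample users to the total number of sample users as the true safety score of the customer group; and thereby obtaining the true safety score of each customer group.
- The user classification method according to claim 1, characterized in that training a user classification model according to the classification weights of the sample users under the different customer groups and the customer group to which each sample user belongs comprises: constructing an initial user classification model; constructing, for each sample user, a classification vector whose dimension is the number of customer groups, the elements of the classification vector corresponding one-to-one to the customer groups; according to the customer group to which the sample user belongs, setting the value of the corresponding element of the classification vector to a preset minimum value and the values of the other elements to a preset maximum value; and training the initial user classification model using the classification weights of the sample user under the different customer groups as the input of the user classification model and the classification vector corresponding to the sample user as the output, to obtain the trained user classification model.
- The user classification method according to claim 1, characterized in that classifying a user to be identified according to the user classification model and the customer group models, to determine the customer group to which the user to be identified belongs, comprises: acquiring device data of the user to be identified, and inputting the device data into each of the customer group models to obtain corresponding predicted safety scores; obtaining the classification weights of the user to be identified under the different customer groups according to the predicted safety scores of the user to be identified under the different customer group models and the true safety scores of the corresponding customer groups; and inputting the classification weights of the user to be identified under the different customer groups into the user classification model for classification, and determining, according to the classification result, the customer group to which the user to be identified belongs.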
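- A user classification apparatus based on customer-group deviation smoothing optimization, characterized in that the classification apparatus comprises: an acquisition module, configured to acquire customer group models, each obtained by training on the sample device data of sample users belonging to a different customer group; a predicted safety score acquisition module, configured to input each item of sample device data into each of the customer group models to obtain corresponding predicted safety scores; a true safety score acquisition module, configured to determine the true safety score of each customer group according to the sample device data of the sample users in that customer group; a classification weight acquisition module, configured to obtain the classification weight of each sample user under each customer group according to the predicted safety score of that sample user under each customer group model and the true safety score of the corresponding customer group; a training module, configured to train a user classification model according to the classification weights of the sample users under the different customer groups and the customer group to which each sample user belongs; and a classification module, configured to classify a user to be identified according to the user classification model and the customer group models, to determine the customer group to which the user to be identified belongs.
- The user classification apparatus according to claim 7, characterized in that the classification weight acquisition module is configured to: for each sample user, determine the relative deviation value between the predicted safety score of the sample user under each customer group model and the true safety score of the corresponding customer group, and use it as the classification weight of the sample user under that customer group; and thereby obtain the classification weight of each sample user under each customer group.
- The user classification apparatus according to claim 8, characterized in that the classification weight acquisition module is configured to: for each customer group, calculate the Euclidean distance between the predicted safety score of the sample user under the customer group model corresponding to that customer group and the true safety score of that customer group, and use it as the relative deviation value; and thereby obtain the relative deviation value of the sample user under each customer group.
- An electronic device, characterized by comprising: one or more processors; and a storage device configured to store one or more programs; wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 6.
- A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.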
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210635205.1 | 2022-06-06 | ||
CN202210635205.1A CN114897099A (zh) | 2022-06-06 | 2022-06-06 | User classification method, apparatus, and electronic device based on customer-group deviation smoothing optimization |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023236588A1 (zh) | 2023-12-14 |
Family
ID=82728567
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/077882 WO2023236588A1 (zh) | 2022-06-06 | 2023-02-23 | User classification method and apparatus based on customer-group deviation smoothing optimization |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN114897099A (zh) |
WO (1) | WO2023236588A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114897099A (zh) * | 2022-06-06 | 2022-08-12 | 上海淇玥信息技术有限公司 | User classification method, apparatus, and electronic device based on customer-group deviation smoothing optimization |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190349391A1 (en) * | 2018-05-10 | 2019-11-14 | International Business Machines Corporation | Detection of user behavior deviation from defined user groups |
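CN111080123A (zh) * | 2019-12-14 | 2020-04-28 | 支付宝(杭州)信息技术有限公司 | User risk assessment method and apparatus, electronic device, and storage medium |
CN111967910A (zh) * | 2020-08-18 | 2020-11-20 | 中国银行股份有限公司 | User customer-group classification method and apparatus |
CN112307472A (zh) * | 2020-11-03 | 2021-02-02 | 平安科技(深圳)有限公司 | Abnormal user identification method and apparatus based on intelligent decision-making, and computer device |
CN112950359A (zh) * | 2021-03-30 | 2021-06-11 | 建信金融科技有限责任公司 | User identification method and apparatus |
CN113254510A (zh) * | 2021-07-06 | 2021-08-13 | 平安科技(深圳)有限公司 | Method, apparatus, device, and storage medium for identifying business-risk customer groups |
CN114897099A (zh) * | 2022-06-06 | 2022-08-12 | 上海淇玥信息技术有限公司 | User classification method, apparatus, and electronic device based on customer-group deviation smoothing optimization |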
Also Published As
Publication number | Publication date |
---|---|
CN114897099A (zh) | 2022-08-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23818756; Country of ref document: EP; Kind code of ref document: A1 |