CN116887434A - International roaming user resource allocation method and device - Google Patents

International roaming user resource allocation method and device

Info

Publication number
CN116887434A
Authority
CN
China
Prior art keywords
international roaming
clustering
index
indexes
users
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210305220.XA
Other languages
Chinese (zh)
Inventor
范辉
傅晓华
李冰
景昕
杨猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Information Technology Co Ltd
Priority to CN202210305220.XA
Publication of CN116887434A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/70: Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Mobile Radio Communication Systems (AREA)

Abstract

The application relates to the field of computers and provides a method and a device for allocating resources of an international roaming user. The method comprises the following steps: determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user; performing resource allocation on the target international roaming user based on the score of the target international roaming user; the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model, and therefore the resource allocation efficiency of the international roaming users is improved.

Description

International roaming user resource allocation method and device
Technical Field
The application relates to the technical field of computers, in particular to a method and a device for allocating resources of an international roaming user.
Background
Currently, communication demand in the international roaming market is growing rapidly, and the pressure on telecom operators is growing with it. When formulating operation, marketing and service strategies, operators urgently need to solve the problem of how to apply personalized service strategies to different users, so as to achieve precise operation and allocate limited resources to users more reasonably.
Disclosure of Invention
The embodiment of the application provides a method and a device for allocating resources of an international roaming user, which are used for solving the technical problem of low resource allocation efficiency of the international roaming user.
In a first aspect, an embodiment of the present application provides a method for allocating resources of an international roaming user, including:
determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
performing resource allocation on the target international roaming user based on the score of the target international roaming user;
the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
In one embodiment, before the determining the score of the target international roaming user based on the scoring card and the index values of the plurality of indexes of the target international roaming user, the method further includes:
acquiring international roaming data samples of the plurality of users, wherein each international roaming data sample comprises a plurality of indexes to be selected and index values corresponding to the indexes to be selected;
dividing the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis;
binning the candidate indexes, calculating the information value (IV) of each binned candidate index, screening the binned candidate indexes based on the IV values, and determining the modeling indexes;
inputting index values corresponding to the modeling indexes into a regression model for training, determining at least one scoring index from the indexes to be selected, and determining regression coefficients of the scoring indexes;
and determining the scoring card based on the regression coefficient of each scoring index.
In one embodiment, the separating the international roaming data samples of the plurality of users into positive and negative samples by cluster analysis comprises:
clustering the international roaming data samples of the plurality of users based on two clustering centers of the ith clustering, and dividing the international roaming data samples of the plurality of users into two categories of the ith clustering, wherein i is a positive integer, and the initial value of i is 1;
respectively calculating sample mean values of international roaming data samples in two categories of the ith clustering to obtain two clustering center points of the (i+1) th clustering;
stopping clustering in the case that the two cluster center points of the (i+1)-th clustering are the same as the two cluster centers of the i-th clustering; or,
in the case that the two cluster center points of the (i+1)-th clustering are different from the two cluster centers of the i-th clustering, incrementing i by 1 and continuing to cluster the international roaming data samples of the plurality of users until the two cluster center points of the (i+1)-th clustering are the same as the two cluster centers of the i-th clustering, and then stopping clustering;
and determining the positive sample and the negative sample according to two categories obtained by last clustering of the international roaming data samples of the plurality of users.
In one embodiment, the binning the candidate indexes, calculating the information value (IV) of each binned candidate index, screening the binned candidate indexes based on the IV values, and determining the modeling indexes includes:
binning the index values of the candidate indexes by quantile grouping or cluster analysis to obtain a plurality of bin intervals corresponding to each candidate index;
determining the weight of evidence (WOE) value of each candidate index based on the number of positive samples and the number of negative samples corresponding to the plurality of bin intervals, and determining the IV value of each candidate index based on the WOE value;
and determining at least one modeling index from the candidate indexes based on the IV values of the candidate indexes.
In one embodiment, the clustering the international roaming data samples of the plurality of users based on the two clustering centers of the ith clustering includes:
calculating the distance between each international roaming data sample and two clustering centers of the ith clustering based on the index value of each index to be selected of each international roaming data sample;
dividing each international roaming data sample into one of two categories of the ith clustering based on the distance between the international roaming data sample and two clustering centers of the ith clustering.
In a second aspect, an embodiment of the present application provides an international roaming user resource allocation apparatus, including:
a scoring module for: determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
a resource allocation module for: performing resource allocation on the target international roaming user based on the score of the target international roaming user;
The scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor and a memory storing a computer program, where the processor implements the method for allocating resources of an international roaming user according to the first aspect when executing the program.
In a fourth aspect, embodiments of the present application provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the international roaming user resource allocation method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product, which comprises a computer program, wherein the computer program when executed by a processor implements the method for allocating resources of an international roaming user according to the first aspect.
According to the international roaming user resource allocation method and device provided by the embodiments of the application, the scoring card is determined from regression coefficients obtained by performing cluster analysis on the international roaming data samples of a plurality of users and performing regression analysis on each index of those samples through a regression model. The target international roaming user is comprehensively scored based on the scoring card, and resources are allocated to the target international roaming user based on that score, so that limited resources can be allocated to users more reasonably and the resource allocation efficiency for international roaming users is improved.
Drawings
In order to more clearly illustrate the application or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the application, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for allocating resources of an international roaming user provided by the application;
Fig. 2 is a schematic flow chart of determining the scoring card provided by the application;
Fig. 3 is a schematic flow chart of dividing the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis;
Fig. 4 is a schematic flow chart of binning the candidate indexes, calculating the information value (IV) of each binned candidate index, screening the binned candidate indexes based on the IV values, and determining the modeling indexes;
Fig. 5 is a schematic structural diagram of an international roaming user resource allocation device provided by the application;
Fig. 6 is a schematic structural diagram of an electronic device provided by the application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Fig. 1 is a schematic flow chart of a method for allocating resources of an international roaming user according to an embodiment of the present application. Referring to fig. 1, an embodiment of the present application provides a method for allocating resources of an international roaming user, which may include: step 100 and step 101.
Precise operation and efficient resource allocation for international roaming users presuppose insight into and classification of users: user groups must be segmented and user value differentiated.
Because the international roaming market started relatively late, roaming data has few dimensions, and there is little big-data analysis experience or tooling in the roaming market to draw on, most operators classify and evaluate users mainly with a business-rule method. This method divides users into tiers according to business experience and rules, using data such as communication usage, consumption amount and product orders, and then observes the proportion of users, communication behavior preferences, product ordering preferences, consumption capability and so on in each tier.
International roaming users are complex individuals with complex characteristics and varied roles, so the related user classification techniques largely fail to score users accurately. The business-rule analysis method usually depends on the experience of business personnel when selecting scoring indexes and dividing index intervals: the importance of the indexes and the rules for dividing tiers are judged by human experience, which makes the final scoring result highly subjective.
In addition, the related art uses the RFM value analysis method to classify user value. This method selects service indexes along the three core dimensions of a user's international roaming communication, namely recent consumption (Recency), consumption frequency (Frequency) and consumption amount (Monetary), standardizes the data, performs cluster analysis with the K-Means algorithm to obtain segmented user groups, and analyzes their features to obtain a user value analysis model.
Although the segmented user groups obtained by clustering in the RFM value analysis method can describe user characteristics qualitatively, they make quantitative evaluation of users difficult. Moreover, using only the three evaluation dimensions R, F and M and their indexes is incomplete, so a comprehensive evaluation of users is hard to achieve.
Step 100, determining the score of the target international roaming user based on the scoring card and index values of a plurality of indexes of the target international roaming user.
The scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
Optionally, the scoring card includes the scoring standard of each index under each scoring dimension, that is, the score corresponding to each index value interval.
For example, there may be three scoring dimensions: consumption frequency, consumption amount, and ordered products. Each scoring dimension corresponds to at least one scoring index; for example, the consumption frequency dimension includes indexes such as travel frequency and stay duration.
The score value of each scoring dimension may be calculated according to the following formula:
$$C_m = \sum_{p=1}^{k} t_p$$
where $C_m$ is the score value of the m-th scoring dimension, $k$ is the number of scoring indexes corresponding to the m-th scoring dimension, $t_p$ is the score of the p-th scoring index, and $p$ is a positive integer less than or equal to $k$.
$t_p$ is determined according to the scoring card, which records the correspondence between the index values and the scores of each index.
For example, a stay duration within 24 hours corresponds to a score of 1 point, a stay of 1 to 7 days to 3 points, a stay of 7 to 30 days to 5 points, and a stay of more than 30 days to 7 points.
And accumulating the scores of the scoring indexes corresponding to the scoring dimensionalities to obtain the scoring values of the scoring dimensionalities.
The score of the target international roaming user is determined from the score value and the weight of each scoring dimension by using the following formula:
$$S = \sum_{m=1}^{M} P_m C_m$$
where $S$ is the score of the target international roaming user, $M$ is the total number of scoring dimensions, $P_m$ is the weight of the m-th scoring dimension, and $C_m$ is the score value of the m-th scoring dimension.
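The two-level scoring described above can be sketched as follows. This is a minimal illustration, assuming the overall score is a weighted sum of the dimension scores; the names `SCORING_CARD`, `DIMENSIONS` and `WEIGHTS`, the inclusive upper bounds, and every number except the stay-duration example are hypothetical and not taken from the application.

```python
from bisect import bisect_left

# Hypothetical scoring card. Each index maps to (upper bounds, scores): a value
# falls in the first interval whose upper bound is >= the value, and values
# above the last bound take the final score. The stay_hours row follows the
# stay-duration example in the text; all other numbers are invented.
SCORING_CARD = {
    "stay_hours": ([24, 168, 720], [1, 3, 5, 7]),
    "trips_per_year": ([2, 5], [1, 3, 5]),
    "spend": ([100, 500], [1, 3, 5]),
}
# Hypothetical scoring dimensions, their scoring indexes, and weights P_m.
DIMENSIONS = {
    "consumption_frequency": ["stay_hours", "trips_per_year"],
    "consumption_amount": ["spend"],
}
WEIGHTS = {"consumption_frequency": 0.5, "consumption_amount": 0.5}


def index_score(index, value):
    """Look up t_p: the score of the interval that contains `value`."""
    bounds, scores = SCORING_CARD[index]
    return scores[bisect_left(bounds, value)]


def dimension_score(dimension, user):
    """C_m: sum of the index scores t_p under one scoring dimension."""
    return sum(index_score(ix, user[ix]) for ix in DIMENSIONS[dimension])


def user_score(user):
    """S: sum of the dimension scores C_m weighted by P_m."""
    return sum(WEIGHTS[m] * dimension_score(m, user) for m in DIMENSIONS)
```

A user is passed as a plain mapping from index name to index value, for example `user_score({"stay_hours": 100, "trips_per_year": 6, "spend": 300})`.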
Step 101, allocating resources to the target international roaming user based on the score of the target international roaming user.
The score of the target international roaming user obtained in step 100 may be used to comprehensively evaluate the value of the target international roaming user, so that the allocation of resources may be performed based on the score.
Optionally, resources may be allocated according to the score. For example, resources may be tilted toward users with high scores, or a plurality of score intervals may be defined and a different service policy formulated for the users in each score interval.
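Both allocation options mentioned above can be sketched in a few lines. The thresholds, tier names and function names below are hypothetical and chosen only to make the example concrete; the application does not specify them.

```python
from bisect import bisect_right

# Hypothetical score thresholds and service tiers; not taken from the source.
THRESHOLDS = [3.0, 6.0]
TIERS = ["basic", "standard", "premium"]


def service_tier(score):
    """Map a score to the tier of the score interval that contains it."""
    return TIERS[bisect_right(THRESHOLDS, score)]


def top_users(scores, n):
    """Return the n user ids with the highest scores, highest first."""
    return sorted(scores, key=scores.get, reverse=True)[:n]
```

Here `service_tier` corresponds to dividing users into score intervals with one policy per interval, and `top_users` corresponds to tilting a limited resource toward the highest-scoring users.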
According to the international roaming user resource allocation method provided by the embodiment of the application, the scoring card is determined from regression coefficients obtained by performing cluster analysis on the international roaming data samples of a plurality of users and performing regression analysis on each index of those samples through a regression model. The target international roaming user is comprehensively scored based on the scoring card, and resources are allocated to the target international roaming user based on that score, so that limited resources can be allocated to users more reasonably and the resource allocation efficiency for international roaming users is improved.
In some embodiments, before determining the score of the target international roaming user based on the scoring card and the index values of the plurality of indexes of the target international roaming user, the international roaming user resource allocation method further includes: and determining the scoring card.
Fig. 2 is a schematic flow chart of determining a scoring card according to an embodiment of the present application, and as shown in fig. 2, determining a scoring card includes steps 200, 201, 202, 203 and 204.
Step 200, obtaining international roaming data samples of the plurality of users, wherein each international roaming data sample comprises a plurality of indexes to be selected and index values corresponding to each index to be selected.
Optionally, international roaming data samples of multiple users are used to build a scoring system. The international roaming data sample of each user comprises data of a plurality of indexes to be selected, namely index values corresponding to the indexes to be selected.
The plurality of candidate indexes may be evaluation indexes, covering multiple dimensions of an international roaming scoring index system, that are sorted out according to international roaming service operation experience. The candidate evaluation indexes may include 43 evaluation indexes such as trip frequency, stay duration, consumption amount, call duration, number of ordered products, and number of peer numbers.
In order to ensure the scientificity and accuracy of the scoring cards established later, the acquired user data samples need to accurately reflect the characteristics of the international roaming behaviors of the users, the number of the user data samples is enough, and the number of the evaluation indexes included in each data sample is also enough.
Optionally, data such as the international roaming communication details, outbound signaling locations and product ordering details of international roaming users in a given year are aggregated into a two-dimensional database table with the user's mobile phone number as the unique identifier. The table contains the basic communication and consumption indexes of each day of travel of the international roaming users.
According to a rule for judging the continuity of a user's travel, the table can be converted into a wide table with one row per outbound trip of each international roaming user. The basic indexes are summarized on this basis, and for each index of each data sample the 75th percentile of the user's values over the year is taken, forming an annual outbound behavior index wide table of international roaming users.
From this annual outbound behavior index wide table, stratified random sampling is performed to select 300,000 data samples of 300,000 international roaming users, where each data sample includes data of a plurality of candidate evaluation indexes.
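The per-user summarization step can be illustrated with a small sketch. It assumes a nearest-rank definition of the 75th percentile and a per-trip row keyed by a hypothetical `msisdn` field; the application does not state which percentile definition or field names are used.

```python
import math
from collections import defaultdict


def percentile_nearest_rank(values, q=0.75):
    """Nearest-rank percentile: the smallest value with at least q of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def annual_wide_table(trip_rows, index_names):
    """Collapse per-trip rows into one row per user, keyed by phone number."""
    per_user = defaultdict(lambda: defaultdict(list))
    for row in trip_rows:
        for name in index_names:
            per_user[row["msisdn"]][name].append(row[name])
    return {
        user: {name: percentile_nearest_rank(vals) for name, vals in cols.items()}
        for user, cols in per_user.items()
    }
```

Each resulting row holds one value per index for one user, which is the shape the later clustering and binning steps operate on.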
Step 201, separating the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis.
Optionally, the positive samples are the international roaming data samples of users with relatively high consumption, relatively high usage, relatively large contact circles, and relatively frequent calls. By performing cluster analysis on the international roaming data samples of the plurality of users, the samples can be divided into positive samples and negative samples.
Fig. 3 is a schematic flow chart of separating the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis according to an embodiment of the present application.
In some embodiments, as shown in fig. 3, step 201 includes: step 300, step 301, step 302 and step 303.
Step 300, clustering the international roaming data samples of the plurality of users based on the two cluster centers of the i-th clustering, and dividing the international roaming data samples of the plurality of users into the two categories of the i-th clustering, where i is a positive integer and the initial value of i is 1.
When i is 1, the two cluster centers of the 1st clustering may be two international roaming data samples randomly selected from the international roaming data samples of the plurality of users.
When i is greater than 1, the two cluster centers of the i-th clustering are determined from the two categories of international roaming data samples obtained in the (i-1)-th clustering.
Optionally, calculating the distance between each international roaming data sample and two clustering centers of the ith clustering based on the index value of each candidate index of each international roaming data sample;
The calculation formula is as follows:
$$d(x_1, x_j) = \sqrt{\sum_{k=1}^{d} \left(x_{1k} - x_{jk}\right)^2}$$
where $x_1$ is a cluster center, $x_j$ is an international roaming data sample, $d(x_1, x_j)$ is the distance between the international roaming data sample $x_j$ and the cluster center $x_1$, $x_{1k}$ is the index value of the k-th candidate index of the cluster center $x_1$, $x_{jk}$ is the index value of the k-th candidate index of the international roaming data sample $x_j$, and the upper limit $d$ of the summation is the number of candidate indexes.
In the same way, the distance $d(x_2, x_j)$ between the international roaming data sample $x_j$ and the cluster center $x_2$ can be calculated.
Dividing each international roaming data sample into one of two categories of the ith clustering based on the distance between the international roaming data sample and two clustering centers of the ith clustering.
Alternatively, two cluster centers each correspond to a category.
Compare $d(x_2, x_j)$ with $d(x_1, x_j)$. If $d(x_2, x_j)$ is greater than $d(x_1, x_j)$, the international roaming data sample $x_j$ is assigned to the category corresponding to the cluster center $x_1$; if $d(x_2, x_j)$ is less than $d(x_1, x_j)$, the international roaming data sample $x_j$ is assigned to the category corresponding to the cluster center $x_2$.
And calculating the distance between each international roaming data sample and two clustering centers, comparing the two distances, distributing each user sample to the nearest clustering center according to the principle of minimum distance, and dividing the international roaming data samples into two categories.
Step 301, calculating sample means of international roaming data samples in two categories of the ith clustering respectively to obtain two clustering center points of the (i+1) th clustering.
The international roaming data samples in each of the two categories are averaged to obtain the new cluster center points, that is, the two cluster center points for the next clustering.
Step 302, stopping clustering under the condition that two clustering center points of the (i+1) -th clustering are the same as two clustering centers of the (i) -th clustering; or,
in the case that the two cluster center points of the (i+1)-th clustering are different from the two cluster centers of the i-th clustering, incrementing i by 1 and continuing to cluster the international roaming data samples of the plurality of users until the two cluster center points of the (i+1)-th clustering are the same as the two cluster centers of the i-th clustering, and then stopping clustering.
Optionally, in the case that the two cluster centers of the (i+1)-th clustering are different from the two cluster centers of the i-th clustering, i is incremented by 1 and steps 300, 301 and 302 are repeated until the cluster centers of two consecutive clusterings no longer change. At that point the clustering result is considered good and clustering can stop.
Step 303, determining the positive samples and the negative samples according to the two categories obtained by the last clustering of the international roaming data samples of the plurality of users.
Optionally, the two types of international roaming data samples obtained by the last clustering are positive samples and negative samples respectively.
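Steps 300 to 303 can be sketched as a two-center clustering loop. This is a minimal illustration, not the applicant's implementation: it assumes Euclidean distance, assigns ties to the first center, and adds a `max_iter` safety cap and a `seed` parameter that the text does not mention.

```python
import math
import random


def distance(a, b):
    """Euclidean distance between two samples (an assumed metric)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def mean(samples):
    """Component-wise mean of a non-empty list of samples."""
    n = len(samples)
    return tuple(sum(col) / n for col in zip(*samples))


def two_means(samples, seed=0, max_iter=100):
    """Split samples into two categories, iterating until the centers stop changing."""
    rng = random.Random(seed)
    # First clustering: two randomly selected samples serve as the centers.
    centers = [tuple(s) for s in rng.sample(samples, 2)]
    groups = ([], [])
    for _ in range(max_iter):
        # Step 300: assign every sample to its nearest center.
        groups = ([], [])
        for s in samples:
            nearest = 0 if distance(s, centers[0]) <= distance(s, centers[1]) else 1
            groups[nearest].append(s)
        if not groups[0] or not groups[1]:
            break  # degenerate split (e.g. identical initial centers)
        # Step 301: the category means become the next pair of centers.
        new_centers = [mean(groups[0]), mean(groups[1])]
        # Step 302: stop once the centers are unchanged.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups
```

Which of the two returned groups is treated as the positive samples still has to be decided from the data, for example by comparing the consumption-related components of the two final centers.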
Step 202, binning the candidate indexes, calculating the information value (IV) of each binned candidate index, screening the binned candidate indexes based on the IV values, and determining the modeling indexes.
Fig. 4 is a schematic flow chart, according to an embodiment of the application, of binning the candidate indexes, calculating the information value (IV) of each binned candidate index, screening the binned candidate indexes based on the IV values, and determining the modeling indexes.
In some embodiments, as shown in fig. 4, step 202 includes: step 400, step 401 and step 402.
Step 400, binning the index values of the candidate indexes by quantile grouping or cluster analysis to obtain a plurality of bin intervals corresponding to each candidate index.
Optionally, the basic idea of quantile grouping is to divide the samples into several equal parts, so that each group contains the same number of samples and the group limits are the values at the corresponding quantiles. For example, to divide the samples into 4 equal parts, the variable values at the 25%, 50% and 75% quantiles are used as the group limits of the 4 groups.
The cluster-analysis binning method is based on K-means clustering. The continuous variable is first preprocessed and then normalized. The K-means clustering algorithm is then applied to divide the data into a plurality of intervals: starting from the initial centers of the K-means algorithm, the midpoints of adjacent cluster centers are used as cut points, and each object is added to the class whose cluster center is closest, so that the data is divided into a plurality of intervals. Each cluster center is then recalculated and the data is re-divided until no cluster center changes, which gives the final clustering result.
The continuous variables may be discretized, i.e., binned, using a quantile or clustering binning method.
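Quantile grouping can be sketched as follows. The cut points here use a nearest-rank rule, which is one of several common quantile definitions and is an assumption; the function names are illustrative only.

```python
import math
from bisect import bisect_left


def quantile_edges(values, n_bins=4):
    """Return the n_bins - 1 group limits that split values into equal-count bins."""
    ordered = sorted(values)
    n = len(ordered)
    # Nearest-rank cut points at 1/n_bins, 2/n_bins, ... of the sorted sample.
    return [ordered[math.ceil(i * n / n_bins) - 1] for i in range(1, n_bins)]


def assign_bin(value, edges):
    """Index of the bin for `value`; each group limit is the inclusive upper bound of its bin."""
    return bisect_left(edges, value)
```

With heavily tied data some group limits can coincide, which leaves the corresponding bins empty; in practice such bins would be merged.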
Step 401, determining the weight of evidence (WOE) value of each candidate index based on the number of positive samples and the number of negative samples corresponding to the plurality of bin intervals, and determining the IV value of each candidate index based on the WOE value.
The binning needs to follow the following basic principles:
the number of the sub-boxes should be moderate, and should not be too much or too little. Too little differentiation is insufficient, too much stability is not strong, and management is inconvenient; the number of records in each sub-box is reasonable, and the number of records in each sub-box is not too large or too small; in combination with the target variable, the bin should exhibit significant trend characteristics; the target variable distribution differences of adjacent bins should be large.
Therefore, to enhance the binning effect, it is determined whether the variables are important to predict the target variable and whether binning is reasonable by calculating evidence weight (Weight of Evidence, WOE) and information values (Information Value, IV) from the target variable.
Based on the number of positive samples and the number of negative samples corresponding to the plurality of box intervals, the evidence weight WOE value of each candidate index is determined by using the following formula:
where distrBad_a is the proportion of negative samples falling in bin interval a, distrGood_a is the proportion of positive samples falling in bin interval a, and WOE_a is the WOE value of the candidate index in bin interval a.
Further, the IV value may be calculated from the WOE values as follows:
where e is the number of bin intervals of the candidate index, and IV is the IV value of the candidate index.
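As an illustration of the two quantities, the sketch below assumes the conventional definitions WOE_a = ln(distrGood_a / distrBad_a) and IV = sum over the e bins of (distrGood_a - distrBad_a) * WOE_a. Some sources invert the ratio inside the logarithm, which flips the sign of WOE but leaves IV unchanged, so the sign convention here is an assumption. The function name, the smoothing term `eps` and the sample counts are hypothetical.

```python
import numpy as np

def woe_iv(pos_counts, neg_counts, eps=1e-6):
    """Compute per-bin WOE values and the overall IV of one candidate index.

    pos_counts[a] / neg_counts[a] are the numbers of positive / negative
    samples in bin interval a.
    """
    pos = np.asarray(pos_counts, dtype=float)
    neg = np.asarray(neg_counts, dtype=float)
    distr_good = pos / pos.sum()   # share of all positive samples in each bin
    distr_bad = neg / neg.sum()    # share of all negative samples in each bin
    # eps avoids log(0) when a bin holds no positive or no negative samples
    # (an assumption of this sketch, not part of the source text).
    woe = np.log((distr_good + eps) / (distr_bad + eps))
    iv = float(np.sum((distr_good - distr_bad) * woe))
    return woe, iv

# Hypothetical counts for an index with three bin intervals.
woe, iv = woe_iv(pos_counts=[10, 30, 60], neg_counts=[50, 30, 20])
```

Here the first bin is dominated by negative samples and gets a negative WOE, the second bin is balanced and gets a WOE of zero, and the third is dominated by positive samples and gets a positive WOE.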
Step 402, determining at least one modeling index from the candidate indexes based on the magnitude of the IV value of each candidate index.
The IV value indicates how much a candidate index contributes to predicting the user's score. Variables may be selected according to the following criteria:
When IV < 0.02, the candidate index is of almost no help in predicting the user's score.
When 0.02 ≤ IV ≤ 0.1, the candidate index is of some help in predicting the user's score.
When 0.1 < IV ≤ 0.3, the candidate index is of considerable help in predicting the user's score.
When IV > 0.3, the candidate index is of very great help in predicting the user's score.
Optionally, a candidate index whose IV value is greater than 0.1 may be determined as a modeling index.
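The screening rule above can be expressed in a few lines. The index names and IV values below are invented for illustration, and the band labels are shorthand for the four criteria listed above.

```python
def iv_band(iv):
    """Classify an IV value into the four bands listed above."""
    if iv < 0.02:
        return "almost none"
    if iv <= 0.1:
        return "some"
    if iv <= 0.3:
        return "considerable"
    return "very great"

def select_modeling_indexes(iv_by_index, threshold=0.1):
    """Keep the candidate indexes whose IV value is strictly greater than the threshold."""
    return [name for name, iv in iv_by_index.items() if iv > threshold]

# Hypothetical IV values for four candidate indexes.
iv_by_index = {
    "roaming_days": 0.35,
    "data_usage": 0.12,
    "sms_count": 0.01,
    "call_minutes": 0.10,
}
selected = select_modeling_indexes(iv_by_index)
```

Note that an index with an IV of exactly 0.1 is not selected, because the criterion is "greater than 0.1".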
Step 203, inputting the index values corresponding to the modeling indexes into a regression model for training, determining at least one scoring index from the plurality of candidate indexes, and determining the regression coefficient of each scoring index.
Alternatively, the regression model may be a Logistic regression model.
The Logistic regression model is a widely used data analysis technique when the predicted target variable is a discrete variable. The model form of Logistic regression is:
logit(p)=β_0+β_1*x_1+…+β_t*x_t
where p is the probability that the outcome of interest occurs, for example the probability that a user is a high-frequency travel user; β_0 is the constant term of the regression equation and β_1, …, β_t are the regression coefficients; x_1, …, x_t are the independent variables input to the model, namely the index values of the modeling indexes.
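To make the relationship between logit(p) and p concrete, the following sketch evaluates the linear predictor and converts it back to a probability with the inverse of the logit function. The coefficient and input values are made up for illustration.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))

def predict_probability(beta0, betas, xs):
    """Evaluate beta0 + sum(beta_k * x_k) and map it back to a probability."""
    z = beta0 + sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and index values chosen so the linear predictor is 0.
p = predict_probability(beta0=-1.0, betas=[0.8, 0.5], xs=[1.0, 0.4])
```

A linear predictor of zero corresponds to even odds, so the resulting probability is one half and its logit is zero again.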
In general, when an independent variable input to the model is a discrete variable, it is converted into dummy variables before processing. However, when a scoring card is constructed, all independent variables have been converted into discrete variables, and converting all of them into dummy variables would lose a great deal of information, because dummy variables treat the difference between any two adjacent values (two bins) of an independent variable as the same, which clearly does not match reality. Optionally, the WOE value of each variable bin may be used as the input variable of the Logistic regression, which fully takes the differences between bins into account while preserving the trend of each variable with respect to the distribution of the target variable.
When Logistic regression is used for prediction, either all independent variables may be entered into the model, or the model may be left to select which independent variables finally enter it.
Optionally, the present embodiment uses the stepwise method to select the independent variables that enter the model. This method combines the forward method and the backward method: the model first finds, among all independent variables, the one with the strongest predictive power and enters it into the model, and then enters the one with the next strongest predictive power, and so on; if, while variables are being entered, the predictive power of a variable already in the model is found to have become weak, that variable is removed from the model. This continues until an optimal combination of independent variables is found, that is, the optimal combination of modeling indexes is found and used as the scoring indexes.
After the optimal combination of modeling indexes is found, the regression coefficient of each scoring index is determined.
Step 204, determining a scoring card based on the regression coefficients of the scoring indexes.
Optionally, the regression coefficients are converted into behavioral scores by a scaling process.
So that the scores are convenient for business personnel to use and the differences between scores carry business meaning, the generated scores need to meet the following requirements:
The score needs to be kept within a certain range, for example between 0 and 1000 points; at a given score, the positive and negative samples should stand in a given ratio, which is expressed statistically by the odds, where odds = proportion of positive samples / proportion of negative samples (for example, the ratio of positive to negative samples is expected to be 50:1 at a score of 500 points); and an increase in the score should reflect the change in the ratio of good users to bad users (for example, the odds are expected to double for every 50-point increase in the score).
A scoring card formula commonly used in the industry may be employed:
Score=Offset+Factor*ln(odds) (1)
where Score is the score, Offset is the offset, Factor is the scaling coefficient, and odds is the odds.
In order to meet the scoring requirements described above, the following equation needs to be satisfied:
Score+pdo=Offset+Factor*ln(2*odds) (2)
where pdo (points to double the odds) is the number of points by which the score must increase for the odds to double.
From equation (1) and equation (2) it can be determined that:
pdo=Factor*ln2
namely:
Factor=pdo/ln2
Offset=Score-Factor*ln(odds)
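The derivation above can be checked numerically. The sketch below uses the requirement stated earlier (odds of 50:1 at 500 points, doubling every 50 points); the function names are hypothetical.

```python
import math

def scaling_parameters(base_score, base_odds, pdo):
    """Solve formulas (1) and (2) for Factor and Offset."""
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    return factor, offset

def score_from_odds(odds, factor, offset):
    """Formula (1): Score = Offset + Factor * ln(odds)."""
    return offset + factor * math.log(odds)

factor, offset = scaling_parameters(base_score=500, base_odds=50, pdo=50)
score_at_base = score_from_odds(50, factor, offset)      # odds 50:1
score_at_double = score_from_odds(100, factor, offset)   # odds doubled to 100:1
```

With these parameters the score at the base odds reproduces the base score, and doubling the odds raises the score by exactly pdo points, which is what formula (2) requires.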
For example, if the odds are 30:1 at a score of 500 and the odds double for every 50-point increase in the score, then from the equations above:
Factor=50/ln2≈72.1348
Offset=500-72.1348*ln(30)
From the Logistic regression equation, logit(p)=ln(odds); substituting this into formula (1) yields:
Score=Offset+Factor*(α+∑β_j*WOE_j)
where WOE_j is the WOE value of the bin into which the j-th variable falls, α is the constant term of the Logistic regression result, and β_j is the coefficient of the j-th variable.
In the embodiment of the present application, the offset and the factor are set as follows:
Factor=20/np.log(2)
Offset=800-20*np.log(20)/np.log(2)
The characteristic variables are then input according to formula (1) to complete the release of the scoring card.
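A minimal sketch of how a score could be computed from the settings above is given below. The settings correspond to pdo = 20 with a score of 800 at odds of 20:1; the intercept, coefficients and WOE values are hypothetical and chosen only so that the result is easy to verify.

```python
import math

# Settings of the embodiment: Factor = 20 / ln(2), Offset = 800 - Factor * ln(20).
FACTOR = 20 / math.log(2)
OFFSET = 800 - 20 * math.log(20) / math.log(2)

def scorecard_score(alpha, betas, woes):
    """Score = Offset + Factor * (alpha + sum_j beta_j * WOE_j)."""
    log_odds = alpha + sum(b * w for b, w in zip(betas, woes))
    return OFFSET + FACTOR * log_odds

# Hypothetical model: the intercept alone gives odds of 20:1.
alpha = math.log(20)
base = scorecard_score(alpha, betas=[0.9, 1.1], woes=[0.0, 0.0])
# A single variable whose bin contributes ln(2) to the log-odds doubles the odds.
doubled = scorecard_score(alpha, betas=[1.0], woes=[math.log(2)])
```

Because the score is linear in the log-odds, each bin's contribution Factor * β_j * WOE_j can be precomputed and stored as the points for that index value interval in the scoring card.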
According to the international roaming user resource allocation method provided by the embodiment of the application, the scoring card is determined from regression coefficients, which are obtained by performing cluster analysis on the international roaming data samples of a plurality of users and performing regression analysis on each index of those samples with a regression model. The target international roaming user is scored comprehensively based on the scoring card, and resources are allocated to the target international roaming user based on that score, so that limited resources can be allocated to users more reasonably and the efficiency of resource allocation for international roaming users is improved.
In some embodiments, the method further includes verifying the effect of the model: first the fit of the Logistic model is checked, and then the usability of the scores is checked.
Optionally, checking the fit of the Logistic model includes a significance test of the regression coefficients and a goodness-of-fit test of the regression equation. The purpose of the significance test is to test, one by one, whether each input variable in the equation has a significant linear relationship with logit(p) and contributes significantly to explaining logit(p). The test statistic is the Wald statistic, which is mathematically defined as:
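A common textbook form of the Wald statistic is shown below. It is given as an assumption, because the definition itself does not appear in this text. Here \hat{\beta}_j denotes the estimated coefficient of the j-th input variable and SE(\hat{\beta}_j) its standard error; under the null hypothesis \beta_j = 0 the statistic approximately follows a chi-square distribution with one degree of freedom.

```latex
\mathrm{Wald}_j = \left( \frac{\hat{\beta}_j}{\mathrm{SE}(\hat{\beta}_j)} \right)^{2}
```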
The goodness of fit of the regression equation is checked using the Nagelkerke R^2 statistic and the confusion matrix. Nagelkerke R^2 reflects the extent to which the equation explains the output variable; it ranges between 0 and 1, and the closer it is to 1, the higher the goodness of fit of the equation, while the closer it is to 0, the lower the goodness of fit. The confusion matrix is an intuitive way to evaluate a model: it shows, in the form of a matrix table, how well the values predicted by the model agree with the actually observed values.
The scoring result is checked using the K-S index method. The abscissa represents the score values, arranged in descending order, and the ordinate represents the cumulative percentage. The two curves respectively represent the cumulative proportions of positive samples and negative samples at the corresponding score values. When the model is valid, the cumulative proportion curve of the negative samples should lie above that of the positive samples, and the farther apart the two curves are, the better the model distinguishes positive samples from negative samples.
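The K-S check can be sketched as the largest gap between the two cumulative curves. The function name and sample data are hypothetical; the sketch takes the absolute gap, so it does not depend on which curve lies above the other, and it treats every sample as its own point on the abscissa, so tied scores are not grouped.

```python
import numpy as np

def ks_statistic(scores, labels):
    """Largest gap between the cumulative shares of positive (1) and negative (0) samples."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")   # walk through scores in descending order
    sorted_labels = labels[order]
    cum_pos = np.cumsum(sorted_labels == 1) / np.sum(labels == 1)
    cum_neg = np.cumsum(sorted_labels == 0) / np.sum(labels == 0)
    return float(np.max(np.abs(cum_pos - cum_neg)))

# Hypothetical scores: the first set separates the classes perfectly, the second only partly.
ks_perfect = ks_statistic([900, 850, 800, 400, 350, 300], [1, 1, 1, 0, 0, 0])
ks_partial = ks_statistic([4, 3, 2, 1], [1, 0, 1, 0])
```

A value near 1 means the two curves are far apart and the scores separate the classes well; a value near 0 means the scores carry little discriminating power.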
Fig. 5 is a schematic structural diagram of an international roaming user resource allocation apparatus according to an embodiment of the present application, and as shown in fig. 5, an international roaming user resource allocation apparatus 500 includes: a scoring module 510 and a resource allocation module 520.
Scoring module 510 for: determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
a resource allocation module 520 for: performing resource allocation on the target international roaming user based on the score of the target international roaming user;
the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
Optionally, the international roaming user resource allocation apparatus 500 further comprises a training module, configured to:
acquiring international roaming data samples of the plurality of users, wherein each international roaming data sample comprises a plurality of indexes to be selected and index values corresponding to the indexes to be selected;
dividing the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis;
binning the indexes to be selected, calculating the information value (IV) of each binned index to be selected, screening the binned indexes to be selected based on the IV values, and determining the modeling indexes;
Inputting index values corresponding to the modeling indexes into a regression model for training, determining at least one scoring index from the indexes to be selected, and determining regression coefficients of the scoring indexes;
and determining the scoring card based on the regression coefficient of each scoring index.
Optionally, the classifying the international roaming data samples of the plurality of users into positive samples and negative samples by cluster analysis includes:
clustering the international roaming data samples of the plurality of users based on two clustering centers of the ith clustering, and dividing the international roaming data samples of the plurality of users into two categories of the ith clustering, wherein i is a positive integer, and the initial value of i is 1;
respectively calculating sample mean values of international roaming data samples in two categories of the ith clustering to obtain two clustering center points of the (i+1) th clustering;
stopping clustering under the condition that the two clustering center points of the i+1th clustering are the same as the two clustering centers of the i th clustering; or,
in the case where the two clustering center points of the (i+1)-th clustering differ from the two clustering centers of the i-th clustering, incrementing i by 1 and continuing to cluster the international roaming data samples of the plurality of users until the two clustering center points of the (i+1)-th clustering are the same as the two clustering centers of the i-th clustering, and then stopping clustering;
And determining the positive sample and the negative sample according to two categories obtained by last clustering of the international roaming data samples of the plurality of users.
Optionally, the binning the indexes to be selected, calculating the information value (IV) of each binned index to be selected, screening the binned indexes to be selected based on the IV values, and determining the modeling indexes includes:
binning the index values of the indexes to be selected by quantile grouping or cluster analysis to obtain a plurality of bin intervals corresponding to the indexes to be selected;
determining the weight of evidence (WOE) value of each index to be selected based on the number of positive samples and the number of negative samples corresponding to the plurality of bin intervals, and determining the IV value of each index to be selected based on the WOE value;
and determining at least one modeling index from the indexes to be selected based on the IV values of the indexes to be selected.
Optionally, the clustering the international roaming data samples of the plurality of users based on the two clustering centers of the ith clustering includes:
Calculating the distance between each international roaming data sample and two clustering centers of the ith clustering based on the index value of each index to be selected of each international roaming data sample;
dividing each international roaming data sample into one of two categories of the ith clustering based on the distance between the international roaming data sample and two clustering centers of the ith clustering.
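The two-center clustering loop and the distance-based assignment described above can be sketched as follows. Euclidean distance, the `max_iter` safety cap, the tolerance-based comparison of centers and the sample values are assumptions of this sketch; the text only requires that clustering stop when the centers no longer change.

```python
import numpy as np

def two_center_kmeans(samples, init_centers, max_iter=100):
    """Split samples into two categories by iterating assignment and mean update."""
    x = np.asarray(samples, dtype=float)
    centers = np.asarray(init_centers, dtype=float)   # centers of the 1st clustering
    for _ in range(max_iter):
        # Distance from every sample to each of the two current centers.
        dists = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)             # nearest center wins
        # Centers of the next clustering are the sample means of each category;
        # an empty category keeps its previous center.
        new_centers = np.array([
            x[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(2)
        ])
        if np.allclose(new_centers, centers):         # centers unchanged: stop
            break
        centers = new_centers
    return labels, centers

# Hypothetical samples with two index values each (for example, trips and data usage).
samples = [[1, 1], [1, 2], [2, 1], [8, 8], [9, 8], [8, 9]]
labels, centers = two_center_kmeans(samples, init_centers=[[0, 0], [10, 10]])
```

The two resulting categories are then used as the positive and negative samples; which category is treated as positive is a business decision that this sketch does not make.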
It should be noted that, the above device provided in the embodiment of the present application can implement all the method steps implemented in the method embodiment and achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as those in the method embodiment in this embodiment are omitted.
Fig. 6 illustrates a schematic diagram of the physical structure of an electronic device. As shown in fig. 6, the electronic device may include: a processor 610, a communication interface (Communication Interface) 620, a memory 630, and a communication bus 640, wherein the processor 610, the communication interface 620, and the memory 630 communicate with each other via the communication bus 640. The processor 610 may call a computer program in the memory 630 to perform the steps of the international roaming user resource allocation method, for example, including:
Determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
performing resource allocation on the target international roaming user based on the score of the target international roaming user;
the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
Further, the logic instructions in the memory 630 may be implemented in the form of software functional units and stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In another aspect, embodiments of the present application further provide a computer program product, where the computer program product includes a computer program, where the computer program may be stored on a non-transitory computer readable storage medium, where the computer program when executed by a processor is capable of executing the steps of the international roaming user resource allocation method provided in the foregoing embodiments, for example, including:
determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
performing resource allocation on the target international roaming user based on the score of the target international roaming user;
the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
In another aspect, embodiments of the present application further provide a processor-readable storage medium storing a computer program for causing a processor to execute the steps of the method provided in the above embodiments, for example, including:
Determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
performing resource allocation on the target international roaming user based on the score of the target international roaming user;
the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
The processor-readable storage medium may be any available medium or data storage device that can be accessed by a processor, including, but not limited to, magnetic storage (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical storage (e.g., CD, DVD, BD, HVD, etc.), semiconductor storage (e.g., ROM, EPROM, EEPROM, nonvolatile storage (NAND FLASH), solid State Disk (SSD)), and the like.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. An international roaming user resource allocation method, comprising:
determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
performing resource allocation on the target international roaming user based on the score of the target international roaming user;
the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
2. The international roaming user resource allocation method of claim 1, wherein prior to the determining the target international roaming user score based on the scoring card and the index values of the plurality of indexes of the target international roaming user, the method further comprises:
acquiring international roaming data samples of the plurality of users, wherein each international roaming data sample comprises a plurality of indexes to be selected and index values corresponding to the indexes to be selected;
Dividing the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis;
binning the indexes to be selected, calculating the information value (IV) of each binned index to be selected, screening the binned indexes to be selected based on the IV values, and determining the modeling indexes;
inputting index values corresponding to the modeling indexes into a regression model for training, determining at least one scoring index from the indexes to be selected, and determining regression coefficients of the scoring indexes;
and determining the scoring card based on the regression coefficient of each scoring index.
3. The method for allocating resources of international roaming users according to claim 2, wherein the separating the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis comprises:
clustering the international roaming data samples of the plurality of users based on two clustering centers of the ith clustering, and dividing the international roaming data samples of the plurality of users into two categories of the ith clustering, wherein i is a positive integer, and the initial value of i is 1;
respectively calculating sample mean values of international roaming data samples in two categories of the ith clustering to obtain two clustering center points of the (i+1) th clustering;
Stopping clustering under the condition that the two clustering center points of the i+1th clustering are the same as the two clustering centers of the i th clustering; or,
in the case where the two clustering center points of the (i+1)-th clustering differ from the two clustering centers of the i-th clustering, incrementing i by 1 and continuing to cluster the international roaming data samples of the plurality of users until the two clustering center points of the (i+1)-th clustering are the same as the two clustering centers of the i-th clustering, and then stopping clustering;
and determining the positive sample and the negative sample according to two categories obtained by last clustering of the international roaming data samples of the plurality of users.
4. The international roaming user resource allocation method of claim 2, wherein the binning the indexes to be selected, calculating the information value (IV) of each binned index to be selected, screening the binned indexes to be selected based on the IV values, and determining the modeling indexes comprises:
binning the index values of the indexes to be selected by quantile grouping or cluster analysis to obtain a plurality of bin intervals corresponding to the indexes to be selected;
determining the weight of evidence (WOE) value of each index to be selected based on the number of positive samples and the number of negative samples corresponding to the plurality of bin intervals, and determining the IV value of each index to be selected based on the WOE value;
and determining at least one modeling index from the indexes to be selected based on the IV values of the indexes to be selected.
5. The international roaming user resource allocation method of claim 3, wherein the clustering of the international roaming data samples of the plurality of users based on the two clustering centers of the ith cluster, the classifying the international roaming data samples of the plurality of users into the two categories of the ith cluster comprises:
calculating the distance between each international roaming data sample and two clustering centers of the ith clustering based on the index value of each index to be selected of each international roaming data sample;
dividing each international roaming data sample into one of two categories of the ith clustering based on the distance between the international roaming data sample and two clustering centers of the ith clustering.
6. An international roaming user resource allocation apparatus, comprising:
A scoring module for: determining a score of a target international roaming user based on a scoring card and index values of a plurality of indexes of the target international roaming user;
a resource allocation module for: performing resource allocation on the target international roaming user based on the score of the target international roaming user;
the scoring card comprises scores corresponding to index value intervals of each index, the scoring card is determined based on regression coefficients, and the regression coefficients are obtained by carrying out cluster analysis on international roaming data samples of a plurality of users and carrying out regression analysis on each index of the international roaming data samples through a regression model.
7. The international roaming user resource allocation apparatus of claim 6, further comprising a training module configured to:
acquiring international roaming data samples of the plurality of users, wherein each international roaming data sample comprises a plurality of indexes to be selected and index values corresponding to the indexes to be selected;
dividing the international roaming data samples of the plurality of users into positive samples and negative samples through cluster analysis;
binning the indexes to be selected, calculating the information value (IV) of each binned index to be selected, screening the binned indexes to be selected based on the IV values, and determining the modeling indexes;
Inputting index values corresponding to the modeling indexes into a regression model for training, determining at least one scoring index from the indexes to be selected, and determining regression coefficients of the scoring indexes;
and determining the scoring card based on the regression coefficient of each scoring index.
8. An electronic device comprising a processor and a memory storing a computer program, characterized in that the processor implements the international roaming user resource allocation method of any one of claims 1 to 5 when executing the computer program.
9. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the international roaming user resource allocation method according to any of claims 1 to 5.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the international roaming user resource allocation method of any one of claims 1 to 5.
CN202210305220.XA 2022-03-25 2022-03-25 International roaming user resource allocation method and device Pending CN116887434A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210305220.XA CN116887434A (en) 2022-03-25 2022-03-25 International roaming user resource allocation method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210305220.XA CN116887434A (en) 2022-03-25 2022-03-25 International roaming user resource allocation method and device

Publications (1)

Publication Number Publication Date
CN116887434A true CN116887434A (en) 2023-10-13

Family

ID=88257353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210305220.XA Pending CN116887434A (en) 2022-03-25 2022-03-25 International roaming user resource allocation method and device

Country Status (1)

Country Link
CN (1) CN116887434A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination