CN115496338A - Electric power payment channel drainage method, system and medium based on big data technology - Google Patents

Info

Publication number
CN115496338A
Authority
CN
China
Prior art keywords
user
payment
time
electric power
payment channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211068628.6A
Other languages
Chinese (zh)
Inventor
张明杰
朱龙珠
邓志东
杨菁
龚健
宫立华
刘鲲鹏
朱青
王慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Co ltd Customer Service Center
Original Assignee
State Grid Co ltd Customer Service Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Co ltd Customer Service Center
Priority to CN202211068628.6A
Publication of CN115496338A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/08Payment architectures
    • G06Q20/14Payment architectures specially adapted for billing systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Evolutionary Computation (AREA)
  • Educational Administration (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Primary Health Care (AREA)

Abstract

The invention relates to the technical field of electric power big data, and in particular to an electric power payment channel drainage method, system and medium based on big data technology. The method constructs a drainage combination model comprising three sub-models: an electricity purchase prediction model, a customer segmentation model and a preferred channel recommendation model. These systematically address, respectively, when to drain, whom to drain, and to which channel. While drainage is carried out, potential customer groups are classified and matched with the channels they tend to prefer, achieving a win-win of improved enterprise operation and improved customer experience.

Description

Electric power payment channel drainage method, system and medium based on big data technology
Technical Field
The invention relates to the technical field of electric power big data, and in particular to an electric power payment channel drainage method, system and medium based on big data technology.
Background
With rapid economic growth, rising electricity consumption and an increasingly varied range of electricity customers, demand for electricity services is diversifying. For users, a faster way to pay matters. For power supply enterprises, gaining an advantage in the future power market requires tighter control over payment channels: continuously introducing online payment methods with lower operating cost, segmenting the paying user base, studying the user characteristics of each channel in depth, identifying potential online payers, and actively guiding them to online payment channels through precise drainage with personalized operating strategies, so as to form a power payment channel ecosystem with good customer experience and low operating cost for the power company.
At present, emerging technologies based on big data mining are applied ever more widely in the electric power field, and their data analysis and processing are increasingly accurate. The power industry also benefits from big data: more and more power company staff apply big data mining to power production, power service marketing, power dispatching and other areas. In the drainage of electric power payment channels, however, the prior art still has the following shortcomings:
(1) Existing channel drainage methods generally perform a single drainage step after user segmentation. They target either the channel or the user alone, do not consider the characteristics of both, and so cannot combine channel and user in one drainage method.
(2) Existing channel drainage methods only resolve the matching relationship between user and channel. They provide no basis for subsequent drainage actions such as drainage timing and drainage measures, so the drainage effect is unsatisfactory.
Disclosure of Invention
In view of the above, a first object of the present invention is to provide an electric power payment channel drainage method based on big data technology. By constructing a drainage combination model with three sub-models (an electricity purchase prediction model, a customer segmentation model and a preferred channel recommendation model), the method systematically addresses when to drain, whom to drain, and to which channel. While drainage is carried out, potential customer groups are classified and matched with the channels they tend to prefer, achieving a win-win of improved enterprise operation and improved customer experience.
Based on the same inventive concept, the second purpose of the invention is to provide an electric power payment channel drainage system based on big data technology.
Based on the same inventive concept, a third object of the present invention is to provide a storage medium.
The first purpose of the invention can be achieved by the following technical scheme:
An electric power payment channel drainage method based on big data technology comprises the following steps:
acquiring a user electric power payment data set and preprocessing it;
constructing a time convolution network model, and predicting the user payment time according to the user electric power payment data;
constructing a Gaussian mixture clustering model according to the user electric power payment data and the prediction result of the time convolution network model, obtaining the initial value of the Gaussian mixture clustering model through iterative computation with the expectation maximization algorithm, and further training the Gaussian mixture clustering model to obtain the characteristics of the subdivided user groups;
and calculating, with a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel, based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, and outputting a user channel drainage recommendation result.
Further, a time convolution network model is built, and user payment time is predicted according to user electric power payment data, and the method comprises the following steps:
converting the electricity purchasing amount and electricity purchasing time of the preprocessed user electric power payment data set into a time sequence with the time step length of T and inputting the time sequence;
setting model training parameters, and training a time convolution network model by using time sequence input;
and outputting a prediction sequence by using the trained time convolution network model.
Further, converting the electricity purchase amount and electricity purchase time of the preprocessed user electric power payment data set into a time sequence input with time step T comprises the following steps:
converting the electricity purchase amount and electricity purchase time into a two-dimensional matrix;
and, for the time step T, setting the offset of a time window and dividing the two-dimensional matrix with the time window, each division producing one two-dimensional matrix; arranging these matrices in the direction the window moves and reshaping them into a three-dimensional matrix of overlapping sequence windows.
Further, obtaining the initial value of the Gaussian mixture clustering model through iterative computation with the expectation maximization algorithm comprises the following steps:
initializing the parameters of the expectation maximization algorithm;
calculating the probability that each data point j comes from sub-model k according to the current parameters of the expectation maximization algorithm;
calculating the parameters of the expectation maximization algorithm for the new iteration;
and repeating the iteration until the parameters of the expectation maximization algorithm converge, obtaining the initial value of the Gaussian mixture clustering model.
Further, initializing the parameters of the expectation maximization algorithm comprises the following steps:
sorting the values of each column of the data set to obtain the order statistics, then calculating the tertiles of the column and dividing each column of the data set into three equal parts, giving an initial three-way classification for each column;
calculating the parameter theta of each initial classification, and calculating the distance between the class centers under each initial classification; and selecting the classification with the largest distance as the initial classification, the parameter theta under that classification being the initial value of the algorithm.
Further, calculating, with a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel, based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, specifically comprises:
acquiring user information data, converting the user information into a label vector, and establishing a user-item scoring matrix according to the characteristics of the subdivided user groups;
calculating the similarity of two users from the user information data using nearest neighbor search, calculating the probability that the two users are candidate similar users through a locality sensitive hashing algorithm, finding the similar users of every user, and generating the neighbor user set corresponding to each user;
and predicting the user's rating of each payment channel with a user-based collaborative filtering recommendation algorithm.
Further, calculating the probability that two users are candidate similar users through a locality sensitive hashing algorithm comprises the following steps:
dividing the label vector of each user's information into several segments, each containing one or more rows of MinHash values; the probability that two users are candidate similar users is:
$P = 1 - (1 - s^{T})^{b}$
where P is the probability that two users are candidate similar users of each other, b is the number of segments of the user-information label vector, and T is the number of rows of MinHash values in each segment; s is short for sim(u, v), the similarity of the two users, calculated as:
Figure BDA0003829119100000031
where I is the set of payment channels; $R_{ui}$ is the rating of payment channel i by user u in the user-item scoring matrix; $R_{vi}$ is the rating of payment channel i by user v in the user-item scoring matrix; and the symbol shown in Figure BDA0003829119100000032 is the average rating of payment channel i in the user-item scoring matrix.
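The similarity formula above is available here only as an image reference (Figure BDA0003829119100000031). One form consistent with the symbols that are defined (ratings centered on a per-channel average) is the adjusted cosine similarity. The block below is a hedged reconstruction under that assumption, not a transcription of the original figure; the name $\bar{R}_i$ for the channel average is introduced here for illustration.

```latex
\operatorname{sim}(u, v) =
\frac{\sum_{i \in I} \left(R_{ui} - \bar{R}_i\right)\left(R_{vi} - \bar{R}_i\right)}
     {\sqrt{\sum_{i \in I} \left(R_{ui} - \bar{R}_i\right)^{2}}\;
      \sqrt{\sum_{i \in I} \left(R_{vi} - \bar{R}_i\right)^{2}}}
```

Under this form the value lies between -1 and 1, with 1 meaning the two users deviate from the channel averages in the same direction and proportion.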
Further, a user-based collaborative filtering recommendation algorithm is used to predict the user's rating of each payment channel, calculated as:
Figure BDA0003829119100000033
where the symbol shown in Figure BDA0003829119100000041 is the predicted rating of payment channel i by user u; the symbol shown in Figure BDA0003829119100000042 is the average of user u's ratings of payment channels; $r_{vi}$ is the rating of payment channel i by user v; the symbol shown in Figure BDA0003829119100000043 is the average of user v's ratings of payment channels; $S(u, W)$ is the neighbor user set corresponding to user u; $I_i(1)$ is the set of users who have paid through payment channel i; and $s_{uv}$ is the similarity sim(u, v) of user u and user v.
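The prediction formula is present only as an image reference, so the sketch below assumes the common mean-centered form of user-based collaborative filtering, $\hat{r}_{ui} = \bar{r}_u + \sum_{v} s_{uv}(r_{vi} - \bar{r}_v) / \sum_{v} |s_{uv}|$, with v ranging over neighbors of u who have used channel i. That form matches the symbols defined above but is an assumption; the function name, the dictionary layout and the sample data are all hypothetical.

```python
def predict_rating(ratings, sims, u, channel, neighbors):
    """Predict user u's rating of `channel` from similar users (assumed formula).

    ratings:   dict user -> dict channel -> rating (only rated channels present)
    sims:      dict (u, v) -> similarity s_uv
    neighbors: iterable of users in the neighbor set S(u, W)
    """
    def mean_rating(user):
        values = ratings[user].values()
        return sum(values) / len(values)

    # Keep only neighbors who actually paid through this channel: S(u, W) ∩ I_i(1).
    raters = [v for v in neighbors if channel in ratings[v]]
    norm = sum(abs(sims[(u, v)]) for v in raters)
    if norm == 0:
        return mean_rating(u)  # no usable neighbor: fall back to u's own average
    weighted = sum(sims[(u, v)] * (ratings[v][channel] - mean_rating(v)) for v in raters)
    return mean_rating(u) + weighted / norm


# Hypothetical ratings of three payment channels on a 1-5 scale.
ratings = {
    "u":  {"app": 4, "counter": 2},
    "v1": {"app": 5, "bank": 3, "counter": 1},
    "v2": {"app": 2, "bank": 4},
}
sims = {("u", "v1"): 0.9, ("u", "v2"): 0.3}
predicted = predict_rating(ratings, sims, "u", "bank", ["v1", "v2"])
```

Centering each neighbor's rating on that neighbor's own average keeps a habitually generous or harsh rater from skewing the prediction.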
The second purpose of the invention can be achieved by the following technical scheme:
An electric power payment channel drainage system based on big data technology comprises:
a data acquisition module, used for acquiring a user electric power payment data set and preprocessing it;
a time sequence model prediction module, used for constructing a time convolution network model and predicting the user payment time according to the user electric power payment data;
a customer group behavior segmentation module, used for constructing a Gaussian mixture clustering model according to the user electric power payment data and the prediction result of the time convolution network model, obtaining the initial value of the Gaussian mixture clustering model through iterative computation with the expectation maximization algorithm, and further training the Gaussian mixture clustering model to obtain the characteristics of the subdivided user groups;
and a channel drainage recommendation module, used for calculating, with a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel, based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, and outputting a user channel drainage recommendation result.
The third purpose of the invention can be achieved by the following technical scheme:
A storage medium stores a program which, when executed by a processor, implements the electric power payment channel drainage method based on big data technology described above.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention adopts a drainage combination model. Through a time convolution network model, a Gaussian mixture clustering model and a collaborative filtering algorithm, it systematically addresses at what time to drain (When), whom to drain (Who) and to which channel to drain (Where), and provides a basis for subsequent drainage actions such as drainage timing and drainage measures.
(2) The invention segments customer groups with a Gaussian mixture clustering model and recommends drainage channels with a collaborative filtering algorithm, thereby associating user group characteristics with channels. Drainage can therefore fully respect the established offline payment habits of groups such as the elderly, avoid one-size-fits-all drainage, and achieve a win-win of improved enterprise operation and improved customer experience.
(3) The method draws on rich basic data and on innovated and optimized algorithms. It uses big data methods including the time convolution network, Gaussian cluster analysis and collaborative filtering, so that the analyzed and mined data are valid, novel and usable. The Gaussian clustering and collaborative filtering algorithms are also optimized, which makes the model more rigorous.
(4) The method makes comprehensive use of big data analysis and guides customers' electric power payment behavior according to the constructed payment drainage combination model. It starts from predicting the customer's electricity purchase behavior to find the best moment for drainage, then segments customers to find potential target customers, and finally finds the channels that match different customers through a channel matching model. Payment users are thus matched with target groups, the customer's power payment service experience improves, and the operating cost of the power company is reduced to some extent.
Drawings
FIG. 1 is a flowchart of a method of example 1 of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below with reference to the drawings. The described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art without creative effort, based on the embodiments of the present invention, fall within the protection scope of the present invention.
Example 1:
This embodiment provides an electric power payment channel drainage method based on big data technology, which comprises the following steps:
S100, acquiring a user electric power payment data set and preprocessing it, comprising the following steps:
S110, determining the time window of the data cycle for the system model. The data sources comprise marketing data of a provincial power company, payment gateway data, call work-order data of the State Grid customer service center, and external Internet data. The marketing data of the provincial power company mainly comprises data tables such as user attributes, behavior data, the charging record table, the electricity-use address table, the logical outlet record table and the payment terminal information table.
S120, preprocessing the data obtained in step S110 to obtain a processed fusion data set, comprising the following steps:
S121, removing or interpolating records with missing values;
S122, handling abnormal values, removing some directly and repairing others with the average value;
S123, smoothing the noise data;
Random oversampling merely copies existing minority-class samples, so the positive class ends up with many identical records and the learned classification model tends to overfit. The ROSE algorithm is chosen here to address this problem.
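The difference between plain random oversampling and ROSE (Random Over-Sampling Examples) can be sketched as follows. ROSE is a smoothed bootstrap: instead of copying a minority sample, it draws a new point from a kernel centered on a randomly chosen minority sample. This minimal sketch assumes a Gaussian kernel with hand-picked per-feature bandwidths; the function name, the bandwidths and the sample rows are hypothetical, and the source does not say how its ROSE step is parameterized.

```python
import random


def rose_oversample(minority_rows, n_new, bandwidths, seed=0):
    """Draw n_new synthetic rows from Gaussian kernels centered on minority rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        center = rng.choice(minority_rows)  # pick an existing minority sample
        # Jitter each feature around the chosen sample (assumed fixed bandwidths).
        synthetic.append([rng.gauss(x, h) for x, h in zip(center, bandwidths)])
    return synthetic


# Hypothetical minority class: [purchase amount, days since last payment].
minority = [[100.0, 30.0], [120.0, 28.0], [90.0, 35.0]]
extra = rose_oversample(minority, n_new=5, bandwidths=[5.0, 1.0])
```

Because every synthetic row is a fresh draw rather than a copy, the classifier no longer sees many identical positive records.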
S200, constructing a Time Convolution Network (TCN) model: the fusion data set obtained in step S120 is converted into the model's input format and fed to the model for training, and the trained time convolution network model is used to predict user payment time, giving the payment details of the user at each time node. This comprises the following steps:
S210, converting the electricity purchase amount and electricity purchase time of the preprocessed user electric power payment data set into a time sequence input with time step T, $X = [x_{t-T}, x_{t-T+1}, \ldots, x_{t-2}, x_{t-1}]$, and predicting the sequence advanced by one step, $[x_{t-T+1}, x_{t-T+2}, \ldots, x_{t-1}, x_t]$, that is, outputting the sequence $Y = [y_{t-T+1}, y_{t-T+2}, \ldots, y_{t-1}, y_t]$. This comprises the following steps:
converting the electricity purchase amount and electricity purchase time into an M×N two-dimensional matrix; for the time step T, setting the offset of the time window to 1 and dividing the matrix with the time window, each division producing one two-dimensional matrix; arranging these matrices in the direction the window moves and reshaping them into a three-dimensional matrix of overlapping sequence windows, so that the original data carries its time-series structure and the temporal correlation is strengthened;
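The windowing in S210 can be sketched as follows: an M×N matrix (M time steps, N features) is cut into overlapping windows of length T with offset 1, giving a three-dimensional array of shape (M - T + 1) × T × N. The function name and the sample values are hypothetical, and plain Python lists stand in for whatever array library an implementation would use.

```python
def overlapping_windows(matrix, T, offset=1):
    """Slice an M x N matrix into overlapping T x N windows along the time axis."""
    M = len(matrix)
    return [matrix[start:start + T] for start in range(0, M - T + 1, offset)]


# Hypothetical data: each row is [purchase amount, day index of the purchase].
purchases = [
    [100.0, 1],
    [150.0, 32],
    [120.0, 61],
    [130.0, 93],
    [110.0, 122],
]
windows = overlapping_windows(purchases, T=3)
shape = (len(windows), len(windows[0]), len(windows[0][0]))  # (windows, T, N)
```

With offset 1, consecutive windows share T - 1 rows, which is what makes the result a sequence of overlapping windows.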
S220, setting model training parameters, and training the time convolution network model with the time sequence input;
In this embodiment, the model has four parts: an input layer, a fully connected layer, a convolution layer and an output layer. The model parameters are set as follows: the number of network layers is 16, the number of training iterations is 15000, the convolution kernel size in the dilated causal convolution is 3, the number of hidden-layer neurons is 60, the learning rate is set to 5, and the model optimizer is stochastic gradient descent;
S230, outputting a prediction sequence with the trained time convolution network model;
In this embodiment, the three-dimensional matrix generated in step S210 is input into the convolution layer. As the network gets deeper, the dilation factor grows exponentially, and after linear superposition each module's output becomes the input of the next module, so the convolution operations build up layer by layer and output a two-dimensional matrix. Finally, the fully connected layer performs linear dimensionality reduction, and the output layer outputs a prediction sequence of dimension N;
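The dilated causal convolution at the core of a TCN can be illustrated with a single-channel sketch. Each output depends only on the current and earlier inputs, and stacking layers whose dilation doubles (1, 2, 4, ...) widens the receptive field quickly. Only the kernel size of 3 comes from the text; the four levels, the all-ones weights and the function names are illustrative assumptions, and this is not the trained model of the embodiment.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], treating x as zero before t = 0."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # causal: never look at future positions
                acc += w * x[idx]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilations):
    """Number of input steps that can influence one output after stacking the layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


dilations = [2 ** level for level in range(4)]  # 1, 2, 4, 8 (assumed depth)
signal = [float(t) for t in range(16)]
out = signal
for d in dilations:
    out = causal_dilated_conv1d(out, [1.0, 1.0, 1.0], d)  # placeholder weights

field = receptive_field(3, dilations)
```

Four such layers with kernel size 3 already see 31 past steps, which is why exponential dilation suits long payment histories.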
S300, based on the payment time and amount prediction set output in step S230 and the user data characteristics, constructing a Gaussian Mixture clustering Model (GMM) according to the user electric power payment data and the prediction result of the time convolution network model, obtaining the initial value of the Gaussian mixture clustering model through iterative computation with the expectation maximization algorithm, and further training the Gaussian mixture clustering model to obtain the characteristics of the subdivided user groups, comprising the following steps:
S310, initializing the parameters of the expectation maximization algorithm (Expectation-Maximization algorithm, EM algorithm). The model is very sensitive to initial values when it starts estimating parameters, and different initial values can lead to completely different estimates, so a reasonable initialization method is needed for the EM algorithm.
In this embodiment, a tertile method is used for initialization: the two tertiles divide a number sequence into three equal parts, a low-value part, a medium-value part and a high-value part. When the EM algorithm is used for GMM clustering, this initialization yields an initial classification that carries more prior knowledge, so that when the data follow a Gaussian mixture distribution the multidimensional Gaussian components can be separated as far as possible. The initialization method of this embodiment comprises the following steps:
S311, sorting the values of each column of the data set to obtain the order statistics, then calculating the tertiles of the column and dividing each column of the data set into three equal parts, giving an initial three-way classification for each column;
S312, calculating the parameter theta of each initial classification, and calculating the distance between the class centers under each initial classification; selecting the classification with the largest distance as the initial classification, the parameter theta under that classification being the initial value of the algorithm;
In this embodiment, after steps S311 to S312, the number of Gaussian mixture model clusters is set between 5 and 8 based on the business definition of payment-customer data. The clustering effect of the algorithm is best with 6 clusters, where fewer customers are scattered, and the clustering effect is evaluated with the Rand index (RI);
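Steps S311 to S312 can be read as follows: split the rows into three equal groups by the tertiles of each column in turn, compute the center of each group, and keep the split whose centers lie farthest apart. The sketch below follows that reading. Summing the pairwise Euclidean distances between the three centers is an assumption (the text only says "the largest distance"), and the function names and sample data are hypothetical.

```python
from itertools import combinations
from math import dist


def tertile_partition(data, col):
    """Split row indices into low/medium/high thirds by the values in column `col`."""
    order = sorted(range(len(data)), key=lambda r: data[r][col])
    third = len(order) // 3
    return [order[:third], order[third:2 * third], order[2 * third:]]


def center(data, rows):
    """Mean vector of the given rows."""
    return [sum(data[r][c] for r in rows) / len(rows) for c in range(len(data[0]))]


def best_initial_partition(data):
    """Return (spread, column, groups, centers) for the column with the widest split."""
    best = None
    for col in range(len(data[0])):
        groups = tertile_partition(data, col)
        centers = [center(data, g) for g in groups]
        spread = sum(dist(a, b) for a, b in combinations(centers, 2))
        if best is None or spread > best[0]:
            best = (spread, col, groups, centers)
    return best


# Hypothetical rows: [monthly purchase amount, average days between payments].
data = [[1, 5.0], [2, 5.1], [50, 5.2], [52, 4.9], [100, 5.0], [101, 5.1]]
spread, col, groups, centers = best_initial_partition(data)
```

The group centers, together with group proportions and variances computed the same way, would then serve as the starting parameter theta for the EM iteration.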
S320, E-step: calculating the probability that each data point j comes from sub-model k according to the current parameters of the expectation maximization algorithm, with the calculation formula:
Figure BDA0003829119100000071
where $x_j$ is the j-th observation, k indexes the Gaussian sub-models in the mixture model, $\alpha_k$ is the probability that an observation belongs to the k-th sub-model, and $\phi(x_j \mid \theta_k)$ is the Gaussian density function of the k-th sub-model;
S330, M-step: calculating the parameters of the expectation maximization algorithm for the new iteration, with the calculation formulas:
Figure BDA0003829119100000072
Figure BDA0003829119100000073
Figure BDA0003829119100000074
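The three M-step formulas are present only as image references. Assuming they are the standard Gaussian mixture updates for the mean, the covariance and the mixing weight (an assumption, although the count of three formulas matches), they would read as follows, with $\gamma_{jk}$ the E-step responsibility and $N$ the number of observations, both names introduced here for illustration.

```latex
\mu_k = \frac{\sum_{j=1}^{N} \gamma_{jk} \, x_j}{\sum_{j=1}^{N} \gamma_{jk}},
\qquad
\Sigma_k = \frac{\sum_{j=1}^{N} \gamma_{jk} \, (x_j - \mu_k)(x_j - \mu_k)^{\top}}{\sum_{j=1}^{N} \gamma_{jk}},
\qquad
\alpha_k = \frac{\sum_{j=1}^{N} \gamma_{jk}}{N}
```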
S340, repeating steps S320 to S330 and iterating until the parameters of the expectation maximization algorithm converge, obtaining the initial value of the Gaussian mixture clustering model.
S350, further training the Gaussian mixture clustering model to obtain the characteristics of the subdivided user groups.
In this embodiment, the characteristics of the subdivided user groups are the users' preference information for the channels.
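The E-step and M-step loop of S320 to S340 can be sketched for one-dimensional data as below. It uses the standard Gaussian mixture updates, which the source shows only as formula images, so treat the update rules as an assumption. The function names, the variance floor, the convergence test on the log-likelihood and the sample data are all hypothetical choices for illustration.

```python
from math import exp, log, pi, sqrt


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return exp(-(x - mean) ** 2 / (2 * var)) / sqrt(2 * pi * var)


def em_gmm_1d(data, weights, means, variances, tol=1e-6, max_iter=200):
    """Run EM for a 1-D Gaussian mixture from the given initial parameters."""
    weights, means, variances = list(weights), list(means), list(variances)
    n, K = len(data), len(weights)
    prev_ll = None
    for _ in range(max_iter):
        # E-step: responsibility of each sub-model k for each observation.
        resp, ll = [], 0.0
        for x in data:
            joint = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(K)]
            total = sum(joint)
            ll += log(total)
            resp.append([p / total for p in joint])
        # M-step: re-estimate mean, variance and mixing weight of each sub-model.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # assumed floor to avoid a collapsed component
            weights[k] = nk / n
        if prev_ll is not None and abs(ll - prev_ll) < tol:
            break  # log-likelihood has stopped improving
        prev_ll = ll
    return weights, means, variances


# Hypothetical feature: average days between payments for six customers.
days = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
w, m, v = em_gmm_1d(days, weights=[0.5, 0.5], means=[0.0, 6.0], variances=[1.0, 1.0])
```

The initial `weights`, `means` and `variances` are where an initialization such as the tertile split of S311 to S312 would plug in.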
S400, calculating, with a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel, based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, and outputting a user channel drainage recommendation result, comprising the following steps:
S410, acquiring user information data, converting the user information into a label vector, and establishing a user-item scoring matrix according to the characteristics of the subdivided user groups;
In this embodiment, the user information data comprises the user electric power payment data set obtained in step S100 and the users' preference information for the channels obtained in step S350. The selection preference of each user is represented by a label vector composed of the user, the item and the user's rating of the item. The information of all users then forms a matrix, called the user-item scoring matrix, expressed as:
Figure BDA0003829119100000081
where m is the number of users in the system, n is the number of payment channels, and the matrix element $R_{ui}$ is the rating of payment channel i by user u. $R_{ui}$ generally takes values in a fixed range, usually the integers 1 to 5, and a channel the user has not rated is filled with 0. The larger $R_{ui}$ is, the higher user u's evaluation of payment channel i; the smaller it is, the lower the evaluation.
S420, using Nearest Neighbor Search (NNS), calculating the similarity of two users from the user information data, calculating the probability that the two users are candidate similar users through a locality sensitive hashing algorithm, finding the similar users of every user, and generating the neighbor user set corresponding to each user;
The similarity sim(u, v) between user v and the target user u ranges from -1 to 1. The closer the value is to 1, the more similar users u and v are; otherwise the similarity is lower. There are several ways to calculate the similarity between two users (or items): the Euclidean distance, the cosine similarity or the Jaccard similarity. Whichever is used, it can be computed directly by traversal when the data dimension is small. Once the data dimension grows beyond a certain point, however, the computational cost rises sharply.
Therefore, this embodiment optimizes the similarity calculation with a locality sensitive hashing algorithm (LSH). Starting from the label vectors already obtained, each vector is divided into several segments, each called a band. The core idea is that if one or more bands of two vectors are identical, the two vectors can be considered very close, and the more bands they share, the closer they are. LSH therefore hashes each band of every user's label vector into buckets; users that fall into the same bucket on any band are candidate similar users, so the similar user group of each user can be found by computing similarity only among its candidates. This bucketing can still err in two ways: two users with very low similarity may be hashed into the same bucket, and truly similar users may fail to share a bucket on every band. In practice the same hash function can be used for every band, but the hash bucket ids must differ between bands so that the candidate bucketing is carried out per band.
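The banding scheme described above can be sketched as follows. Each user's MinHash signature is cut into bands, each band is used as a bucket key, and any two users that share a bucket on at least one band become a candidate pair. Using the band contents directly as the dictionary key, rather than a separate hash function, is a simplification. The function name and the sample signatures are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(signatures, bands, rows_per_band):
    """Return the set of user pairs that share a bucket on at least one band.

    signatures: dict user -> list of MinHash values, length bands * rows_per_band
    """
    buckets = defaultdict(set)
    for user, sig in signatures.items():
        for band in range(bands):
            chunk = tuple(sig[band * rows_per_band:(band + 1) * rows_per_band])
            # The band index is part of the key, so bucket ids differ per band.
            buckets[(band, chunk)].add(user)
    pairs = set()
    for users in buckets.values():
        for a, b in combinations(sorted(users), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical 4-value signatures split into 2 bands of 2 rows each.
signatures = {
    "u1": [3, 7, 1, 9],
    "u2": [3, 7, 4, 2],
    "u3": [5, 6, 8, 0],
}
pairs = candidate_pairs(signatures, bands=2, rows_per_band=2)
```

Only the candidate pairs then need a full similarity computation, which is where the saving over an all-pairs traversal comes from.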
In this embodiment, calculating the probability that two users are alternative similar users by using a locality sensitive hashing algorithm specifically includes:
dividing the label vector of each user's information into a plurality of segments (bands), each segment having one or more rows of MinHash values, wherein the probability that two users are alternative similar users is derived as follows:
the probability that all rows in any one band of the two label vectors have the same value is $s^T$;
the probability that at least one row in any one band of the two label vectors has a different value is $1 - s^T$;
the probability that all bands of the two label vectors are different is $(1 - s^T)^b$;
the probability that at least one band of the two label vectors is the same is $P = 1 - (1 - s^T)^b$, i.e., the probability that the two users are alternative similar users.
where P is the probability that two users are alternative similar users of each other, b is the number of segments of the label vector of the user information, and T is the number of rows of MinHash values in each segment; s is short for sim(u, v), the similarity between the two users, which in this embodiment is calculated with cosine similarity according to the following formula:
Figure BDA0003829119100000091
where I is the set of payment channels; $R_{ui}$ is the score of user u for payment channel i in the user-item scoring matrix; $R_{vi}$ is the score of user v for payment channel i in the user-item scoring matrix; and
Figure BDA0003829119100000092
is the average score of payment channel i in the user-item scoring matrix.
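The candidate probability and the similarity described above can be checked numerically. The sketch below is illustrative only. The similarity formula itself survives only as an image reference in this text, so `centered_cosine` assumes a cosine similarity computed on scores centred by the per-channel average, which is what the symbol definitions suggest; the function names, argument names and sample values are hypothetical.

```python
import math


def candidate_probability(s, rows, bands):
    """P = 1 - (1 - s**T)**b, with T = rows per band and b = number of bands."""
    return 1.0 - (1.0 - s ** rows) ** bands


def centered_cosine(ratings_u, ratings_v, channel_means):
    """Cosine similarity of two users after subtracting per-channel means.

    Assumption: only channels scored by both users are compared, and the
    centring term is the average score of each channel.
    """
    common = [i for i in ratings_u if i in ratings_v]
    du = [ratings_u[i] - channel_means[i] for i in common]
    dv = [ratings_v[i] - channel_means[i] for i in common]
    norm = math.sqrt(sum(x * x for x in du)) * math.sqrt(sum(y * y for y in dv))
    if norm == 0.0:
        return 0.0  # no usable signal, treat the users as unrelated
    return sum(x * y for x, y in zip(du, dv)) / norm


# With T = 5 rows and b = 20 bands, similar users are almost always
# candidates and dissimilar users almost never are.
p_similar = candidate_probability(0.8, rows=5, bands=20)
p_dissimilar = candidate_probability(0.2, rows=5, bands=20)

means = {"app": 3, "counter": 3}
sim_same = centered_cosine({"app": 5, "counter": 1}, {"app": 4, "counter": 2}, means)
sim_opposite = centered_cosine({"app": 5, "counter": 1}, {"app": 1, "counter": 5}, means)
```

The values of b and T control the trade-off between the two bucketing errors: more bands raise the chance that similar users become candidates, while more rows per band lower the chance that dissimilar users collide.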
S430, predicting the user's score for each payment channel by using a user-based collaborative filtering recommendation algorithm, and outputting a payment channel drainage recommendation result.
The idea behind the user-based collaborative filtering algorithm is that a customer is likely to prefer the payment channels preferred by other customers with similar preferences. Therefore, when predicting the score of customer u for payment channel i, the scores given to payment channel i by other customers similar to customer u are taken into account, and the calculation formula is as follows:
Figure BDA0003829119100000093
where
Figure BDA0003829119100000094
is the predicted score of user u for payment channel i;
Figure BDA0003829119100000095
is the average of user u's scores for payment channels; $r_{vi}$ is the score of user v for payment channel i;
Figure BDA0003829119100000096
is the average of user v's scores for payment channels; $S(u, W)$ is the neighbor user set corresponding to user u; $I_i(1)$ is the set of users who have paid through payment channel i; and $s_{uv}$ is the similarity sim(u, v) of user u and user v.
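A minimal sketch of such a prediction is given below. The prediction formula is only available as an image reference in this text, so the code assumes the common mean-centred form: the user's own average plus the similarity-weighted deviations of the neighbors' scores, normalised by the sum of the absolute similarities. That normalisation, the function name `predict_score` and the sample data are assumptions for illustration.

```python
def predict_score(u, channel, ratings, neighbors, similarity):
    """Predict user u's score for a channel from the scores of similar users.

    ratings: dict user -> dict channel -> score.
    neighbors: dict user -> iterable of neighbor ids, standing in for S(u, W).
    similarity: dict (u, v) -> s_uv.
    """
    def mean(user):
        scores = ratings[user].values()
        return sum(scores) / len(scores)

    base = mean(u)
    # Keep only neighbors that have scored (paid through) the channel.
    raters = [v for v in neighbors[u] if channel in ratings[v]]
    weight = sum(abs(similarity[(u, v)]) for v in raters)
    if weight == 0.0:
        return base  # no informative neighbor, fall back to u's average
    offset = sum(similarity[(u, v)] * (ratings[v][channel] - mean(v)) for v in raters)
    return base + offset / weight


# Hypothetical scores of three users for three payment channels.
ratings = {
    "u": {"app": 4, "counter": 2},
    "v": {"app": 5, "bank": 3},
    "w": {"app": 2, "bank": 4},
}
neighbors = {"u": ["v", "w"]}
similarity = {("u", "v"): 0.75, ("u", "w"): 0.25}
predicted = predict_score("u", "bank", ratings, neighbors, similarity)
```

In this example user v, the closer neighbor, scores the bank channel one point below their own average, so the prediction for user u (2.5) ends up below u's average of 3.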
In conclusion, the invention adopts a combined drainage model: through a time convolution network model, a Gaussian mixture clustering model and a collaborative filtering algorithm, it systematically addresses when to drain (When), whom to drain (Who) and to which channel to drain (Where), and provides a basis for subsequent drainage actions such as drainage timing and drainage measures. By constructing the Gaussian mixture clustering model to subdivide customer groups and recommending drainage channels with the collaborative filtering algorithm, the method associates user group characteristics with channels, so that the established offline payment habits of groups such as the elderly are respected during drainage, one-size-fits-all drainage is avoided, and both enterprise operation and customer experience are improved. The method draws on rich basic data and applies big data methods such as the time convolution network, Gaussian mixture clustering and collaborative filtering, so that the mined results are valid, novel and usable; the Gaussian clustering and collaborative filtering algorithms are also optimized, which improves the soundness of the model. Using these big data analysis methods together, the method guides customers' electric power payment behavior according to the constructed payment drainage combination model.
Starting from the prediction of customers' electricity purchasing behavior, the method finds the best drainage opportunity, subdivides customers to identify potential target drainage customers, and finally finds the channels that match different customers through a channel matching model, thereby matching payment users with target groups, improving customers' electric power payment service experience, and reducing the operating cost of the electric power company to a certain extent.
Example 2:
based on the same inventive concept as embodiment 1, this embodiment provides an electric power payment channel drainage system based on big data technology, which includes:
the data acquisition module is used for acquiring a user electric power payment data set and preprocessing it;
the time sequence model prediction module is used for constructing a time convolution network model, predicting the payment time of users according to the user electric power payment data, and obtaining the details of the users expected to pay at each time node;
the customer group behavior subdivision module is used for constructing a Gaussian mixture clustering model according to the user electric power payment data and the prediction result of the time convolution network model, obtaining an initial value of the Gaussian mixture clustering model through iterative computation of an expectation maximization algorithm, further training the Gaussian mixture clustering model, and obtaining the characteristics of the subdivided user groups;
and the channel drainage recommendation module is used for calculating, by using a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, and outputting a user channel drainage recommendation result.
That is to say, among the above modules of this embodiment, the data acquisition module implements step S100 of embodiment 1, the time sequence model prediction module implements step S200 of embodiment 1, the customer group behavior subdivision module implements step S300 of embodiment 1, and the channel drainage recommendation module implements step S400 of embodiment 1; since steps S100 to S400 have been described in detail in embodiment 1, for brevity the detailed implementation of each module in this embodiment can be found in embodiment 1 and is not repeated here.
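The way the four modules hand data to one another can be pictured with the following sketch. It only illustrates the order of steps S100 to S400 and which results feed which module; the class name `DrainagePipeline`, its field names and the callables plugged into it are hypothetical placeholders, not the modules of the embodiment.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DrainagePipeline:
    """Chains four stand-in callables in the order S100 -> S400."""

    acquire: Callable[[], Any]            # S100: load and preprocess payment data
    predict_time: Callable[[Any], Any]    # S200: time series prediction
    segment: Callable[[Any, Any], Any]    # S300: customer group subdivision
    recommend: Callable[[Any, Any], Any]  # S400: channel recommendation

    def run(self) -> Any:
        data = self.acquire()
        predicted = self.predict_time(data)
        # Segmentation uses both the raw data and the predicted payment times.
        groups = self.segment(data, predicted)
        # Recommendation uses the predicted times and the group characteristics.
        return self.recommend(predicted, groups)
```

A caller would supply real implementations for the four fields; with trivial stand-ins, `DrainagePipeline(...).run()` simply returns whatever the recommendation step produces.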
Example 3:
the embodiment provides a storage medium storing a program, and when the program is executed by a processor, the method for draining an electric power payment channel based on a big data technology in embodiment 1 of the present invention is implemented, specifically including:
acquiring a user electric power payment data set and preprocessing the user electric power payment data set;
constructing a time convolution network model, predicting user payment time according to the user electric power payment data, and obtaining the details of the users expected to pay at each time node;
according to the user electric power payment data and the prediction result of the time convolution network model, a Gaussian mixture clustering model is built, the initial value of the Gaussian mixture clustering model is obtained through the expectation maximization algorithm iterative computation, the Gaussian mixture clustering model is further trained, and the characteristics of the subdivided user group are obtained;
and calculating, by using a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, and outputting a user channel drainage recommendation result.
It should be noted that the computer readable medium of the present embodiment may be a computer readable signal medium, a computer readable storage medium, or any combination of the two. A computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In the present embodiment, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with a computer readable program embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. The computer program embodied on the computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The computer program for performing the present embodiment may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Python and C++, and conventional procedural programming languages such as C or similar programming languages. The program may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be understood that the above-described embodiments are only a few embodiments, rather than all, and that the present invention is not limited to the details of the above-described embodiments, as those skilled in the art can make appropriate changes and modifications without departing from the scope of the invention.

Claims (10)

1. An electric power payment channel drainage method based on big data technology, characterized by comprising the following steps:
acquiring a user electric power payment data set and preprocessing the user electric power payment data set;
constructing a time convolution network model, and predicting the user payment time according to the user electric power payment data;
according to the user electric power payment data and the prediction result of the time convolution network model, a Gaussian mixture clustering model is built, the initial value of the Gaussian mixture clustering model is obtained through the expectation maximization algorithm iterative computation, the Gaussian mixture clustering model is further trained, and the characteristics of the subdivided user group are obtained;
and calculating, by using a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, and outputting a user channel drainage recommendation result.
2. The electric power payment channel drainage method based on big data technology as claimed in claim 1, wherein a time convolution network model is constructed, and the user payment time is predicted according to the user electric power payment data, comprising the following steps:
converting the electricity purchasing amount and electricity purchasing time of the preprocessed user electric power payment data set into a time sequence with the time step length of T and inputting the time sequence;
setting model training parameters, and training a time convolution network model by using time sequence input;
and outputting a prediction sequence by using the trained time convolution network model.
3. The electric power payment channel drainage method based on big data technology as claimed in claim 2, wherein the step of converting the electricity purchasing amount and electricity purchasing time of the preprocessed electric power payment data set of the user into the time sequence input with the time step of T comprises the following steps:
converting the electricity purchasing amount and electricity purchasing time into a second-order matrix; setting an offset unit of a time window for the time step length T, and dividing the second-order matrix with the time window, each division generating a two-dimensional matrix; and arranging the two-dimensional matrices according to the moving direction of the time window and reshaping them into a three-dimensional matrix of overlapping sequence windows.
4. The electric power payment channel drainage method based on big data technology of claim 1, wherein the initial value of the Gaussian mixture clustering model is obtained by iterative computation of an expectation maximization algorithm, comprising the following steps:
initializing the parameters of the expectation maximization algorithm;
calculating the probability that each data j comes from the sub-model k according to the current parameters of the expectation maximization algorithm;
calculating the parameters of the expectation maximization algorithm for a new iteration;
and repeating the iteration until the parameters of the expectation maximization algorithm converge, to obtain an initial value of the Gaussian mixture clustering model.
5. The electric power payment channel drainage method based on big data technology as claimed in claim 4, wherein initializing the parameters of the expectation maximization algorithm comprises the following steps:
sorting the values of each column of the data set to obtain its order statistics, then calculating the tertiles of the column and dividing each column of the data set into three equal parts, to obtain an initial three-way classification for each column;
calculating the parameter theta of each initial classification, and calculating the distance between the class centers in each initial classification; and selecting the classification with the largest distance as the initial classification, wherein the parameter theta under this classification is the initial value of the algorithm.
6. The electric power payment channel drainage method based on big data technology as claimed in claim 1, wherein calculating, by using the collaborative filtering algorithm, the probability that a potential user belongs to each payment channel based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model specifically comprises:
acquiring user information data, converting the user information into a label vector, and establishing a user-item scoring matrix according to the characteristics of the subdivided user groups;
calculating the similarity of two users according to the user information data by using nearest neighbor search, calculating the probability that the two users are alternative similar users through a locality-sensitive hashing algorithm, finding the similar users of all users, and generating a neighbor user set corresponding to each user;
and predicting the user's score for each payment channel by using a user-based collaborative filtering recommendation algorithm.
7. The electric power payment channel drainage method based on big data technology as claimed in claim 6, wherein calculating the probability that two users are alternative similar users through a locality-sensitive hashing algorithm specifically comprises:
dividing the label vector of each user's information into a plurality of segments, each segment having one or more rows of MinHash values, wherein the probability that two users are alternative similar users is calculated as:
$P = 1 - (1 - s^T)^b$
wherein P is the probability that two users are alternative similar users of each other, b is the number of segments of the label vector of the user information, and T is the number of rows of MinHash values in each segment; s is short for sim(u, v), the similarity of the two users, and is calculated as:
Figure FDA0003829119090000021
wherein I is the set of payment channels; $R_{ui}$ is the score of user u for payment channel i in the user-item scoring matrix; $R_{vi}$ is the score of user v for payment channel i in the user-item scoring matrix; and
Figure FDA0003829119090000022
is the average score of payment channel i in the user-item scoring matrix.
8. The electric power payment channel drainage method based on big data technology as claimed in claim 7, wherein the user-based collaborative filtering recommendation algorithm is used to predict the user's score of the payment channel, and the calculation formula is:
Figure FDA0003829119090000031
wherein
Figure FDA0003829119090000032
is the predicted score of user u for payment channel i;
Figure FDA0003829119090000033
is the average of user u's scores for payment channels; $r_{vi}$ is the score of user v for payment channel i;
Figure FDA0003829119090000034
is the average of user v's scores for payment channels; $S(u, W)$ is the neighbor user set corresponding to user u; $I_i(1)$ is the set of users who have paid through payment channel i; and $s_{uv}$ is the similarity sim(u, v) of user u and user v.
9. An electric power payment channel drainage system based on big data technology, characterized by comprising:
the data acquisition module is used for acquiring a user electric power payment data set and preprocessing it;
the time sequence model prediction module is used for constructing a time convolution network model and predicting the user payment time according to the user electric power payment data;
the customer group behavior subdivision module is used for constructing a Gaussian mixture clustering model according to the user electric power payment data and the prediction result of the time convolution network model, obtaining an initial value of the Gaussian mixture clustering model through iterative calculation of an expectation maximization algorithm, further training the Gaussian mixture clustering model, and obtaining the characteristics of the subdivided user groups;
and the channel drainage recommendation module is used for calculating, by using a collaborative filtering algorithm, the probability that a potential user belongs to each payment channel based on the user payment time predicted by the time convolution network and the characteristics of the subdivided user groups obtained by the Gaussian mixture clustering model, and outputting a user channel drainage recommendation result.
10. A storage medium storing a program, wherein the program, when executed by a processor, implements the big data technology-based electric power payment channel drainage method according to any one of claims 1 to 8.
CN202211068628.6A 2022-09-02 2022-09-02 Electric power payment channel drainage method, system and medium based on big data technology Pending CN115496338A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211068628.6A CN115496338A (en) 2022-09-02 2022-09-02 Electric power payment channel drainage method, system and medium based on big data technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211068628.6A CN115496338A (en) 2022-09-02 2022-09-02 Electric power payment channel drainage method, system and medium based on big data technology

Publications (1)

Publication Number Publication Date
CN115496338A true CN115496338A (en) 2022-12-20

Family

ID=84467711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211068628.6A Pending CN115496338A (en) 2022-09-02 2022-09-02 Electric power payment channel drainage method, system and medium based on big data technology

Country Status (1)

Country Link
CN (1) CN115496338A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117056591A (en) * 2023-07-24 2023-11-14 深圳义云科技有限公司 Intelligent electric power payment channel recommendation method and system based on dynamic prediction


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination