CN111913940A - Temperature member label prediction method and device, electronic equipment and storage medium - Google Patents
Temperature member label prediction method and device, electronic equipment and storage medium
- Publication number
- CN111913940A (CN202010569862.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- historical consumption
- model
- label
- historical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2462—Approximate or statistical queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0207—Discounts or incentives, e.g. coupons or rebates
- G06Q30/0224—Discounts or incentives, e.g. coupons or rebates based on user history
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Strategic Management (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- General Engineering & Computer Science (AREA)
- Development Economics (AREA)
- Databases & Information Systems (AREA)
- Finance (AREA)
- Game Theory and Decision Science (AREA)
- Human Resources & Organizations (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Quality & Reliability (AREA)
- Probability & Statistics with Applications (AREA)
- General Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Operations Research (AREA)
- Tourism & Hospitality (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Computational Linguistics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a temperature member label prediction method and device, electronic equipment and a storage medium. The method comprises the following steps: acquiring member tag data and member historical consumption data from a big data platform data source; extracting historical consumption features of the members and the corresponding prediction label data from the member tag data and the member historical consumption data; removing abnormal feature data based on the 3-sigma rule, and selecting features by an embedded method; constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters; after tuning the parameters by GridSearch (grid search), substituting the optimal parameters into the LightGBM model and predicting member labels based on the LightGBM model. This scheme addresses the inaccurate definition of existing temperature member labels, allows member labels to be defined accurately, and allows the consumption behavior and value of temperature members to be monitored and predicted in real time.
Description
Technical Field
The invention relates to the field of medical big data, in particular to a temperature member label prediction method and device, electronic equipment and a storage medium.
Background
Pharmaceutical retail enterprises generally have a large and highly mobile membership: thousands of members buy the medicines they need every day, and their real willingness and capacity to consume change constantly. A manager of such an enterprise therefore needs a management metric that tracks, in real time, the changing state of the large number of members in a store, namely the member temperature, so that the state of the store's members can be known in real time and accurate operating decisions can be made.
The temperature member label makes it convenient to classify customers and determine customer value. At present, the member temperature label is usually defined directly from the perspective of commercial profit, for example as a direct reflection of the customer unit price or gross profit. A temperature label defined in this way reflects only the historical or current profit value of a member and is not an accurate indicator of the member's future value.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a temperature membership label prediction method, an apparatus, an electronic device and a storage medium, so as to solve the problem of inaccurate definition of a temperature membership label in the prior art.
In a first aspect of an embodiment of the present invention, a method for predicting a temperature member tag is provided, including:
acquiring member tag data and member historical consumption data from a big data platform data source;
extracting historical consumption characteristics of the members and corresponding predicted label data according to the member label data and the historical consumption data of the members;
removing abnormal feature data based on a 3 sigma rule, and selecting features by an embedded method;
constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
after tuning the parameters by GridSearch (grid search), substituting the optimal parameters into the LightGBM model, and predicting member labels based on the LightGBM model.
In a second aspect of an embodiment of the present invention, there is provided an apparatus for temperature membership tag prediction, including:
the acquisition module is used for acquiring member tag data and member historical consumption data from a big data platform data source;
the extraction module is used for extracting the historical consumption characteristics of the members and the corresponding predicted tag data according to the member tag data and the historical consumption data of the members;
the clearing module is used for clearing abnormal feature data based on a 3 sigma rule and selecting features through an embedded method;
the training module is used for constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
and the parameter adjusting module is used for substituting the optimal parameters into the LightGBM model after tuning the parameters by GridSearch (grid search), so as to predict member labels based on the LightGBM model.
In a third aspect of the embodiments of the present invention, there is provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable by the processor, where the processor executes the computer program to implement the steps of the method according to the first aspect of the embodiments of the present invention.
In a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor implements the steps of the method provided by the first aspect of the embodiments of the present invention.
In the embodiments of the invention, behavior features and historical prediction values are extracted from the members' historical consumption data. After outliers are removed from the historical features and features are selected, a LightGBM model is trained, its parameters are tuned by GridSearch (grid search), and member labels are predicted with the LightGBM model that uses the optimal parameters. This allows temperature members to be characterized accurately, makes it convenient to predict member behavior in real time, allows member value to be maximized by adjusting the marketing strategy, and addresses the low accuracy of temperature member labels. Compared with the traditional approach of reflecting user activity through indicators such as a pharmaceutical enterprise's average gross profit, the characterization is finer-grained and more accurate, the defined label reflects customer value directly and comprehensively, and the label can be adjusted dynamically according to the user's real-time behavior, so that member behavior is monitored dynamically and the practical value is higher.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a flow chart illustrating a method for predicting a temperature membership label according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a temperature membership tag prediction device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by persons skilled in the art without any inventive work shall fall within the protection scope of the present invention, and the principle and features of the present invention shall be described below with reference to the accompanying drawings.
The terms "comprises" and "comprising," when used in this specification and claims, and in the accompanying drawings and figures, are intended to cover non-exclusive inclusions, such that a process, method or system, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements.
At present, the member temperature is defined directly from the perspective of commercial profit; for example, the customer unit price or gross profit amount is used to reflect the sales volume and profit of a pharmaceutical enterprise directly. Defined in this way, the member temperature cannot reflect the commercial value brought by each member. It is mostly based on previously published financial statements, so it cannot reflect future or real-time changes in a member's behavior and offers limited help for a manager's future operating decisions and direction. Nor does such a member label reflect the member's consumption behavior. The definition of a temperature member should therefore integrate the member's historical consumption behavior and reflect the member's value in real time.
Referring to fig. 1, a flow chart of a temperature membership label prediction method according to an embodiment of the present invention includes:
s101, acquiring member tag data and member historical consumption data from a big data platform data source;
the big data platform data source may be medicine sales data and member data stored in a server of a data platform such as a related medicine website or an APP. After the member purchases the medicine, the consumption information of the member can be input into the platform, and the sales data can be conveniently analyzed based on the input and storage of mass sales data.
The member tag data generally includes a member card number, points, last consumption time, member consumption level, and the like, and may further include member personal information such as name, gender, age, occupation, place of birth, place of residence, family members, and work units.
The member historical consumption data may be member consumption data for a period before the current time, such as one month, six months or one year, which is not limited herein. It may include the consumption time, amount, medicine name, grade, place and the like. Member consumption behavior can be analyzed and predicted from the features of the member's historical consumption behavior, and the member value can then be evaluated.
S102, extracting historical consumption characteristics of the members and corresponding predicted label data according to the member label data and the historical consumption data of the members;
the predictive tag data generally refers to a value tag of a member, and can be a specific calculated value for measuring the value of the temperature member.
In one embodiment, the member historical consumption data is divided into first-segment historical consumption data and second-segment historical consumption data, and statistical analysis is performed on the first-segment data by a Spark cluster to obtain the members' historical consumption behavior features and generate multi-dimensional user tag data. The first-segment historical consumption data is used to extract user consumption behavior features, and the second-segment historical consumption data is used to extract the member's value label; a model trained on the two segments can then be used to predict member value labels.
Illustratively, taking 12 months of historical consumption data as an example, the data for months 1 to 11 are aggregated by member on the Spark cluster to compute statistical features, including: the time elapsed since the last purchase; the user's total number of purchases; the user's average amount per purchase; the user's average number of purchases in each of 64 categories, computed according to the time difference; the frequency interval, time interval and amount interval of the user's medicine purchases; and the maximum, minimum, mean and standard deviation of the purchase frequency, of the purchase amount and of the purchase time. Multi-dimensional user tag data can be generated from these statistical features describing user behavior.
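The per-member aggregation described above can be sketched in plain Python. This is a minimal single-machine illustration, not the Spark job of the embodiment; the record layout, the cutoff date and the feature names (`recency_days`, `frequency`, and so on) are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Hypothetical purchase records: (member_id, purchase_date, amount).
records = [
    ("m1", date(2020, 1, 5), 30.0),
    ("m1", date(2020, 3, 9), 50.0),
    ("m1", date(2020, 11, 20), 40.0),
    ("m2", date(2020, 6, 1), 100.0),
]
# Assumed end of the first segment (months 1 to 11 of the example).
cutoff = date(2020, 12, 1)


def member_features(rows, cutoff):
    """Aggregate first-segment purchases into per-member statistics."""
    by_member = defaultdict(list)
    for member_id, day, amount in rows:
        if day < cutoff:  # keep only first-segment records
            by_member[member_id].append((day, amount))

    features = {}
    for member_id, purchases in by_member.items():
        purchases.sort()
        days = [d for d, _ in purchases]
        amounts = [a for _, a in purchases]
        # Days between consecutive purchases (empty for a single purchase).
        gaps = [(b - a).days for a, b in zip(days, days[1:])]
        features[member_id] = {
            "recency_days": (cutoff - days[-1]).days,
            "frequency": len(purchases),
            "avg_amount": mean(amounts),
            "max_amount": max(amounts),
            "min_amount": min(amounts),
            "std_amount": pstdev(amounts),
            "avg_gap_days": mean(gaps) if gaps else None,
        }
    return features


features = member_features(records, cutoff)
```

Each member ends up with one row of statistics, which is the shape a tree model expects as input.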
Preferably, features of the users' medicine purchasing habits are extracted with word2vec, with the purchased products aggregated along the time dimension of each member's purchases. The text of the purchased products is vectorized with word2vec to capture the behavioral features of a member purchasing specific products.
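Before any word2vec training, the purchases have to be arranged into per-member sequences ordered by time. The following sketch shows only that aggregation step, under the assumption that each member's time-ordered product list is treated as one "sentence"; the record layout and product names are hypothetical, and the word2vec training itself is left to whichever implementation is used.

```python
from collections import defaultdict

# Hypothetical purchase log: (member_id, ISO date string, product name).
purchases = [
    ("m1", "2020-03-02", "vitamin c"),
    ("m1", "2020-01-15", "cold remedy"),
    ("m2", "2020-02-01", "eye drops"),
    ("m1", "2020-03-02", "cough syrup"),
]


def purchase_sequences(rows):
    """Group products by member and order them by purchase date."""
    by_member = defaultdict(list)
    for member_id, day, product in rows:
        by_member[member_id].append((day, product))
    # ISO dates sort chronologically as strings; ties fall back to product name.
    return {
        member_id: [product for _, product in sorted(items)]
        for member_id, items in by_member.items()
    }


sequences = purchase_sequences(purchases)
```

The resulting lists can be passed as the training corpus to a word2vec implementation, so that products bought in similar contexts receive similar vectors.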
In one embodiment, the customer unit price, total consumption amount, number of purchases and gross profit are calculated from the second-segment historical consumption data of each user. These four quantities are taken as feature factors and weighted by time decay factors to obtain a label value; the label value is the member label prediction result and is used to measure member value.
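The text does not give the weighting formula, so the following is only one possible reading: a sketch that assumes geometric time decay per period and equal factor weights. The decay rate, the weights and the field names are all hypothetical.

```python
# Hypothetical per-period factor values for one member, oldest period first.
periods = [
    {"unit_price": 40.0, "total_amount": 80.0, "count": 2, "gross_profit": 20.0},
    {"unit_price": 50.0, "total_amount": 50.0, "count": 1, "gross_profit": 15.0},
]

# Assumed equal weights for the four feature factors.
FACTOR_WEIGHTS = {
    "unit_price": 0.25,
    "total_amount": 0.25,
    "count": 0.25,
    "gross_profit": 0.25,
}


def label_value(periods, decay=0.5, weights=FACTOR_WEIGHTS):
    """Weighted sum of the factors, discounted geometrically with age."""
    newest = len(periods) - 1
    score = 0.0
    for index, period in enumerate(periods):
        age = newest - index          # 0 for the most recent period
        time_weight = decay ** age    # assumed time decay factor
        score += time_weight * sum(weights[k] * period[k] for k in weights)
    return score


value = label_value(periods)
```

In practice the four factors sit on different scales, so some normalization before weighting would probably be needed; that step is omitted here.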
S103, removing abnormal feature data based on a 3 sigma rule, and selecting features through an embedded method;
The 3-sigma rule is based on equal-precision repeated measurements that follow a normal distribution, whereas interference or noise that produces singular data rarely follows a normal distribution. If the absolute value of the residual v_i of a measured value in a group of measurements exceeds 3σ, the measured value is regarded as a bad value and should be removed; an error of ±3σ is generally taken as the limit error. Outliers are removed by the 3-sigma rule (P(|x - μ| > 3σ) ≤ 0.003), which reduces the interference of abnormal feature data and helps keep feature selection accurate.
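As a minimal sketch of the rule for a single feature column, assuming the mean and the population standard deviation are estimated from the same data that is being filtered (the function name is illustrative):

```python
from statistics import mean, pstdev


def three_sigma_filter(values):
    """Keep only values within three standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so there is nothing to remove.
        return list(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]


# Twenty ordinary purchase amounts and one extreme value.
amounts = [10.0] * 20 + [1000.0]
cleaned = three_sigma_filter(amounts)
```

Because the outlier itself inflates the estimated standard deviation, the rule only fires when the sample is large enough; with very few points, no value can lie more than 3σ from the mean.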
When the feature data is sampled, the ratio of positive and negative samples can be 1: 5.
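One way to reach such a ratio is to downsample the larger class. The sketch below assumes that label 1 marks positive samples, that negatives are the majority class, and that rows are `(features, label)` pairs; none of this is specified in the text.

```python
import random


def downsample_negatives(rows, ratio=5, seed=0):
    """Keep all positives and at most `ratio` negatives per positive."""
    positives = [row for row in rows if row[1] == 1]
    negatives = [row for row in rows if row[1] == 0]
    keep = min(len(negatives), ratio * len(positives))
    sampled = random.Random(seed).sample(negatives, keep)
    return positives + sampled


# Hypothetical data: 2 positive rows and 50 negative rows.
rows = [(i, 1) for i in range(2)] + [(i, 0) for i in range(2, 52)]
balanced = downsample_negatives(rows)
```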
For the constructed tree prediction model, because tree models are insensitive to normalization and discretization, the preprocessing consists of default-value filling and log transformation of the feature data, which ensures the completeness and reliability of the feature data.
Some labels and features do not necessarily follow a normal distribution, whereas in practice the data are expected to follow one; a log transformation is therefore applied to the features so that the data conform to a normal distribution to a certain extent.
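A minimal sketch of these two steps, assuming missing values are represented as `None`, the default value is 0, and the features are non-negative so that `log(1 + x)` is defined; the choice of `log1p` rather than a plain logarithm is an assumption made here to handle zeros.

```python
import math


def fill_and_log(values, default=0.0):
    """Replace missing values with a default, then apply log(1 + x)."""
    filled = [default if v is None else v for v in values]
    return [math.log1p(v) for v in filled]


transformed = fill_and_log([None, 0.0, math.e - 1])
```

The transform compresses long right tails, which is typical of amounts and counts, so the resulting distribution is closer to symmetric.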
The embedded method, i.e. embedded feature selection, selects features by learning each feature's contribution to model accuracy. An embedded method is constructed for feature selection; the features accounting for the top 90% of the weight can be retained, and the model is then trained on those features. The embedded method can effectively reduce the load on the server when processing massive data.
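The retention rule can be read in more than one way. The sketch below assumes it means keeping the highest-weighted features until their cumulative weight reaches 90% of the total, with the weights coming from a previously fitted model; the feature names and weights are hypothetical.

```python
def select_by_cumulative_importance(importances, keep=0.9):
    """Return the top-ranked features whose weights sum to `keep` of the total."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, weight in ranked:
        if cumulative >= keep * total:
            break
        selected.append(name)
        cumulative += weight
    return selected


# Hypothetical importance weights from a fitted tree model.
importances = {"recency": 50, "frequency": 30, "avg_amount": 15, "std_gap": 5}
kept = select_by_cumulative_importance(importances)
```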
S104, constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
The LightGBM algorithm grows trees with a leaf-wise strategy: at each step, the leaf with the largest split gain among all current leaves is split, which, according to this embodiment, effectively reduces the error and avoids overfitting of the model.
Preferably, the cleaned feature data is divided into a training set and a validation set at a ratio of 8:2, and the optimal number of iterations best_n_estimators is obtained with the LightGBM native interface lgb.train() at a relatively high learning rate (0.1 or higher), so that running speed is maintained.
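The 8:2 division can be sketched as a seeded shuffle followed by a cut. Shuffling and the fixed seed are assumptions added for reproducibility; the text only gives the ratio.

```python
import random


def train_valid_split(rows, train_ratio=0.8, seed=42):
    """Shuffle the rows reproducibly and split them at `train_ratio`."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


train_rows, valid_rows = train_valid_split(range(10))
```

The validation part is what an early-stopping style run of lgb.train() would monitor to choose the number of iterations.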
S105, after tuning the parameters by GridSearch (grid search), substituting the optimal parameters into the LightGBM model, and predicting member labels based on the LightGBM model.
GridSearch is a parameter tuning method that exhaustively searches the candidate parameters and selects the best-performing combination as the result. Specifically, AUC (Area Under the Curve) is used as the model evaluation metric, k-fold cross-validation is used to compute the optimal parameters of the LightGBM model, and the learning rate of the LightGBM model is then reduced to obtain the optimal number of iterations.
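The two ingredients named here, AUC and k-fold cross-validation, can be illustrated without any library. The sketch computes AUC by pairwise comparison, which is quadratic and only suitable for small samples, and builds folds by striding over the indices; both function names are illustrative.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, valid_indices) pairs for k-fold validation."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


example_auc = auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
splits = list(k_fold_indices(10, k=5))
```

A candidate parameter set would be scored by training on each `train` part, computing the AUC on the matching `valid` part, and averaging the k results.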
Illustratively, with AUC as the evaluation metric, grid search with 5-fold cross-validation is used to optimize the following parameters in sequence: (1) the maximum depth max_depth and the number of leaf nodes num_leaves; (2) the minimum number of samples per leaf min_child_samples and the minimum sum of sample weights per leaf min_child_weight; (3) the sample sampling ratio bagging_fraction and the feature sampling ratio feature_fraction; (4) the L1 regularization parameter reg_alpha and the L2 regularization parameter reg_lambda. The learning rate is then reduced to obtain the optimal number of iterations best_n_estimators.
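The sequential tuning order can be sketched as a staged grid search in which each stage fixes the best values found so far. The candidate values below are hypothetical, and `toy_score` is only a stand-in for the real scorer, which would be the mean 5-fold AUC of a LightGBM model trained with the candidate parameters.

```python
from itertools import product

# The four stages named in the text; the candidate values are assumptions.
STAGES = [
    {"max_depth": [4, 6, 8], "num_leaves": [15, 31, 63]},
    {"min_child_samples": [10, 20, 40], "min_child_weight": [0.001, 0.01]},
    {"bagging_fraction": [0.7, 0.9], "feature_fraction": [0.7, 0.9]},
    {"reg_alpha": [0.0, 0.1], "reg_lambda": [0.0, 0.1]},
]


def staged_grid_search(stages, score_fn, base_params=None):
    """Tune one group of parameters at a time, keeping earlier winners."""
    best = dict(base_params or {})
    for grid in stages:
        names = list(grid)
        best_score, best_candidate = None, None
        for combo in product(*(grid[name] for name in names)):
            candidate = {**best, **dict(zip(names, combo))}
            score = score_fn(candidate)
            if best_score is None or score > best_score:
                best_score, best_candidate = score, candidate
        best = best_candidate
    return best


def toy_score(params):
    """Stand-in scorer that peaks at depth 6, 31 leaves, 20 samples per leaf."""
    return -(
        abs(params.get("max_depth", 0) - 6)
        + abs(params.get("num_leaves", 0) - 31)
        + abs(params.get("min_child_samples", 0) - 20)
    )


best_params = staged_grid_search(STAGES, toy_score)
```

Tuning in stages evaluates far fewer combinations than one joint grid over all eight parameters, at the cost of possibly missing interactions between parameters in different stages.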
The tuned parameters are substituted into the LightGBM model, the model is saved, and member labels are predicted with the LightGBM model.
With the method provided by this embodiment, the member temperature is characterized at a finer and more accurate granularity. The user's activity over a future period (such as one month) is simulated in real time from the member's own features, time series features, sales statistical features and the like, and the activity index can be adjusted dynamically as the user's real-time consumption behavior changes, so that member behavior is monitored dynamically. For members with a higher temperature index, member value can be further maximized through a marketing strategy.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 2 is a schematic structural diagram of a temperature membership tag prediction apparatus according to an embodiment of the present invention, the apparatus including:
the obtaining module 210 is used for obtaining member tag data and member historical consumption data from a big data platform data source;
the extracting module 220 is configured to extract the member historical consumption characteristics and the corresponding predicted tag data according to the member tag data and the member historical consumption data;
specifically, the extracting module 220 includes:
the dividing unit is used for dividing the member historical consumption data into first subsection historical consumption data and second subsection historical consumption data;
and the counting unit is used for carrying out statistical analysis on the first subsection historical consumption data through the Spark cluster, acquiring the historical consumption behavior characteristics of the member and generating the multi-dimensional user tag data.
Optionally, the extracting module 220 further includes:
and the aggregation unit is used for performing characteristic extraction on the medicine purchasing habits of the users through word2vec and aggregating the medicine purchasing commodities according to the time dimension of member medicine purchasing.
Further, the dividing unit further includes:
the calculating unit is used for calculating the customer unit price, total consumption amount, number of purchases and gross profit corresponding to the second-segment historical consumption data of each user;
and the calculating unit is used for taking the customer unit price, total consumption amount, number of purchases and gross profit as feature factors and weighting them by time decay factors to calculate the label value, wherein the label value is the member label prediction result and is used to measure member value.
The clearing module 230 is used for clearing the abnormal feature data based on the 3 sigma rule and selecting features through an embedded method;
the training module 240 is used for constructing a LightGBM algorithm, selecting a predetermined proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
and a parameter adjusting module 250, configured to substitute the optimal parameters into the LightGBM model after tuning the parameters by GridSearch (grid search), so as to perform member label prediction based on the LightGBM model.
Optionally, the adjusting parameters through the GridSearch grid includes:
calculating the optimal parameters of the LightGBM model by using AUC as the model evaluation metric and adopting k-fold cross-validation; and reducing the learning rate of the LightGBM model to obtain the optimal number of iterations.
In one embodiment of the present invention, an electronic device for temperature membership tag prediction is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps S101 to S105 as in embodiments of the present invention when executing the computer program.
There is also provided in an embodiment of the present invention a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the temperature membership tag prediction method provided in the above embodiment, the non-transitory computer readable storage medium including: ROM/RAM, magnetic disk, optical disk, etc.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for predicting a temperature membership label, comprising:
acquiring member tag data and member historical consumption data from a big data platform data source;
extracting historical consumption characteristics of the members and corresponding predicted label data according to the member label data and the historical consumption data of the members;
removing abnormal feature data based on a 3 sigma rule, and selecting features by an embedded method;
constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
after tuning the parameters by GridSearch (grid search), substituting the optimal parameters into the LightGBM model, and predicting member labels based on the LightGBM model.
2. The method of claim 1, wherein extracting the member historical consumption features and the corresponding predicted label data according to the member label data and the member historical consumption data comprises:
dividing the member historical consumption data into first-segment historical consumption data and second-segment historical consumption data; and
performing statistical analysis on the first-segment historical consumption data through a Spark cluster to obtain the historical consumption behavior features of the members and generate multi-dimensional user label data.
3. The method of claim 1, wherein extracting the member historical consumption features and the corresponding predicted label data according to the member label data and the member historical consumption data further comprises:
performing feature extraction on the medicine purchasing habits of the users through word2vec, and aggregating the purchased medicines according to the time dimension of the members' medicine purchases.
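One plausible reading of the aggregation in claim 3 is that each member's purchased medicines are ordered by purchase time to form a sequence, which can then serve as a "sentence" for a word2vec implementation. The sketch below covers only that preparation step; the record layout `(member_id, purchase_date, drug_name)` and the function name `purchase_sequences` are assumptions made for illustration.

```python
from collections import defaultdict


def purchase_sequences(records):
    """Group purchases by member and order each member's medicines by date.

    records: iterable of (member_id, purchase_date, drug_name), where
    purchase_date is an ISO "YYYY-MM-DD" string (sorts chronologically).
    """
    by_member = defaultdict(list)
    for member_id, purchase_date, drug_name in records:
        by_member[member_id].append((purchase_date, drug_name))
    # Sorting the (date, drug) pairs orders each member's purchases in time.
    return {m: [drug for _, drug in sorted(items)] for m, items in by_member.items()}


# Hypothetical purchase log.
records = [
    ("m1", "2020-03-01", "ibuprofen"),
    ("m2", "2020-01-15", "vitamin_c"),
    ("m1", "2020-01-10", "aspirin"),
    ("m1", "2020-02-05", "vitamin_c"),
]
sequences = purchase_sequences(records)
```

Each resulting list plays the role of a sentence and each medicine the role of a word, so medicines that are bought in similar contexts would end up with similar embedding vectors.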
4. The method of claim 2, wherein dividing the member historical consumption data into the first-segment historical consumption data and the second-segment historical consumption data further comprises:
calculating the customer unit price, total consumption amount, consumption count, and gross profit corresponding to the second-segment historical consumption data of the users; and
taking the customer unit price, the total consumption amount, the consumption count, and the gross profit as feature factors, and weighting the feature factors by a time decay factor to obtain a label value, wherein the label value is the member label prediction result and is used to measure member value.
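The weighting in claim 4 can be sketched as follows. The claim does not give the decay formula or the factor weights, so this example assumes an exponential decay `decay ** months_ago` and equal weights; both, along with the function name `label_value`, are hypothetical. Scaling of the four factors to a common range is omitted for brevity.

```python
def label_value(orders, decay=0.9, weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine four time-decayed factors into one label value for a member.

    orders: iterable of (months_ago, amount, gross_profit) tuples.
    """
    total = count = profit = 0.0
    for months_ago, amount, gross_profit in orders:
        w = decay ** months_ago  # older orders contribute less
        total += w * amount
        count += w
        profit += w * gross_profit
    if count == 0:
        return 0.0  # no orders: no value to measure
    unit_price = total / count  # decayed average spend per order
    factors = (unit_price, total, count, profit)
    return sum(wf * f for wf, f in zip(weights, factors))


# Same two orders, placed recently versus six months earlier.
recent = label_value([(0, 100.0, 20.0), (1, 50.0, 10.0)], decay=0.5)
stale = label_value([(6, 100.0, 20.0), (7, 50.0, 10.0)], decay=0.5)
```

With identical order amounts, the member whose purchases are more recent receives the higher label value, which is the intended effect of the time decay factor.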
5. The method of claim 1, wherein tuning the parameters through GridSearch comprises:
calculating the optimal parameters of the LightGBM model by using AUC as the model evaluation index and adopting k-fold cross-validation; and
reducing the learning rate of the LightGBM model and obtaining the optimal number of iterations.
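The tuning procedure in claim 5 (grid search scored by AUC under k-fold cross-validation) can be illustrated without any external library. In this sketch the model is a deliberately trivial stand-in, not LightGBM: `fit_single_feature` scores a row by one feature column chosen by a hypothetical `feature` parameter. All function names and the toy data are assumptions for illustration.

```python
from itertools import product
from statistics import mean


def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5).

    Requires at least one positive and one negative label.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_folds(n, k):
    """Yield (train_idx, valid_idx); fold i holds every k-th index starting at i."""
    for i in range(k):
        valid = list(range(i, n, k))
        held_out = set(valid)
        yield [j for j in range(n) if j not in held_out], valid


def grid_search(fit, grid, X, y, k=3):
    """Return (best_params, best_mean_auc) over every combination in grid."""
    best_params, best_auc = None, -1.0
    names = sorted(grid)
    for combo in product(*(grid[name] for name in names)):
        params = dict(zip(names, combo))
        fold_aucs = []
        for train, valid in k_folds(len(X), k):
            score = fit(params, [X[j] for j in train], [y[j] for j in train])
            fold_aucs.append(auc([y[j] for j in valid], [score(X[j]) for j in valid]))
        if mean(fold_aucs) > best_auc:
            best_params, best_auc = params, mean(fold_aucs)
    return best_params, best_auc


def fit_single_feature(params, X_train, y_train):
    """Toy stand-in model: ignores training data, scores a row by one column."""
    col = params["feature"]
    return lambda row: row[col]


X = [[1, 5], [2, 3], [3, 6], [4, 2], [5, 4], [6, 1]]
y = [0, 0, 0, 1, 1, 1]
best_params, best_auc = grid_search(fit_single_feature, {"feature": [0, 1]}, X, y, k=3)
```

In practice the `fit` callable would train the real model with the candidate parameters on the training folds; the surrounding loop structure stays the same.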
6. An apparatus for temperature member label prediction, comprising:
an acquisition module, configured to acquire member label data and member historical consumption data from a data source of a big data platform;
an extraction module, configured to extract member historical consumption features and corresponding predicted label data according to the member label data and the member historical consumption data;
a cleaning module, configured to remove abnormal feature data based on the 3-sigma rule and to select features by an embedded method;
a training module, configured to construct a LightGBM algorithm, select a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determine model parameters; and
a parameter tuning module, configured to substitute the optimal parameters into the LightGBM model after tuning the parameters through GridSearch (grid search), so as to perform member label prediction based on the LightGBM model.
7. The apparatus of claim 6, wherein the extraction module comprises:
a dividing unit, configured to divide the member historical consumption data into first-segment historical consumption data and second-segment historical consumption data; and
a statistics unit, configured to perform statistical analysis on the first-segment historical consumption data through a Spark cluster, obtain the historical consumption behavior features of the members, and generate multi-dimensional user label data.
8. The apparatus of claim 7, wherein the dividing unit further comprises:
a calculating unit, configured to calculate the customer unit price, total consumption amount, consumption count, and gross profit corresponding to the second-segment historical consumption data of the users; and
a calculating unit, configured to take the customer unit price, the total consumption amount, the consumption count, and the gross profit as feature factors and to weight the feature factors by a time decay factor to calculate a label value, wherein the label value is the member label prediction result and is used to measure member value.
9. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method for temperature member label prediction according to any one of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for temperature member label prediction according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010569862.1A CN111913940B (en) | 2020-06-20 | 2020-06-20 | Temperature membership tag prediction method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010569862.1A CN111913940B (en) | 2020-06-20 | 2020-06-20 | Temperature membership tag prediction method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111913940A (en) | 2020-11-10 |
CN111913940B (en) | 2024-04-26 |
Family
ID=73226101
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010569862.1A Active CN111913940B (en) | 2020-06-20 | 2020-06-20 | Temperature membership tag prediction method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111913940B (en) |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107437199A (en) * | 2017-06-16 | 2017-12-05 | 北京小度信息科技有限公司 | Platform earnings forecast method and device |
CN107665448A (en) * | 2017-09-29 | 2018-02-06 | 北京京东尚科信息技术有限公司 | For determining the method, apparatus and storage medium of consumption contributed value |
CN108109063A (en) * | 2017-12-07 | 2018-06-01 | 上海点融信息科技有限责任公司 | For the method, apparatus and computer readable storage medium of prediction label predicted value |
US20190228397A1 (en) * | 2018-01-25 | 2019-07-25 | The Bartley J. Madden Foundation | Dynamic economizer methods and systems for improving profitability, savings, and liquidity via model training |
CN109522372A (en) * | 2018-11-21 | 2019-03-26 | 北京交通大学 | The prediction technique of civil aviaton field passenger value |
CN109583949A (en) * | 2018-11-22 | 2019-04-05 | 中国联合网络通信集团有限公司 | A kind of user changes planes prediction technique and system |
CN109741114A (en) * | 2019-01-10 | 2019-05-10 | 博拉网络股份有限公司 | A kind of user under big data financial scenario buys prediction technique |
CN110223166A (en) * | 2019-06-14 | 2019-09-10 | 哈尔滨哈银消费金融有限责任公司 | The prediction technique and equipment of consumer finance user's overdue loan based on big data |
CN111144935A (en) * | 2019-12-17 | 2020-05-12 | 武汉海云健康科技股份有限公司 | Big data-based sleep member awakening method and system, server and medium |
Non-Patent Citations (1)
Title |
---|
李征仁;张晓航;王光辉;石东海;石文华;: "基于Pareto/NBD的用户价值预测模型研究", 北京邮电大学学报(社会科学版), no. 03, pages 7 - 14 * |
Also Published As
Publication number | Publication date |
---|---|
CN111913940B (en) | 2024-04-26 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US11663493B2 (en) | Method and system of dynamic model selection for time series forecasting | |
CN108564286B (en) | Artificial intelligent financial wind-control credit assessment method and system based on big data credit investigation | |
US10387900B2 (en) | Methods and apparatus for self-adaptive time series forecasting engine | |
CN106485562B (en) | Commodity information recommendation method and system based on user historical behaviors | |
US20210103858A1 (en) | Method and system for model auto-selection using an ensemble of machine learning models | |
CN108665311B (en) | Electric commercial user time-varying feature similarity calculation recommendation method based on deep neural network | |
CN113469730A (en) | Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene | |
CN110930179A (en) | Task evaluation method, system, device and computer readable storage medium | |
CN115063035A (en) | Customer evaluation method, system, equipment and storage medium based on neural network | |
Denk et al. | Avoid filling Swiss cheese with whipped cream: imputation techniques and evaluation procedures for cross-country time series | |
CN110544052A (en) | method and device for displaying relationship network diagram | |
CN112434862B (en) | Method and device for predicting financial dilemma of marketing enterprises | |
CN113554350A (en) | Activity evaluation method and apparatus, electronic device and computer readable storage medium | |
US11960499B2 (en) | Sales data processing apparatus, method, and medium storing program for sales prediction | |
CN111913940B (en) | Temperature membership tag prediction method and device, electronic equipment and storage medium | |
Bernat et al. | Modelling customer lifetime value in a continuous, non-contractual time setting | |
Shanti et al. | Machine Learning-Powered Mobile App for Predicting Used Car Prices | |
US20230230143A1 (en) | Product recommendation system, product recommendation method, and recordingmedium storing product recommendation program | |
RU2480828C1 (en) | Method of predicting target value of events based on unlimited number of characteristics | |
CN114612132A (en) | Client renewal prediction method based on machine learning and related equipment | |
CN114817741A (en) | Financial product accurate recommendation method and device | |
CN114282951A (en) | Network retail prediction method, equipment and medium | |
KR102499687B1 (en) | E-commerce product sales store management system based on bigdata and mehotd of automatic analysis of the seller's product page thereof | |
Abbassy | Using Machine Learning Technique for Analytical Customer Loyalty | |
CN113763032B (en) | Commodity purchase intention recognition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||