CN111913940A - Temperature member label prediction method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111913940A
Authority
CN
China
Prior art keywords
data
historical consumption
model
label
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010569862.1A
Other languages
Chinese (zh)
Other versions
CN111913940B (en)
Inventor
黎云
袁冲
余军
沈章
吕静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Haiyun Health Technology Co ltd
Original Assignee
Wuhan Haiyun Health Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Haiyun Health Technology Co ltd
Priority to CN202010569862.1A
Publication of CN111913940A
Application granted
Publication of CN111913940B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207 Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0224 Discounts or incentives, e.g. coupons or rebates based on user history

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Economics (AREA)
  • General Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Probability & Statistics with Applications (AREA)
  • General Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a temperature member label prediction method and device, electronic equipment and a storage medium. The method comprises: acquiring member tag data and member historical consumption data from a big data platform data source; extracting the members' historical consumption features and the corresponding predicted label data from the member tag data and the member historical consumption data; removing abnormal feature data based on the 3 sigma rule and selecting features by an embedded method; constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining the model parameters; and, after tuning the parameters by grid search (GridSearch), substituting the optimal parameters into the LightGBM model and predicting member labels with that model. This scheme addresses the inaccurate definition of existing temperature member labels, allows member labels to be defined accurately, and allows the consumption behavior and value of temperature members to be monitored and predicted in real time.

Description

Temperature member label prediction method and device, electronic equipment and storage medium
Technical Field
The invention relates to the field of medical big data, in particular to a temperature member label prediction method and device, electronic equipment and a storage medium.
Background
Pharmaceutical retail enterprises generally have a large and highly mobile membership: thousands of members buy the medicines they need every day, and their real willingness and capacity to consume change constantly. Managers of such enterprises therefore need a management standard, namely member temperature, to monitor an indicator that tracks the real-time changes of the large number of members in a store, so that the state of the store's members can be known in real time and accurate operating decisions can be made.
Temperature member labels make it convenient to classify customers and determine customer value. At present, a member's temperature label is usually defined directly from the perspective of commercial profit, for example as a direct reflection of customer unit price or gross profit. A temperature label defined this way reflects only the member's historical or current profit value and is not accurate with respect to the member's future value.
Disclosure of Invention
In view of the above, embodiments of the present invention provide a temperature member label prediction method, an apparatus, an electronic device and a storage medium, so as to solve the problem of inaccurate definition of temperature member labels in the prior art.
In a first aspect of an embodiment of the present invention, a method for predicting a temperature member label is provided, including:
acquiring member tag data and member historical consumption data from a big data platform data source;
extracting historical consumption characteristics of the members and corresponding predicted label data according to the member label data and the historical consumption data of the members;
removing abnormal feature data based on a 3 sigma rule, and selecting features by an embedded method;
constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
after tuning the parameters by grid search (GridSearch), substituting the optimal parameters into the LightGBM model, and performing member label prediction based on the LightGBM model.
In a second aspect of an embodiment of the present invention, there is provided an apparatus for temperature member label prediction, including:
the acquisition module is used for acquiring member tag data and member historical consumption data from a big data platform data source;
the extraction module is used for extracting the historical consumption characteristics of the members and the corresponding predicted tag data according to the member tag data and the historical consumption data of the members;
the clearing module is used for clearing abnormal feature data based on a 3 sigma rule and selecting features through an embedded method;
the training module is used for constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train the LightGBM model, and determining model parameters;
and the parameter adjusting module is used for substituting the optimal parameters into the LightGBM model after tuning the parameters by grid search (GridSearch), so as to predict the member label based on the LightGBM model.
In a third aspect of the embodiments of the present invention, there is provided an electronic device, including a memory, a processor, and a computer program stored in the memory and executable by the processor, where the processor executes the computer program to implement the steps of the method according to the first aspect of the embodiments of the present invention.
In a fourth aspect of the embodiments of the present invention, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor implements the steps of the method provided by the first aspect of the embodiments of the present invention.
In the embodiments of the invention, behavior features and historical prediction values are extracted from the members' historical consumption data; after outliers in the historical features are eliminated and features are selected, a LightGBM model is trained, its parameters are tuned by grid search (GridSearch), and member labels are predicted by the LightGBM model with the optimal parameters. Temperature members can thus be characterized accurately, member behavior can be predicted in real time, member value can be maximized by adjusting the marketing strategy, and the problem of low accuracy of temperature member labels is solved. Compared with the traditional approach of reflecting user activity through indicators such as a pharmaceutical enterprise's average gross profit, the characterization is finer-grained and more accurate, the defined label reflects customer value directly and comprehensively, and the label can be adjusted dynamically according to the user's real-time behavior, so that member behavior is monitored dynamically and the practical value is higher.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart illustrating a method for predicting a temperature member label according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a temperature member label prediction device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by persons skilled in the art without any inventive work shall fall within the protection scope of the present invention, and the principle and features of the present invention shall be described below with reference to the accompanying drawings.
The terms "comprises" and "comprising," when used in this specification and claims, and in the accompanying drawings and figures, are intended to cover non-exclusive inclusions, such that a process, method or system, or apparatus that comprises a list of steps or elements is not limited to the listed steps or elements.
At present, member temperature is defined directly from the perspective of commercial profit; for example, a pharmaceutical enterprise's profit and sales volume are reflected directly through customer unit price or gross profit. This approach cannot reflect the commercial value brought by each member. Member temperature defined this way is mostly based on previously published financial statements, so it cannot reflect future or real-time changes in members and offers limited help for a manager's future operating decisions and direction; moreover, the member label does not reflect the member's consumption behavior. The definition of a temperature member should therefore integrate the member's historical consumption behavior and reflect the member's value in real time.
Referring to FIG. 1, a flow chart of a temperature member label prediction method according to an embodiment of the present invention includes:
S101, acquiring member tag data and member historical consumption data from a big data platform data source;
The big data platform data source may be medicine sales data and member data stored on a server of a data platform such as a related medicine website or app. After a member purchases medicine, the member's consumption information can be entered into the platform; entering and storing massive sales data in this way makes the sales data convenient to analyze.
The member tag data generally includes a member card number, points, last consumption time, member consumption level, and the like, and may further include member personal information such as name, gender, age, occupation, place of birth, place of residence, family members, and employer.
The member historical consumption data may be member consumption data for a period before the current time, such as one month, six months or one year, which is not limited herein. The member historical consumption data may include consumption time, amount, medicine name, grade, place and the like. Member consumption behavior can be analyzed and predicted from the features of the member's historical consumption behavior, and the member's value can then be evaluated.
S102, extracting historical consumption features of the members and the corresponding predicted label data according to the member tag data and the member historical consumption data;
The predicted label data generally refers to a member's value label, which can be a specific calculated value that measures the value of the temperature member.
In one embodiment, the member historical consumption data is divided into first-segment historical consumption data and second-segment historical consumption data, and statistical analysis is performed on the first-segment data by a Spark cluster to obtain the members' historical consumption behavior features and generate multi-dimensional user tag data. The first segment is used to extract user consumption behavior features and the second segment is used to extract the members' value labels; once a model has been trained on the two segments, it can be used to predict member value labels.
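The two-segment division can be illustrated with a minimal sketch. It assumes each purchase is a plain dictionary with a `date` field and that the segments are separated by a single cutoff date; the record layout, the function name `split_history` and the sample values are hypothetical, and the sketch runs on one machine rather than on a Spark cluster.

```python
from datetime import date

def split_history(records, cutoff):
    """Split purchase records into a feature window and a label window."""
    first = [r for r in records if r["date"] < cutoff]    # used for behavior features
    second = [r for r in records if r["date"] >= cutoff]  # used for the value label
    return first, second

# Hypothetical sample records.
records = [
    {"member": "m1", "date": date(2019, 3, 5), "amount": 42.0},
    {"member": "m1", "date": date(2019, 12, 9), "amount": 18.5},
    {"member": "m2", "date": date(2019, 11, 30), "amount": 7.0},
]
first_segment, second_segment = split_history(records, cutoff=date(2019, 12, 1))
```

Keeping the label window strictly after the feature window means the features never contain information from the period whose value is being predicted.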
Illustratively, taking 12 months of historical consumption data as an example, the Spark cluster aggregates statistical features per member over the first 11 months (months 1 to 11). These features include: the time elapsed since the last purchase; the user's total purchase frequency; the user's average amount per purchase; and, computed over time differences, the user's average number of purchases in each of 64 categories, the intervals of the user's medicine purchases in frequency, time and amount, and the maximum, minimum, mean and standard deviation of purchase frequency, of purchase amount and of purchase time, among other statistics. Multi-dimensional user tag data can be generated from these statistical features describing user behavior.
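A few of the statistics listed above can be sketched in plain Python. This is an illustration only: the embodiment performs the aggregation on a Spark cluster, whereas the sketch uses the standard library, and the record layout, feature names and sample values are assumptions.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

def member_features(records, as_of):
    """Aggregate per-member recency, frequency and amount statistics."""
    by_member = defaultdict(list)
    for r in records:
        by_member[r["member"]].append(r)
    features = {}
    for member, rows in by_member.items():
        rows.sort(key=lambda r: r["date"])
        amounts = [r["amount"] for r in rows]
        # Days between consecutive purchases (empty for a single purchase).
        gaps = [(b["date"] - a["date"]).days for a, b in zip(rows, rows[1:])]
        features[member] = {
            "recency_days": (as_of - rows[-1]["date"]).days,
            "frequency": len(rows),
            "amount_mean": mean(amounts),
            "amount_min": min(amounts),
            "amount_max": max(amounts),
            "amount_std": pstdev(amounts),
            "gap_mean": mean(gaps) if gaps else None,
        }
    return features

# Hypothetical sample records.
records = [
    {"member": "m1", "date": date(2019, 1, 10), "amount": 30.0},
    {"member": "m1", "date": date(2019, 1, 20), "amount": 50.0},
    {"member": "m2", "date": date(2019, 6, 1), "amount": 12.0},
]
feats = member_features(records, as_of=date(2019, 12, 1))
```

The per-category counts and the remaining interval statistics would follow the same group-then-aggregate pattern.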
Preferably, features of the users' medicine purchasing habits are extracted with word2vec: the purchased commodities are aggregated along the time dimension of each member's purchases, and the commodity text is given a vector representation by word2vec, so as to capture the behavioral features of a member purchasing specific commodities.
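The aggregation step that precedes word2vec can be sketched as follows. The sketch only builds one time-ordered sequence of product tokens per member, which is the kind of "sentence" a word2vec implementation consumes; the embedding training itself is not shown, and the record layout and product names are hypothetical.

```python
from collections import defaultdict
from datetime import date

def purchase_sentences(records):
    """Return one time-ordered list of product tokens per member."""
    by_member = defaultdict(list)
    for r in records:
        by_member[r["member"]].append((r["date"], r["product"]))
    # Sorting the (date, product) pairs orders each member's purchases by time.
    return {m: [p for _, p in sorted(rows)] for m, rows in by_member.items()}

# Hypothetical sample records.
records = [
    {"member": "m1", "date": date(2019, 2, 1), "product": "vitamin_c"},
    {"member": "m1", "date": date(2019, 1, 5), "product": "ibuprofen"},
    {"member": "m2", "date": date(2019, 3, 3), "product": "bandage"},
]
sentences = purchase_sentences(records)
```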
In one embodiment, the customer unit price, total consumption amount, number of purchases and gross profit corresponding to the second segment of the user's historical consumption data are calculated. Taking the customer unit price, total consumption amount, number of purchases and gross profit as feature factors, the feature factors are weighted according to time decay factors to obtain a label value; the label value is the member label prediction result and is used to measure member value.
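The text does not give the weighting formula, so the following is only one possible reading: the second segment is cut into periods, each period's four factors are combined by a weighted sum, and the period scores are discounted exponentially by age. The exponential form, the `decay` value, the factor weights and the field names are all assumptions.

```python
def label_value(periods, weights, decay=0.9):
    """Time-decayed weighted sum of factor values; periods[0] is the most recent."""
    total = 0.0
    for age, factors in enumerate(periods):
        score = sum(weights[name] * factors[name] for name in weights)
        total += (decay ** age) * score
    return total

# Hypothetical factor values for two periods, most recent first.
periods = [
    {"customer_unit_price": 10.0, "total_amount": 20.0, "purchase_count": 2.0, "gross_profit": 8.0},
    {"customer_unit_price": 10.0, "total_amount": 10.0, "purchase_count": 1.0, "gross_profit": 4.0},
]
weights = {"customer_unit_price": 1.0, "total_amount": 1.0, "purchase_count": 1.0, "gross_profit": 1.0}
value = label_value(periods, weights, decay=0.5)
```

With these inputs the most recent period contributes its full score and the older period half of its score.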
S103, removing abnormal feature data based on a 3 sigma rule, and selecting features through an embedded method;
The 3 sigma rule is based on repeated measurements of equal precision that follow a normal distribution, whereas interference or noise in singular data rarely follows a normal distribution. If the absolute value of the residual $v_i$ of a measurement in a set of measured data exceeds $3\sigma$, that measurement is a bad value and should be eliminated; an error of $\pm 3\sigma$ is generally taken as the limit error. Abnormal points are eliminated by the 3 sigma rule ($P(|x - \mu| > 3\sigma) \le 0.003$), which reduces interference from abnormal feature data and safeguards the accuracy of feature selection.
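The rule above can be sketched for a single feature column. The sketch assumes the mean and the population standard deviation are estimated from the same column that is being filtered; the function name is hypothetical.

```python
from statistics import mean, pstdev

def three_sigma_filter(values):
    """Keep only the values within three standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [v for v in values if abs(v - mu) <= 3 * sigma]

# Twenty ordinary values and one extreme value.
cleaned = three_sigma_filter([10.0] * 20 + [1000.0])
```

Because the outlier itself inflates the estimated standard deviation, the rule only fires when the column has enough ordinary observations; with very few samples no value can lie more than three standard deviations from the mean.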
When the feature data is sampled, the ratio of positive to negative samples can be 1:5.
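The text does not say how the 1:5 ratio is reached. One common way, shown here purely as an assumption, is to keep every positive sample and randomly downsample the negatives; the binary `label` field, the function name and the seed are hypothetical.

```python
import random

def downsample_negatives(samples, ratio=5, seed=0):
    """Keep all positives and at most `ratio` negatives per positive."""
    positives = [s for s in samples if s["label"] == 1]
    negatives = [s for s in samples if s["label"] == 0]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    keep = min(len(negatives), ratio * len(positives))
    return positives + rng.sample(negatives, keep)

# Hypothetical data: 2 positives and 20 negatives.
samples = [{"id": i, "label": 1 if i < 2 else 0} for i in range(22)]
balanced = downsample_negatives(samples)
```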
Because the constructed prediction model is tree-based, and tree models are insensitive to normalization and discretization, preprocessing consists of filling missing values with defaults and applying a log transform to the data, which ensures the completeness and reliability of the feature data.
The distributions of some labels and features do not necessarily follow a normal distribution, whereas in practice the data is expected to do so; a log transform is therefore applied to the features so that the data approximates a normal distribution to some extent.
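A minimal sketch of the two preprocessing steps, under stated assumptions: missing values are represented as `None`, the default is 0, feature values are non-negative, and the transform is log(1 + x) so that zero stays defined. None of these specifics are given in the text.

```python
import math

def fill_and_log(values, default=0.0):
    """Replace missing values (None) with a default, then apply log(1 + x)."""
    return [math.log1p(default if v is None else v) for v in values]

# Hypothetical purchase amounts with one missing entry.
transformed = fill_and_log([0.0, None, 99.0])
```

The log transform compresses the long right tail typical of amount and frequency features, which brings them closer to the symmetric shape the 3 sigma rule assumes.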
The embedded method, i.e. embedded selection, selects features by learning each feature's contribution to model accuracy. When an embedded method is constructed for feature selection, the features accounting for the top 90% of the weight can be retained and the model then trained on those features. The embedded method effectively reduces the load on the server when processing massive data.
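The retention rule can be sketched as follows, reading "the top 90% of the weight" as cumulative importance; that reading is an assumption, as the text could also mean the top 90% of features by rank. The importances would come from a fitted model, and the values and feature names below are hypothetical.

```python
def select_by_importance(importances, keep_share=0.9):
    """Keep the most important features until they cover `keep_share` of the total."""
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, weight in ranked:
        if cumulative >= keep_share * total:
            break
        selected.append(name)
        cumulative += weight
    return selected

# Hypothetical importances, e.g. split gains reported by a trained tree model.
importances = {"recency_days": 50, "frequency": 30, "amount_mean": 15, "gap_mean": 5}
kept = select_by_importance(importances)
```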
S104, constructing a LightGBM algorithm, selecting a predetermined proportion of the cleaned feature data as training data to train the LightGBM model, and determining model parameters;
The LightGBM algorithm grows trees with a leaf-wise strategy: at each step, the leaf with the largest split gain among all current leaves is found and split, so that error can be effectively reduced and overfitting of the model avoided.
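The leaf-wise strategy can be illustrated with a toy example. This is not LightGBM's implementation, which works on histograms of gradient statistics; it is a sketch that assumes a single feature whose targets are already sorted by feature value, uses the reduction in squared error as the split gain, and stops at a hypothetical `num_leaves` limit.

```python
def best_split(ys):
    """Return (gain, index) of the best split of a target list, or (0.0, None)."""
    def sse(seg):
        if not seg:
            return 0.0
        m = sum(seg) / len(seg)
        return sum((y - m) ** 2 for y in seg)
    base = sse(ys)
    best = (0.0, None)
    for i in range(1, len(ys)):
        gain = base - sse(ys[:i]) - sse(ys[i:])
        if gain > best[0]:
            best = (gain, i)
    return best

def grow_leaf_wise(ys, num_leaves):
    """Repeatedly split the current leaf whose best split has the largest gain."""
    leaves = [ys]
    while len(leaves) < num_leaves:
        candidates = [(best_split(leaf), k) for k, leaf in enumerate(leaves)]
        (gain, idx), k = max(candidates, key=lambda c: c[0][0])
        if idx is None:  # no leaf can be improved any further
            break
        leaf = leaves.pop(k)
        leaves[k:k] = [leaf[:idx], leaf[idx:]]
    return leaves

leaves = grow_leaf_wise([0.0] * 4 + [10.0, 10.0, 11.0, 11.0], num_leaves=3)
```

Unlike level-wise growth, only the most promising leaf is split at each step, so the leaf count (and usually a depth limit as well) is what bounds the tree's complexity.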
Preferably, the cleaned feature data is divided into a training set and a validation set at a ratio of 8:2, and, with a relatively high learning rate (higher than 0.1), the LightGBM native interface lgb.train() is used to obtain the optimal number of iterations best_n_estimators while keeping the run fast.
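The two ideas in this step, an 8:2 split and picking the iteration count from validation performance, can be sketched without any library. In practice the per-round validation scores would be produced by the training library; here they are a hypothetical list, and the `patience` parameter and function names are assumptions.

```python
import random

def train_valid_split(rows, train_share=0.8, seed=0):
    """Shuffle the rows and cut them into training and validation parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_share)
    return rows[:cut], rows[cut:]

def best_iteration(valid_scores, patience=2):
    """Return the 1-based round with the best validation score, stopping once
    the score has not improved for `patience` consecutive rounds."""
    best_round, best_score = 0, float("-inf")
    for rnd, score in enumerate(valid_scores, start=1):
        if score > best_score:
            best_round, best_score = rnd, score
        elif rnd - best_round >= patience:
            break
    return best_round

train_rows, valid_rows = train_valid_split(range(10))
# Hypothetical validation AUC after each boosting round.
n_rounds = best_iteration([0.70, 0.75, 0.78, 0.77, 0.76, 0.90])
```

A higher learning rate reaches its best round sooner, which is why it keeps this first estimate of the iteration count cheap.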
S105, after tuning the parameters by grid search (GridSearch), substituting the optimal parameters into the LightGBM model, and predicting the member label based on the LightGBM model.
GridSearch is a parameter tuning method that exhaustively searches the candidate parameters and selects the best-performing ones as the result. Specifically, AUC (Area Under the Curve) is used as the model evaluation index, k-fold cross-validation is used to compute the optimal parameters of the LightGBM model, and the learning rate of the LightGBM model is then reduced to obtain the optimal number of iterations.
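The evaluation machinery named above can be sketched in plain Python. AUC is computed from its pairwise definition (the share of positive/negative pairs in which the positive is scored higher, ties counting one half), and the folds are contiguous index blocks; the function names are hypothetical and binary 0/1 labels are assumed.

```python
def auc(labels, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k):
    """Yield (train_idx, valid_idx) pairs for k contiguous folds over n rows."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, valid
        start += size

score = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
folds = list(k_fold_indices(5, 2))
```

A candidate parameter set is then scored by training on each `train` part, computing AUC on the matching `valid` part, and averaging over the k folds.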
Illustratively, with AUC as the evaluation index, grid search with 5-fold cross-validation is used to optimize, in order: (1) the maximum depth max_depth and the number of leaf nodes num_leaves; (2) the minimum number of samples per leaf node min_child_samples and the minimum sample weight per leaf node min_child_weight; (3) the sample sampling ratio bagging_fraction and the feature sampling ratio feature_fraction; (4) the L1 regularization parameter reg_alpha and the L2 regularization parameter reg_lambda. The learning rate is then reduced to obtain the optimal number of iterations best_n_estimators.
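The four-stage order can be sketched as a generic staged grid search. Only the parameter names and their grouping come from the text; the candidate values are hypothetical, and `score_fn` stands in for "train with these parameters and return the mean 5-fold AUC", which is not implemented here.

```python
from itertools import product

# Candidate values are hypothetical; only the names and stage order follow the text.
STAGES = [
    {"max_depth": [4, 6, 8], "num_leaves": [15, 31, 63]},
    {"min_child_samples": [10, 20, 40], "min_child_weight": [0.001, 0.01]},
    {"bagging_fraction": [0.7, 0.9], "feature_fraction": [0.7, 0.9]},
    {"reg_alpha": [0.0, 0.1], "reg_lambda": [0.0, 0.1]},
]

def staged_grid_search(stages, score_fn):
    """Tune one group of parameters at a time, freezing earlier winners."""
    best_params = {}
    for grid in stages:
        names = list(grid)
        best_score, best_combo = float("-inf"), None
        for values in product(*(grid[n] for n in names)):
            candidate = {**best_params, **dict(zip(names, values))}
            score = score_fn(candidate)
            if score > best_score:
                best_score, best_combo = score, candidate
        best_params = best_combo
    return best_params
```

Searching the groups one after another evaluates 9 + 6 + 4 + 4 = 23 candidates for these grids instead of the 864 a single joint grid would need, at the cost of ignoring interactions between groups.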
The tuned parameters are substituted into the LightGBM model, the model is saved, and member labels are predicted with the LightGBM model.
With the method provided by this embodiment, member temperature is depicted at a finer and more accurate granularity. Based on member features, time-series features, sales statistics and the like, the user's activity over a future period (such as one month) is estimated in real time, and the activity indicator adjusts dynamically with the user's real-time consumption behavior, achieving dynamic monitoring of member behavior. For members with a higher temperature indicator, marketing strategies can further maximize member value.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
FIG. 2 is a schematic structural diagram of a temperature member label prediction apparatus according to an embodiment of the present invention, the apparatus including:
the obtaining module 210 is used for obtaining member tag data and member historical consumption data from a big data platform data source;
the extracting module 220 is configured to extract the member historical consumption characteristics and the corresponding predicted tag data according to the member tag data and the member historical consumption data;
specifically, the extracting module 220 includes:
the dividing unit is used for dividing the member historical consumption data into first subsection historical consumption data and second subsection historical consumption data;
and the counting unit is used for carrying out statistical analysis on the first subsection historical consumption data through the Spark cluster, acquiring the historical consumption behavior characteristics of the member and generating the multi-dimensional user tag data.
Optionally, the extracting module 220 further includes:
and the aggregation unit is used for performing characteristic extraction on the medicine purchasing habits of the users through word2vec and aggregating the medicine purchasing commodities according to the time dimension of member medicine purchasing.
Further, the dividing unit further includes:
the calculating unit is used for calculating the customer unit price, total consumption amount, number of purchases and gross profit corresponding to the second segment of the user's historical consumption data;
and the calculating unit is used for taking the customer unit price, total consumption amount, number of purchases and gross profit as feature factors and weighting the feature factors according to time decay factors to calculate the label value, wherein the label value is the member label prediction result and is used to measure member value.
The clearing module 230 is used for clearing the abnormal feature data based on the 3 sigma rule and selecting features through an embedded method;
the training module 240 is used for constructing a LightGBM algorithm, selecting a predetermined proportion of the cleaned feature data as training data to train the LightGBM model, and determining model parameters;
and a parameter adjusting module 250, configured to substitute the optimal parameters into the LightGBM model after tuning the parameters by grid search (GridSearch), so as to perform member label prediction based on the LightGBM model.
Optionally, tuning the parameters by grid search (GridSearch) includes:
calculating the optimal parameters of the LightGBM model by using AUC as the model evaluation index and adopting k-fold cross-validation; and reducing the learning rate of the LightGBM model and obtaining the optimal number of iterations.
In one embodiment of the present invention, an electronic device for temperature member label prediction is provided, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing steps S101 to S105 of the embodiments of the present invention when executing the computer program.
An embodiment of the present invention also provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, performs the temperature member label prediction method provided in the above embodiment. The non-transitory computer-readable storage medium includes ROM/RAM, magnetic disk, optical disk, and the like.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for predicting a temperature member label, comprising:
acquiring member tag data and member historical consumption data from a big data platform data source;
extracting historical consumption characteristics of the members and corresponding predicted label data according to the member label data and the historical consumption data of the members;
removing abnormal feature data based on a 3 sigma rule, and selecting features by an embedded method;
constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
after tuning the parameters by grid search (GridSearch), substituting the optimal parameters into the LightGBM model, and performing member label prediction based on the LightGBM model.
2. The method of claim 1, wherein extracting the member historical consumption characteristics and the corresponding predictive tag data according to the member tag data and the member historical consumption data comprises:
dividing member historical consumption data into first subsection historical consumption data and second subsection historical consumption data;
and performing statistical analysis on the first segmented historical consumption data through the Spark cluster to obtain the historical consumption behavior characteristics of the members and generate multi-dimensional user tag data.
3. The method of claim 1, wherein extracting the member historical consumption characteristics and the corresponding predictive tag data according to the member tag data and the member historical consumption data further comprises:
and performing characteristic extraction on the medicine purchasing habits of the users through word2vec, and aggregating medicine purchasing commodities according to the time dimension of member medicine purchasing.
4. The method of claim 2, wherein dividing the member historical consumption data into first segmented historical consumption data and second segmented historical consumption data further comprises:
calculating the customer unit price, total consumption amount, number of purchases and gross profit corresponding to the second segmented historical consumption data;
and taking the customer unit price, the total consumption amount, the number of purchases and the gross profit as characteristic factors, and weighting the characteristic factors by time-decay factors to obtain a label value, wherein the label value is the member label prediction result and is used for measuring member value.
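The claim above does not give the weighting formula, so the following is only a minimal sketch of one way a time-decayed label value could be computed: each period's four factors are combined with fixed factor weights, and older periods are discounted exponentially. The exponential decay form, the equal factor weights, the field names and the sample numbers are all assumptions made for illustration.

```python
def label_value(periods, decay=0.9, factor_weights=None):
    """Time-decayed weighted sum of the four consumption factors.

    `periods` is ordered from most recent (age 0) to oldest.
    """
    if factor_weights is None:
        # Assumed equal weights; the source does not state them.
        factor_weights = {"unit_price": 0.25, "total_amount": 0.25,
                          "purchase_count": 0.25, "gross_profit": 0.25}
    score = 0.0
    for age, factors in enumerate(periods):
        time_weight = decay ** age  # assumed exponential time decay
        period_score = sum(w * factors[name] for name, w in factor_weights.items())
        score += time_weight * period_score
    return score

# Hypothetical member with two periods of history, most recent first.
history = [
    {"unit_price": 50.0, "total_amount": 200.0, "purchase_count": 4, "gross_profit": 60.0},
    {"unit_price": 40.0, "total_amount": 80.0, "purchase_count": 2, "gross_profit": 20.0},
]
value = label_value(history, decay=0.5)
```

In practice the factors would probably be normalised to comparable scales before weighting, since raw amounts would otherwise dominate the purchase count.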
5. The method of claim 1, wherein tuning the parameters by grid search comprises:
calculating the optimal parameters of the LightGBM model by using AUC as the model evaluation index and adopting k-fold cross-validation;
and reducing the learning rate of the LightGBM model and obtaining the optimal number of iterations.
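The grid search with k-fold cross-validation and AUC scoring described in the claim above can be sketched with the standard library alone. To keep the example self-contained, a tiny k-nearest-neighbour scorer stands in for the LightGBM model, so this is not the claimed model; the data, the grid and all function names are hypothetical, and the learning-rate step of the claim is not covered.

```python
import random

def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def knn_scores(train_x, train_y, test_x, k):
    """Stand-in model: score = share of positives among the k nearest training points."""
    scores = []
    for x in test_x:
        nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
        scores.append(sum(train_y[i] for i in nearest) / k)
    return scores

def cross_val_auc(xs, ys, k_neighbors, n_folds=5):
    """Mean validation AUC over n_folds folds (fold f holds samples with index % n_folds == f)."""
    fold_aucs = []
    for fold in range(n_folds):
        val = [i for i in range(len(xs)) if i % n_folds == fold]
        train = [i for i in range(len(xs)) if i % n_folds != fold]
        scores = knn_scores([xs[i] for i in train], [ys[i] for i in train],
                            [xs[i] for i in val], k_neighbors)
        fold_aucs.append(auc([ys[i] for i in val], scores))
    return sum(fold_aucs) / n_folds

# Hypothetical 1-D data: positives are centred at 1.0, negatives at 0.0.
rng = random.Random(0)
ys = [i % 2 for i in range(100)]
xs = [rng.gauss(float(y), 0.7) for y in ys]

# Grid search: evaluate every candidate and keep the one with the best mean AUC.
grid = [1, 5, 15]
results = {k: cross_val_auc(xs, ys, k) for k in grid}
best_k = max(results, key=results.get)
```

With a real gradient-boosting model the grid would range over parameters such as tree depth or number of leaves, and the same loop structure would apply.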
6. An apparatus for temperature member tag prediction, comprising:
the acquisition module is used for acquiring member label data and member historical consumption data from a big data platform data source;
the extraction module is used for extracting the historical consumption characteristics of the members and the corresponding predicted label data according to the member label data and the member historical consumption data;
the clearing module is used for clearing abnormal feature data based on the 3σ rule and selecting features through an embedded method;
the training module is used for constructing a LightGBM algorithm, selecting a preset proportion of the cleaned feature data as training data to train a LightGBM model, and determining model parameters;
and the parameter tuning module is used for substituting the optimal parameters into the LightGBM model after the parameters are tuned by grid search (GridSearch), so as to predict the member label based on the LightGBM model.
7. The apparatus of claim 6, wherein the extraction module comprises:
the dividing unit is used for dividing the member historical consumption data into first segmented historical consumption data and second segmented historical consumption data;
and the counting unit is used for performing statistical analysis on the first segmented historical consumption data through a Spark cluster, acquiring the historical consumption behavior characteristics of the members and generating multi-dimensional user label data.
8. The apparatus of claim 7, wherein the dividing unit further comprises:
the calculating unit is used for calculating the customer unit price, the total consumption amount, the number of purchases and the gross profit corresponding to the second segmented historical consumption data;
and the calculating unit is used for taking the customer unit price, the total consumption amount, the number of purchases and the gross profit as characteristic factors and weighting the characteristic factors by time-decay factors to calculate the label value, wherein the label value is the member label prediction result and is used for measuring member value.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the method for temperature membership tag prediction according to any of claims 1 to 5.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for temperature membership tag prediction according to any one of claims 1 to 5.
CN202010569862.1A 2020-06-20 2020-06-20 Temperature membership tag prediction method and device, electronic equipment and storage medium Active CN111913940B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010569862.1A CN111913940B (en) 2020-06-20 2020-06-20 Temperature membership tag prediction method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010569862.1A CN111913940B (en) 2020-06-20 2020-06-20 Temperature membership tag prediction method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111913940A (en) 2020-11-10
CN111913940B (en) 2024-04-26

Family

ID=73226101

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010569862.1A Active CN111913940B (en) 2020-06-20 2020-06-20 Temperature membership tag prediction method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111913940B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107437199A (en) * 2017-06-16 2017-12-05 北京小度信息科技有限公司 Platform earnings forecast method and device
CN107665448A (en) * 2017-09-29 2018-02-06 北京京东尚科信息技术有限公司 For determining the method, apparatus and storage medium of consumption contributed value
CN108109063A (en) * 2017-12-07 2018-06-01 上海点融信息科技有限责任公司 For the method, apparatus and computer readable storage medium of prediction label predicted value
CN109522372A (en) * 2018-11-21 2019-03-26 北京交通大学 The prediction technique of civil aviaton field passenger value
CN109583949A (en) * 2018-11-22 2019-04-05 中国联合网络通信集团有限公司 A kind of user changes planes prediction technique and system
CN109741114A (en) * 2019-01-10 2019-05-10 博拉网络股份有限公司 A kind of user under big data financial scenario buys prediction technique
US20190228397A1 (en) * 2018-01-25 2019-07-25 The Bartley J. Madden Foundation Dynamic economizer methods and systems for improving profitability, savings, and liquidity via model training
CN110223166A (en) * 2019-06-14 2019-09-10 哈尔滨哈银消费金融有限责任公司 The prediction technique and equipment of consumer finance user's overdue loan based on big data
CN111144935A (en) * 2019-12-17 2020-05-12 武汉海云健康科技股份有限公司 Big data-based sleep member awakening method and system, server and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李征仁;张晓航;王光辉;石东海;石文华;: "基于Pareto/NBD的用户价值预测模型研究", 北京邮电大学学报(社会科学版), no. 03, pages 7 - 14 *

Also Published As

Publication number Publication date
CN111913940B (en) 2024-04-26

Similar Documents

Publication Publication Date Title
US11663493B2 (en) Method and system of dynamic model selection for time series forecasting
CN108564286B (en) Artificial intelligent financial wind-control credit assessment method and system based on big data credit investigation
US10387900B2 (en) Methods and apparatus for self-adaptive time series forecasting engine
CN106485562B (en) Commodity information recommendation method and system based on user historical behaviors
US20210103858A1 (en) Method and system for model auto-selection using an ensemble of machine learning models
CN108665311B (en) Electric commercial user time-varying feature similarity calculation recommendation method based on deep neural network
CN113469730A (en) Customer repurchase prediction method and device based on RF-LightGBM fusion model under non-contract scene
CN110930179A (en) Task evaluation method, system, device and computer readable storage medium
CN115063035A (en) Customer evaluation method, system, equipment and storage medium based on neural network
Denk et al. Avoid filling Swiss cheese with whipped cream: imputation techniques and evaluation procedures for cross-country time series
CN110544052A (en) method and device for displaying relationship network diagram
CN112434862B (en) Method and device for predicting financial dilemma of marketing enterprises
CN113554350A (en) Activity evaluation method and apparatus, electronic device and computer readable storage medium
US11960499B2 (en) Sales data processing apparatus, method, and medium storing program for sales prediction
CN111913940B (en) Temperature membership tag prediction method and device, electronic equipment and storage medium
Bernat et al. Modelling customer lifetime value in a continuous, non-contractual time setting
Shanti et al. Machine Learning-Powered Mobile App for Predicting Used Car Prices
US20230230143A1 (en) Product recommendation system, product recommendation method, and recordingmedium storing product recommendation program
RU2480828C1 (en) Method of predicting target value of events based on unlimited number of characteristics
CN114612132A (en) Client renewal prediction method based on machine learning and related equipment
CN114817741A (en) Financial product accurate recommendation method and device
CN114282951A (en) Network retail prediction method, equipment and medium
KR102499687B1 (en) E-commerce product sales store management system based on bigdata and mehotd of automatic analysis of the seller's product page thereof
Abbassy Using Machine Learning Technique for Analytical Customer Loyalty
CN113763032B (en) Commodity purchase intention recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant