CN109508807A - Lottery user liveness prediction method, system, terminal device, and storage medium - Google Patents

Lottery user liveness prediction method, system, terminal device, and storage medium

Info

Publication number
CN109508807A
Authority
CN
China
Prior art keywords
user
liveness
data
lottery
predicted
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810840895.8A
Other languages
Chinese (zh)
Inventor
韩旭
宋骁程
肖文晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cloud Number Information Technology (shenzhen) Co Ltd
Original Assignee
Cloud Number Information Technology (shenzhen) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cloud Number Information Technology (shenzhen) Co Ltd filed Critical Cloud Number Information Technology (shenzhen) Co Ltd
Priority to CN201810840895.8A priority Critical patent/CN109508807A/en
Publication of CN109508807A publication Critical patent/CN109508807A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Development Economics (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Time Recorders, Drive Recorders, Access Control (AREA)

Abstract

The invention discloses a lottery user liveness prediction method, comprising: obtaining original user data, extracting and converting the original user data, and loading it into a database classified in a specified format; preprocessing the original user data stored in the database to obtain multi-dimensional user data; obtaining a prediction feature set related to user liveness from the multi-dimensional user data; and inputting the prediction feature set into a pre-trained liveness prediction model based on the GBDT (gradient boosting decision tree) algorithm to predict user liveness. Correspondingly, the invention also discloses a lottery user liveness prediction system, a terminal device, and a computer-readable storage medium. The technical solution of the invention can reduce the difficulty of predicting lottery user liveness and improve prediction accuracy.

Description

Lottery user liveness prediction method, system, terminal device, and storage medium
Technical field
The present invention relates to the field of data mining, and in particular to a lottery user liveness prediction method, system, terminal device, and computer-readable storage medium.
Background
The turnover of a lottery operator is directly tied to the number of users, user liveness, and wager amounts. The lottery industry has a large user base, but user liveness is uneven, and high-liveness users are the main contributors to the operator's turnover. Identifying users with high potential, and using customer relationship management to encourage them to take an active part in the lottery and become high-liveness users, is gradually gaining attention.
The prior art provides some technical solutions for user liveness prediction, but existing prediction systems are mainly used in Internet scenarios such as mobile games, browser games, and web services. In those scenarios, liveness is defined mainly by indicators such as login count and cumulative top-up amount. For example, a user who logs in 7 times in a row within a preset period (such as 10 days) and whose cumulative top-up amount exceeds 10,000 is defined as a highly active user. Such a definition is clearly unsuitable for the lottery industry, where the definition of user liveness should focus on the user's participation rate.
Existing models for predicting user liveness are mainly rule models based on experience and statistics, for example, predicting that a user who has logged in for more than 5 consecutive days is a high-liveness user. Faced with the massive, varied, and complex data of the lottery industry, such models struggle to extract accurate rules for predicting user liveness. Regression analysis is also widely used for user liveness prediction, but it places high demands on the quality of the training data: collinearity among the independent variables must be eliminated, and outliers and missing values must be handled properly. Lottery user data comes from broad and complex sources and often contains outliers and missing values, so simple regression analysis cannot produce accurate predictions. Neural networks are another common prediction model. A neural network is a set of connected input/output units in which each connection carries a weight; the network's classification knowledge is embodied in its connections and stored implicitly in the connection weights. Learning is the process of iteratively adjusting the weights so that input tuples are labeled correctly. Compared with other common data mining techniques, neural networks predict user liveness well, but their drawbacks cannot be ignored: the network is a black box that is hard to interpret, and it demands substantial computing power, which makes prediction more difficult.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide a lottery user liveness prediction method, system, terminal device, and computer-readable storage medium that can reduce the difficulty of predicting lottery user liveness and improve prediction accuracy.
To solve the above technical problem, an embodiment of the present invention provides a lottery user liveness prediction method, comprising:
obtaining original user data, extracting and converting the original user data, and loading it into a database classified in a specified format;
preprocessing the original user data stored in the database to obtain multi-dimensional user data, wherein the preprocessing includes at least consistency processing, deduplication, data transformation, and data reduction, and the multi-dimensional user data includes at least the user's personal information, historical betting information, and historical earnings information;
obtaining a prediction feature set related to user liveness from the multi-dimensional user data; and
inputting the prediction feature set into a pre-trained liveness prediction model based on the GBDT algorithm to predict user liveness.
Further, obtaining the prediction feature set related to user liveness from the multi-dimensional user data specifically includes:
constructing a potential feature set related to user liveness from the multi-dimensional user data through statistical analysis of the data; and
adjusting, screening, and combining the potential feature set through iterative testing to obtain the prediction feature set.
Further, the method trains the liveness prediction model based on the GBDT algorithm through the following steps:
dividing a previously obtained training feature set related to user liveness into a training set and a validation set;
modeling on the training set with the GBDT algorithm to obtain at least two initial prediction models;
obtaining the accuracy of each initial prediction model on the validation set; and
selecting the initial prediction model with the highest accuracy as the liveness prediction model based on the GBDT algorithm.
Further, modeling on the training set with the GBDT algorithm to obtain at least two initial prediction models specifically includes:
choosing at least two different groups of modeling parameters from the training set by grid search; and
modeling with the GBDT algorithm under each of the at least two groups of modeling parameters to obtain at least two corresponding initial prediction models.
Further, inputting the prediction feature set into the pre-trained liveness prediction model based on the GBDT algorithm to predict user liveness specifically includes:
inputting the prediction feature set into the liveness prediction model based on the GBDT algorithm to obtain a user liveness score for a preset time period.
Further, the method also includes:
outputting the user liveness prediction result as text and displaying it as a web page.
Further, the method also includes:
obtaining actual user liveness;
verifying the accuracy of the liveness prediction model based on the GBDT algorithm against the actual user liveness; and
correcting the liveness prediction model based on the GBDT algorithm according to the verification result.
To solve the above technical problem, an embodiment of the present invention also provides a lottery user liveness prediction system, comprising:
a data loading module, configured to obtain original user data, extract and convert the original user data, and load it into a database classified in a specified format;
a data processing module, configured to preprocess the original user data stored in the database to obtain multi-dimensional user data, wherein the preprocessing includes at least consistency processing, deduplication, data transformation, and data reduction, and the multi-dimensional user data includes at least the user's personal information, historical betting information, and historical earnings information;
a feature extraction module, configured to obtain a prediction feature set related to user liveness from the multi-dimensional user data; and
a liveness prediction module, configured to input the prediction feature set into a pre-trained liveness prediction model based on the GBDT algorithm to predict user liveness.
To solve the above technical problem, an embodiment of the present invention also provides a terminal device, comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor implements the lottery user liveness prediction method of any of the above embodiments when executing the computer program.
To solve the above technical problem, an embodiment of the present invention also provides a computer-readable storage medium that includes a stored computer program, wherein the computer program, when run, controls the device on which the computer-readable storage medium resides to execute the lottery user liveness prediction method of any of the above embodiments.
Compared with the prior art, the embodiments of the present invention provide a lottery user liveness prediction method, system, terminal device, and computer-readable storage medium that obtain original user data, extract and convert it, and load it into a database classified in a specified format; preprocess the original user data stored in the database to obtain multi-dimensional user data; obtain a prediction feature set related to user liveness from the multi-dimensional user data; and input the prediction feature set into a pre-trained liveness prediction model based on the GBDT algorithm to predict user liveness. This reduces the difficulty of predicting lottery user liveness and improves prediction accuracy.
Brief description of the drawings
Fig. 1 is a flowchart of a preferred embodiment of the lottery user liveness prediction method provided by the present invention;
Fig. 2 is a detailed flowchart of a preferred embodiment of step S13 of the lottery user liveness prediction method provided by the present invention;
Fig. 3 is a detailed flowchart of a preferred embodiment of the model training of the lottery user liveness prediction method provided by the present invention;
Fig. 4 is a detailed flowchart of a preferred embodiment of step S22 of the lottery user liveness prediction method provided by the present invention;
Fig. 5 is a structural block diagram of a preferred embodiment of the lottery user liveness prediction system provided by the present invention;
Fig. 6 is a structural block diagram of a preferred embodiment of the terminal device provided by the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
It should be specifically noted that "lottery" in the embodiments of the present invention refers exclusively to a legalized form of economic activity operated by the government.
Referring to Fig. 1, which is a flowchart of a preferred embodiment of the lottery user liveness prediction method provided by the present invention, the method includes steps S11 to S14:
Step S11: obtain original user data, extract and convert the original user data, and load it into a database classified in a specified format.
Specifically, once the original user data of lottery users is ready, step S11 identifies and fetches the original user data that has not yet been loaded, extracts the necessary data from it, converts the data format, and stores the result in the database classified in the specified format, thereby completing the database load.
Through data schema design, the original user data is organized into multiple concise, efficient, and comprehensive data tables. The key tables and key fields of the original user data are introduced below as examples.
For the key tables, the schema design divides the original user data into two classes of tables for conversion and loading: history tables, to which records are continually appended, and information reference tables, which are replaced in full. The key fields cover the fields used in the key tables. Table 1 lists the key tables and their meanings, and Table 2 lists the key fields of the account information change record (account_info_change_log) and their meanings.
Table 1: Key tables and their meanings
Table                      Meaning
account_info_change_log    Account information change record
fund_transfer_log          Account deposit and withdrawal record
balance_log                Account balance history
betting_log                Account betting history
customer_info              User personal information table
event_schedule             Event schedule table
bet_type                   Bet type table
account_info               Latest account information table
Table 2: Account information change record and its meaning
It should be noted that the original user data containing the necessary information is sorted by "data category + time range" and placed under a specified path in HDFS (Hadoop Distributed File System). Newly added original user data is appended to that path periodically so that the prediction model can be updated.
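As an illustration only, the "data category + time range" layout could be sketched as follows. The root directory, the use of a table name as the category, and the date format are all assumptions; the source says only that the data is placed under a specified HDFS path.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical root directory; the source does not name the actual path.
HDFS_ROOT = PurePosixPath("/data/lottery/raw")


def raw_data_path(category: str, start: date, end: date) -> str:
    """Build a directory path keyed by data category and time range."""
    time_range = f"{start:%Y%m%d}-{end:%Y%m%d}"
    return str(HDFS_ROOT / category / time_range)


# Example: one month of betting records.
example_path = raw_data_path("betting_log", date(2018, 7, 1), date(2018, 7, 31))
```

Keeping the time range in the path makes it straightforward to find the batches that have not yet been loaded when new data arrives.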
Step S12: preprocess the original user data stored in the database to obtain multi-dimensional user data, wherein the preprocessing includes at least consistency processing, deduplication, data transformation, and data reduction, and the multi-dimensional user data includes at least the user's personal information, historical betting information, and historical earnings information.
It should be noted that original user data typically suffers from problems such as inconsistency, duplication, and high dimensionality, so after loading it must be preprocessed to resolve them.
Specifically, step S12 applies consistency processing, deduplication, data transformation, data reduction, and similar preprocessing to the original user data stored in the database to obtain multi-dimensional user data, which includes but is not limited to desensitized user personal information, user historical betting information, and user historical earnings information.
Inconsistent data can be resolved according to the following criteria: 1) prefer the most recent data (for example, membership information); 2) prefer the more reliable data (for example, when the same information appears in several tables, the value in the more reliable table prevails); 3) values that can be computed are computed directly (for example, age can be calculated from the date of birth). A user with seriously inconsistent data (which is rare) can be treated as an abnormal user and ignored. Duplicate data can be filtered out by deduplication. High-dimensional data can be handled with data transformation and data reduction, such as aggregating and standardizing the original user data.
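A minimal sketch of criteria 1) and 3) and of deduplication is given below. The record layout (tuples of value and update date) and all function names are hypothetical, and criterion 2) is omitted because the source does not say how reliability is ranked.

```python
from datetime import date


def latest_value(records):
    """Criterion 1: among conflicting (value, updated_on) pairs, keep the newest."""
    return max(records, key=lambda r: r[1])[0]


def age_from_birth_date(birth: date, today: date) -> int:
    """Criterion 3: derive age from the date of birth instead of a stored age."""
    before_birthday = (today.month, today.day) < (birth.month, birth.day)
    return today.year - birth.year - before_birthday


def deduplicate(rows):
    """Drop exact duplicate rows (hashable tuples), keeping first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique
```

For example, `deduplicate([(1, "a"), (1, "a"), (2, "b")])` keeps one copy of each distinct row.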
Step S13: obtain a prediction feature set related to user liveness from the multi-dimensional user data.
It should be noted that feature engineering is the process of turning raw data into features. The features should describe the data well, and a model built on them should reflect the modeling target well. In this embodiment the feature set therefore needs to reflect lottery user liveness. Taking the characteristics of the lottery industry into account, lottery user liveness is defined as follows:
Within a given time period T, for each user participating in the lottery, lottery user liveness is the ratio of the number of betting periods in which the user placed a bet to the total number of betting periods in T. A user whose liveness is greater than 60% is considered a high-liveness user.
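The definition above can be written down directly. The sketch below is illustrative; the function and constant names are assumptions, while the 60% threshold comes from the definition.

```python
HIGH_LIVENESS_THRESHOLD = 0.6  # "greater than 60%" in the definition above


def liveness(periods_bet_in: int, total_periods: int) -> float:
    """Ratio of betting periods the user took part in to all periods in T."""
    if total_periods <= 0:
        raise ValueError("total_periods must be positive")
    return periods_bet_in / total_periods


def is_high_liveness(periods_bet_in: int, total_periods: int) -> bool:
    """A user is high-liveness only when the ratio strictly exceeds 60%."""
    return liveness(periods_bet_in, total_periods) > HIGH_LIVENESS_THRESHOLD
```

A user who bet in 45 of 60 periods has a liveness of 0.75 and counts as high-liveness, while a user at exactly 36 of 60 does not, because the threshold is strict.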
Specifically, the multi-dimensional user data obtained by preprocessing the original user data includes but is not limited to desensitized user personal information, user historical betting information, and user historical earnings information. Step S13 then obtains a prediction feature set related to user liveness from this multi-dimensional user data.
Step S14: input the prediction feature set into the pre-trained liveness prediction model based on the GBDT algorithm to predict user liveness.
Specifically, the liveness prediction model based on the GBDT algorithm is trained in advance. Once the prediction feature set related to user liveness has been obtained, step S14 inputs it into the trained model to predict user liveness.
The lottery user liveness prediction method provided by this embodiment processes the original user data of lottery users to obtain a prediction feature set related to user liveness and combines it with the trained liveness prediction model based on the GBDT algorithm to predict user liveness, which reduces the difficulty of predicting lottery user liveness and improves prediction accuracy. In addition, the GBDT algorithm handles many types of data flexibly, such as low-dimensional data, linear and non-linear data, and continuous and discrete values, and it can use robust loss functions such as the Huber loss and the quantile loss, which makes it highly robust to outliers and further improves the precision of user liveness prediction.
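The source does not disclose how its GBDT model is implemented. To show the general idea of gradient boosting, the sketch below fits depth-1 regression trees (stumps) to the residuals of a squared loss. It is a teaching sketch under those assumptions, it does not use the Huber or quantile loss mentioned above, and all names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Find the one-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(xs[0])):
        values = sorted({x[j] for x in xs})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for x, r in zip(xs, residuals) if x[j] <= t]
            right = [r for x, r in zip(xs, residuals) if x[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:  # every row is identical, so no split is possible
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, x):
    j, t, left_value, right_value = stump
    return left_value if x[j] <= t else right_value


def fit_gbdt(xs, ys, n_trees=20, learning_rate=0.3):
    """Boosting loop: each stump fits the current residuals (negative gradient)."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)
```

In practice an established library implementation would normally be used instead; the point here is only that each new tree corrects the errors left by the trees before it.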
Referring to Fig. 2, which is a detailed flowchart of a preferred embodiment of step S13 of the lottery user liveness prediction method provided by the present invention, obtaining the prediction feature set related to user liveness from the multi-dimensional user data specifically includes steps S1301 to S1302:
Step S1301: construct a potential feature set related to user liveness from the multi-dimensional user data through statistical analysis of the data;
Step S1302: adjust, screen, and combine the potential feature set through iterative testing to obtain the prediction feature set.
In this embodiment, based on earlier study of domain knowledge and various statistical analyses of lottery user data, many potential features related to user liveness (such as a lottery user's return rate in the most recent period) are constructed from the multi-dimensional user data. Through repeated iterative testing, the parameters in the potential feature set are adjusted, screened, and combined until the best effect is reached, yielding the final prediction feature set.
In a specific implementation, a feature processor (Feature Processor) can be run to construct the selected features one by one on top of all the multi-dimensional user data. The Feature Processor can be understood as a driver responsible for generating features: each feature generation code file executed under the feature-maker directory produces one category of features, which makes the feature set extensible. When the driver finishes running, the feature set formed after the new data has been added is available. Tables 3 to 5 list the user personal information feature set, the user historical transaction feature set, and the user transaction pattern feature set, respectively.
Table 3: User personal information feature set
Feature          Description
GENDER           Lottery user's gender
AGE              Lottery user's age
MAJOR_CHANNEL    Lottery user's main betting channel
BET_YEAR         Time the lottery user has participated in betting (in years)
CHARGE_AMOUNT    Lottery user's total top-up amount (last 60 betting days)
CHARGE_TIMES     Lottery user's number of top-ups (last 60 betting days)
Table 4: User historical transaction feature set
Feature                      Description
INV_LAST_PERIOD              Lottery user's total bet amount in the previous betting period
DIV_LAST_PERIOD              Lottery user's total prize amount in the previous betting period
RECOVERY_RATE_LAST_PERIOD    Lottery user's return rate in the previous betting period
INV_LONGTERM                 Lottery user's long-term total bet amount (last 60 betting days)
DIV_LONGTERM                 Lottery user's long-term total prize amount (last 60 betting days)
ACTIVE_RATE_LONGTERM         Lottery user's long-term betting participation rate (last 60 betting days)
INV_RECENT                   Lottery user's short-term total bet amount (last 10 betting days)
DIV_RECENT                   Lottery user's short-term total prize amount (last 10 betting days)
ACTIVE_RATE_RECENT           Lottery user's short-term betting participation rate (last 10 betting days)
Table 5: User transaction pattern feature set
Feature    Description
INV1       Lottery user's total bet amount in the most recent period (0 if no bet was placed)
INV2       Lottery user's total bet amount in the second most recent period (0 if no bet was placed)
INV3       Lottery user's total bet amount in the third most recent period (0 if no bet was placed)
···        ···
INV60      Lottery user's total bet amount in the 60th most recent period (0 if no bet was placed)
DIV1       Lottery user's total prize amount in the most recent period (0 if nothing was won)
DIV2       Lottery user's total prize amount in the second most recent period (0 if nothing was won)
DIV3       Lottery user's total prize amount in the third most recent period (0 if nothing was won)
···        ···
DIV60      Lottery user's total prize amount in the 60th most recent period (0 if nothing was won)
It should be noted that Tables 3 to 5 above list only some of the feature sets extracted with the lottery user liveness prediction method provided by the invention. Other feature sets can also be generated as actually needed and are not enumerated here.
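As a rough illustration of how some of the features in Tables 4 and 5 might be derived, the sketch below builds them from per-period totals. The input layout (lists ordered from the most recent period backwards) and the function name are assumptions, and one list entry is treated as one betting day, which the source does not state explicitly.

```python
def build_features(bets, prizes, recent=10, longterm=60):
    """Derive Table 4 and Table 5 style features for one user.

    bets and prizes hold per-period totals, most recent period first;
    0 means no bet was placed or nothing was won in that period.
    """
    feats = {}
    for i in range(longterm):
        feats[f"INV{i + 1}"] = bets[i] if i < len(bets) else 0
        feats[f"DIV{i + 1}"] = prizes[i] if i < len(prizes) else 0
    feats["INV_RECENT"] = sum(bets[:recent])
    feats["DIV_RECENT"] = sum(prizes[:recent])
    feats["INV_LONGTERM"] = sum(bets[:longterm])
    feats["DIV_LONGTERM"] = sum(prizes[:longterm])
    # Participation rate: share of periods in the window with a non-zero bet.
    feats["ACTIVE_RATE_RECENT"] = sum(1 for b in bets[:recent] if b > 0) / recent
    feats["ACTIVE_RATE_LONGTERM"] = sum(1 for b in bets[:longterm] if b > 0) / longterm
    return feats
```

Each additional feature family could be added as a separate function of the same shape, which mirrors the one-file-per-feature-category layout described for the Feature Processor.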
Referring to Fig. 3, which is a detailed flowchart of a preferred embodiment of the model training of the lottery user liveness prediction method provided by the present invention, the method trains the liveness prediction model based on the GBDT algorithm through steps S21 to S24:
Step S21: divide a previously obtained training feature set related to user liveness into a training set and a validation set;
Step S22: model on the training set with the GBDT algorithm to obtain at least two initial prediction models;
Step S23: obtain the accuracy of each initial prediction model on the validation set;
Step S24: select the initial prediction model with the highest accuracy as the liveness prediction model based on the GBDT algorithm.
Specifically, during modeling, the full data set of the training feature set produced by preprocessing and feature engineering is first divided. For example, along the time dimension, the first 70% of the data forms the training set (Train-set) and the remaining 30% forms the validation set (Validation-set). At least two initial prediction models are built by modeling on Train-set with the GBDT algorithm. The effect of each initial prediction model is observed on Validation-set to obtain its accuracy, and the initial prediction model with the highest accuracy among all of them is selected as the trained liveness prediction model based on the GBDT algorithm.
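The split and selection in steps S21, S23, and S24 can be sketched as follows. The helper names are hypothetical, and any callable that maps a feature vector to a label can stand in for a trained GBDT model here.

```python
def time_split(rows, train_percent=70):
    """Split rows already sorted oldest-first into (training, validation)."""
    cut = len(rows) * train_percent // 100
    return rows[:cut], rows[cut:]


def accuracy(model, rows):
    """Share of (features, label) rows that the model labels correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def select_best(models, validation_rows):
    """Step S24: keep the candidate with the highest validation accuracy."""
    return max(models, key=lambda m: accuracy(m, validation_rows))
```

Splitting by time rather than at random keeps the validation data strictly later than the training data, which matches how the model is used to predict a future period.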
It should be noted that the training feature set used in this embodiment is obtained in the same way as the prediction feature set in the above embodiment.
Referring to Fig. 4, which is a detailed flowchart of a preferred embodiment of step S22 of the lottery user liveness prediction method provided by the present invention, modeling on the training set with the GBDT algorithm to obtain at least two initial prediction models specifically includes steps S2201 to S2202:
Step S2201: choose at least two different groups of modeling parameters from the training set by grid search;
Step S2202: model with the GBDT algorithm under each of the at least two groups of modeling parameters to obtain at least two corresponding initial prediction models.
Specifically, because modeling involves choosing many modeling parameters, grid search can be used to find an optimal group of parameters over the full training set. Different parameter combinations are tried repeatedly to obtain multiple groups of modeling parameters, and a model is built with the GBDT algorithm for each group, so that multiple initial prediction models are established, one per group of modeling parameters.
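Enumerating the parameter groups for a grid search can be sketched as below. The parameter names and candidate values are assumptions chosen for illustration; the source does not say which GBDT parameters are tuned.

```python
from itertools import product


def expand_grid(param_grid):
    """Return every combination of the candidate values as a list of dicts."""
    names = list(param_grid)
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Hypothetical GBDT parameters; one initial prediction model per combination.
param_grid = {
    "n_trees": [50, 100],
    "learning_rate": [0.05, 0.1],
    "max_depth": [3, 5],
}
candidates = expand_grid(param_grid)
```

Each entry of `candidates` would be used to train one initial prediction model, and the models are then compared on the validation set.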
In another preferred embodiment, inputting the prediction feature set into the pre-trained liveness prediction model based on the GBDT algorithm to predict user liveness specifically includes:
inputting the prediction feature set into the liveness prediction model based on the GBDT algorithm to obtain a user liveness score for a preset time period.
Specifically, after the prediction feature set corresponding to a lottery user's original user data has been extracted, the trained liveness prediction model based on the GBDT algorithm produces the user's liveness score for a preset time period (such as the next 60 days), where the user liveness score indicates the probability that the user becomes a high-liveness user.
In another preferred embodiment, the method also includes:
outputting the user liveness prediction result as text and displaying it as a web page.
Specifically, after the user liveness prediction result has been obtained from the liveness prediction model based on the GBDT algorithm, the result is provided as text and presented as a web page.
It should be understood that user liveness can be predicted for every lottery user through the embodiments of the present invention, and each user's prediction result, such as the probability that the user becomes a high-liveness user within the next 60 betting days, can be displayed on a web page. Preferably, a function for sorting all users by liveness score in ascending and/or descending order can also be provided.
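The sorting and text output described above might look like the following sketch. The tab-separated layout, the four-decimal formatting, and the function names are assumptions; the source does not specify the text format.

```python
def rank_users(scores, descending=True):
    """Sort (user_id, score) pairs by liveness score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=descending)


def to_text(ranked):
    """Render one 'user_id<TAB>score' line per user."""
    return "\n".join(f"{user_id}\t{score:.4f}" for user_id, score in ranked)
```

A web page could then render the same ranked list, with the `descending` flag backing the ascending and descending sort options.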
In another preferred embodiment, the method further includes:
Obtaining the actual user activity;
Verifying the accuracy of the GBDT-based activity prediction model against the actual user activity;
Correcting the GBDT-based activity prediction model according to the verification result.
It should be noted that the actual user activity can be obtained from new user data and uploaded as text. The present invention can therefore verify the accuracy of the user activity prediction and, according to the verification result, provide guidance for subsequent iterative improvement of the GBDT-based activity prediction model.
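One way the verification step could be carried out is sketched below. It assumes that a predicted score at or above a cut-off counts as a prediction of "high activity" and that the actual activity is available as a boolean label per user; the 0.5 threshold and the function name are assumptions, since the source does not define how accuracy is measured.

```python
def prediction_accuracy(predicted_scores, actual_high_activity, threshold=0.5):
    """Fraction of users whose predicted class matches the observed one.

    predicted_scores: user id -> predicted activity score in [0, 1]
    actual_high_activity: user id -> True if the user was actually
        a high-activity user in the period (hypothetical label format)
    """
    users = predicted_scores.keys() & actual_high_activity.keys()
    if not users:
        raise ValueError("no users with both a prediction and an actual label")
    correct = sum(
        (predicted_scores[u] >= threshold) == actual_high_activity[u]
        for u in users
    )
    return correct / len(users)
```

A falling accuracy on newly uploaded data would be the signal that the model needs to be corrected or retrained.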
The lottery user activity prediction method provided by the embodiments of the present invention corrects the activity prediction model according to the actual activity of lottery users, which can further improve prediction accuracy.
An embodiment of the present invention also provides a lottery user activity prediction system that can implement the entire flow of the lottery user activity prediction method in any of the above embodiments. The functions and technical effects of the modules and units in the system correspond to, and are the same as, those of the steps of the lottery user activity prediction method in the above embodiments, and are not repeated here.
Referring to Fig. 5, which is a structural block diagram of a preferred embodiment of the lottery user activity prediction system provided by the present invention, the system includes:
A data loading module 11, configured to obtain original user data, extract and transform the original user data, and load it into a database classified by a specified format;
A data processing module 12, configured to preprocess the original user data stored in the database to obtain multi-dimensional user data; wherein the preprocessing includes at least consistency processing, deduplication, data transformation and data reduction; and the multi-dimensional user data includes at least the user's personal information, historical betting information and historical earnings information;
A feature extraction module 13, configured to obtain a prediction feature set related to user activity from the multi-dimensional user data; and
An activity prediction module 14, configured to input the prediction feature set into the pre-trained GBDT-based activity prediction model to predict user activity.
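To make two of the preprocessing operations of the data processing module concrete, the following minimal sketch applies a consistency step and a deduplication step to betting records. The record fields (`user_id`, `bet_date`, `amount`, `channel`) are hypothetical; the source does not describe the data schema.

```python
def preprocess(records):
    """Normalise the channel field, then drop exact duplicate records."""
    seen = set()
    cleaned = []
    for record in records:
        # Consistency processing: one canonical spelling per channel.
        record = dict(record, channel=record["channel"].strip().lower())
        key = (record["user_id"], record["bet_date"],
               record["amount"], record["channel"])
        # Deduplication: keep only the first occurrence of each record.
        if key not in seen:
            seen.add(key)
            cleaned.append(record)
    return cleaned


# Hypothetical raw records; the first two differ only in formatting.
raw_records = [
    {"user_id": 1, "bet_date": "2018-07-01", "amount": 20, "channel": "App "},
    {"user_id": 1, "bet_date": "2018-07-01", "amount": 20, "channel": "app"},
    {"user_id": 2, "bet_date": "2018-07-01", "amount": 50, "channel": "Store"},
]
cleaned_records = preprocess(raw_records)
```

Normalising before deduplicating matters here: the first two records only become recognisable as duplicates once the channel spelling is made consistent.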
Preferably, the feature extraction module specifically includes:
A potential feature extraction unit, configured to construct, through statistical analysis of the data, a potential feature set related to user activity from the multi-dimensional user data; and
A prediction feature extraction unit, configured to adjust, screen and combine the potential feature set through iterative testing to obtain the prediction feature set.
Preferably, the lottery user activity prediction system further includes:
A feature set division module, configured to divide a previously obtained training feature set related to user activity into a training set and a validation set;
An initial prediction model building module, configured to perform modeling on the training set based on the GBDT algorithm to obtain at least two initial prediction models;
An initial prediction model validation module, configured to obtain the accuracy of each initial prediction model on the validation set; and
A prediction model determination module, configured to select the initial prediction model with the highest accuracy as the GBDT-based activity prediction model.
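The division and selection steps performed by these modules can be sketched as follows. The sketch treats each initial prediction model as a plain callable that maps a feature tuple to a boolean "high activity" prediction; this interface, the 20% validation fraction and the fixed random seed are assumptions made for the example, and real GBDT training is omitted.

```python
import random


def split_feature_set(samples, validation_fraction=0.2, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    cut = len(shuffled) - n_validation
    return shuffled[:cut], shuffled[cut:]


def validation_accuracy(model, validation_set):
    """Share of (features, label) pairs that the model predicts correctly."""
    hits = sum(model(features) == label for features, label in validation_set)
    return hits / len(validation_set)


def select_best_model(models, validation_set):
    """Return the initial prediction model with the highest accuracy."""
    return max(models, key=lambda m: validation_accuracy(m, validation_set))


# Two stand-in "models" (hypothetical rules, not trained GBDTs).
def always_high(features):
    return True


def frequent_bettor(features):
    return features[0] > 1
```

If several models tie on accuracy, `max` returns the first of them, so the order of the candidate list acts as the tie-breaker in this sketch.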
Preferably, the initial prediction model building module specifically includes:
A modeling parameter acquisition unit, configured to select at least two different groups of modeling parameters from the training set by grid search; and
An initial prediction model building unit, configured to perform modeling based on the GBDT algorithm with the at least two groups of modeling parameters, correspondingly obtaining at least two initial prediction models.
Preferably, the activity prediction module specifically includes:
An activity score acquisition unit, configured to input the prediction feature set into the GBDT-based activity prediction model to obtain a user activity score for a preset time period.
Preferably, the lottery user activity prediction system further includes:
A prediction result display module, configured to output the user activity prediction result as text and display it as a web page.
As an improvement of the above scheme, a user interface can be provided so that system functions and prediction results are presented to the user intuitively and visually, making user information easy to read. For example, the system can use a dashboard to give business personnel intuitive statistical charts reflecting recent basic statistics of the user population, so that they can grasp the behavioral trends of the population. Specific statistical charts include, but are not limited to, a line chart of the year-on-year growth rate of user activity over the most recent 10 betting days, a histogram of the user age distribution, a histogram of the distribution of user betting channels, and the distribution of user activity scores.
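The data behind the year-on-year line chart could be computed as in the following minimal sketch. It assumes that the growth rate for a betting day is (current value - reference value) / reference value, where the reference value is the figure for the corresponding betting day one year earlier; the function names and the choice of measured quantity (for example, the number of active users per betting day) are assumptions.

```python
def growth_rate(current, reference):
    """Year-on-year growth rate, or None if the reference value is zero."""
    if reference == 0:
        return None
    return (current - reference) / reference


def growth_series(current_values, reference_values):
    """Growth rate for each betting day, pairing this year with last year."""
    return [growth_rate(c, r) for c, r in zip(current_values, reference_values)]
```

Returning `None` for a zero reference value lets the charting layer leave a gap in the line rather than plot a misleading point.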
Preferably, the lottery user activity prediction system further includes:
An actual user activity acquisition module, configured to obtain the actual user activity;
A prediction model accuracy verification module, configured to verify the accuracy of the GBDT-based activity prediction model against the actual user activity; and
A prediction model correction module, configured to correct the GBDT-based activity prediction model according to the verification result.
An embodiment of the present invention also provides a terminal device. Referring to Fig. 6, which is a structural block diagram of a preferred embodiment of the terminal device provided by the present invention, the terminal device includes a processor 10, a memory 20, and a computer program stored in the memory 20 and configured to be executed by the processor 10; when executing the computer program, the processor 10 implements the lottery user activity prediction method described in any of the above embodiments.
Preferably, the computer program can be divided into one or more modules/units (such as computer program 1, computer program 2), which are stored in the memory 20 and executed by the processor 10 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments being used to describe the execution process of the computer program in the terminal device.
The processor 10 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor 10 may be any conventional processor. The processor 10 is the control center of the terminal device and connects the various parts of the terminal device through various interfaces and lines.
The memory 20 mainly includes a program storage area and a data storage area, where the program storage area can store the operating system, the application programs required by at least one function, and so on, and the data storage area can store related data and so on. In addition, the memory 20 may be a high-speed random access memory or a non-volatile memory, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card, or the memory 20 may be another volatile solid-state storage device.
It should be noted that the above terminal device may include, but is not limited to, a processor and a memory. Those skilled in the art will understand that the structural block diagram of Fig. 6 is only an example of the terminal device and does not limit it; the terminal device may include more or fewer components than shown, combine certain components, or use different components.
An embodiment of the present invention also provides a computer-readable storage medium that includes a stored computer program; when run, the computer program controls the device on which the computer-readable storage medium resides to execute the lottery user activity prediction method described in any of the above embodiments.
In summary, the lottery user activity prediction method, system, terminal device and computer-readable storage medium provided by the embodiments of the present invention use multi-dimensional data of lottery users, such as personal information, historical transactions and earnings performance, and apply the GBDT algorithm to identify the fixed patterns and particular behaviors through which users become high-activity users, thereby predicting the likelihood that a user will become a high-activity user. This can reduce the difficulty of predicting lottery user activity and improve prediction accuracy, so that operations personnel can take appropriate customer relationship management measures in time, effectively raise user activity, and earn profit for the lottery operator.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and modifications without departing from the technical principles of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A lottery user activity prediction method, characterized by comprising:
Obtaining original user data, extracting and transforming the original user data, and loading it into a database classified by a specified format;
Preprocessing the original user data stored in the database to obtain multi-dimensional user data; wherein the preprocessing includes at least consistency processing, deduplication, data transformation and data reduction; and the multi-dimensional user data includes at least the user's personal information, historical betting information and historical earnings information;
Obtaining a prediction feature set related to user activity from the multi-dimensional user data;
Inputting the prediction feature set into a pre-trained GBDT-based activity prediction model to predict user activity.
2. The lottery user activity prediction method according to claim 1, characterized in that obtaining a prediction feature set related to user activity from the multi-dimensional user data specifically includes:
Constructing, through statistical analysis of the data, a potential feature set related to user activity from the multi-dimensional user data;
Adjusting, screening and combining the potential feature set through iterative testing to obtain the prediction feature set.
3. The lottery user activity prediction method according to claim 1, characterized in that the method trains the GBDT-based activity prediction model through the following steps:
Dividing a previously obtained training feature set related to user activity into a training set and a validation set;
Performing modeling on the training set based on the GBDT algorithm to obtain at least two initial prediction models;
Obtaining the accuracy of each initial prediction model on the validation set;
Selecting the initial prediction model with the highest accuracy as the GBDT-based activity prediction model.
4. The lottery user activity prediction method according to claim 2, characterized in that performing modeling on the training set based on the GBDT algorithm to obtain at least two initial prediction models specifically includes:
Selecting at least two different groups of modeling parameters from the training set by grid search;
Performing modeling based on the GBDT algorithm with the at least two groups of modeling parameters, correspondingly obtaining at least two initial prediction models.
5. The lottery user activity prediction method according to claim 1, characterized in that inputting the prediction feature set into the pre-trained GBDT-based activity prediction model to predict user activity specifically includes:
Inputting the prediction feature set into the GBDT-based activity prediction model to obtain a user activity score for a preset time period.
6. The lottery user activity prediction method according to claim 1, characterized in that the method further includes:
Outputting the user activity prediction result as text and displaying it as a web page.
7. The lottery user activity prediction method according to claim 1, characterized in that the method further includes:
Obtaining the actual user activity;
Verifying the accuracy of the GBDT-based activity prediction model against the actual user activity;
Correcting the GBDT-based activity prediction model according to the verification result.
8. A lottery user activity prediction system, characterized by comprising:
A data loading module, configured to obtain original user data, extract and transform the original user data, and load it into a database classified by a specified format;
A data processing module, configured to preprocess the original user data stored in the database to obtain multi-dimensional user data; wherein the preprocessing includes at least consistency processing, deduplication, data transformation and data reduction; and the multi-dimensional user data includes at least the user's personal information, historical betting information and historical earnings information;
A feature extraction module, configured to obtain a prediction feature set related to user activity from the multi-dimensional user data; and
An activity prediction module, configured to input the prediction feature set into a pre-trained GBDT-based activity prediction model to predict user activity.
9. A terminal device, characterized by comprising a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, wherein the processor, when executing the computer program, implements the lottery user activity prediction method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium includes a stored computer program; wherein, when run, the computer program controls the device on which the computer-readable storage medium resides to execute the lottery user activity prediction method according to any one of claims 1 to 7.
CN201810840895.8A 2018-07-26 2018-07-26 Lottery user liveness prediction technique, system and terminal device, storage medium Pending CN109508807A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810840895.8A CN109508807A (en) 2018-07-26 2018-07-26 Lottery user liveness prediction technique, system and terminal device, storage medium

Publications (1)

Publication Number Publication Date
CN109508807A true CN109508807A (en) 2019-03-22

Family

ID=65745483

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810840895.8A Pending CN109508807A (en) 2018-07-26 2018-07-26 Lottery user liveness prediction technique, system and terminal device, storage medium

Country Status (1)

Country Link
CN (1) CN109508807A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190322
