CN109767269A - Method and apparatus for processing game data - Google Patents
Method and apparatus for processing game data
- Publication number
- CN109767269A (application CN201910037504.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- decision
- information
- user group
- model
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
An embodiment of the invention provides a method for processing game data, comprising: obtaining game log data; dividing game users according to the game log data to obtain at least one piece of user group information; extracting, from the game log data, the feature dimension data corresponding to the user group information; inputting the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file; and extracting the decision path data from the model file. Among multi-dimensional features, a decision tree algorithm preferentially selects the most discriminative features and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility.
Description
Technical field
The present invention relates to the field of game technology, and in particular to a method for processing game data and an apparatus for processing game data.
Background
In game companies, user churn has always been one of the chief concerns of the production, planning, and operations departments. The number of users and their spending are an important basis for deciding a game's development direction, migration efficiency, and subsequent promotion budget. For the currently mainstream online games that rely heavily on charging for additional content, the cost of retaining an existing user is about one fifth of the cost of acquiring a new user. Once the possible loss of high-spending users and the cost of developing a new user into an advanced user are also taken into account, the difference in profit widens further. Analyzing the causes of user churn, understanding users' game experience, and proposing targeted improvements can therefore increase user retention, improve playability, and raise commercial value.
Existing methods for analyzing the causes of user churn mainly comprise user surveys and statistics-based numerical analysis. A user survey samples the churned users: some of them are selected at random and surveyed. Surveys take various forms, commonly questionnaires and telephone interviews, and give direct insight into why users left. Statistics-based numerical analysis means that the operations department statistically analyzes users' game log data, extracts information such as churn rate, retention rate, online duration, and number of tasks performed from the database, and then conjectures and analyzes the causes of churn. Common methods include regression analysis, funnel analysis, and feedback surveys.
However, user surveys are inefficient and labor-intensive, and their findings lack generality and universality. The predictions of statistics-based numerical analysis are highly subjective and demand considerable experience from the operations staff who make them. That method also cannot distinguish the relative importance of multiple features, and it likewise suffers from low efficiency and high labor cost.
Summary of the invention
Embodiments of the present invention provide a method for processing game data and a corresponding apparatus for processing game data.
To solve the above problems, an embodiment of the invention discloses a method for processing game data, comprising:
Obtaining game log data;
Dividing game users according to the game log data to obtain at least one piece of user group information;
Extracting, from the game log data, the feature dimension data corresponding to the user group information;
Inputting the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file;
Extracting the decision path data from the model file.
Preferably, the method further comprises:
Visualizing the trained decision tree model to obtain a model visualization result.
Preferably, the step of dividing game users according to the game log data to obtain at least one piece of user group information comprises:
Extracting the accumulated online duration within a preset time period from the game log data;
Dividing game users according to the accumulated online duration to obtain the user group information.
Preferably, the step of extracting, from the game log data, the feature dimension data corresponding to the user group information comprises:
Extracting, from the game log data, the original dimension data corresponding to the user group information;
Extracting the feature dimension data from the original dimension data.
Preferably, the step of inputting the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file comprises:
Inputting the user group information and the corresponding feature dimension data into the decision tree model to obtain the trained decision tree model;
Parsing the trained decision tree model to obtain the model file.
Preferably, before the step of extracting the decision path data from the model file, the method further comprises:
Screening the decision path data in the model file to obtain screened decision path data.
Preferably, the step of extracting the decision path data from the model file comprises:
Extracting the screened decision path data.
An embodiment of the invention also discloses an apparatus for processing game data, comprising:
A game log data acquisition module, configured to obtain game log data;
A user group information acquisition module, configured to divide game users according to the game log data to obtain at least one piece of user group information;
A feature dimension data extraction module, configured to extract, from the game log data, the feature dimension data corresponding to the user group information;
A training module, configured to input the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file;
A decision path data extraction module, configured to extract the decision path data from the model file.
An embodiment of the invention also discloses an electronic device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the steps of the above game data processing when executing the program.
An embodiment of the invention also discloses a computer-readable storage medium on which a computer program is stored, wherein the computer program implements the steps of the above game data processing when executed by a processor.
Embodiments of the invention have the following advantages:
In the embodiment of the invention, game log data are obtained; game users are divided according to the game log data to obtain at least one piece of user group information; the feature dimension data corresponding to the user group information are extracted from the game log data; the user group information and the corresponding feature dimension data are input to a decision tree model to obtain a trained decision tree model and a corresponding model file; the decision path data in the model file are extracted; and the decision path data are output. The important behaviors associated with user churn are analyzed from the behavior logs of game users and ranked by importance, which reflects the efficiency and scalability of the scheme. Among multi-dimensional features, a decision tree algorithm preferentially selects the most discriminative features and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility.
Brief description of the drawings
To describe the technical solutions in the embodiments of the invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the steps of embodiment one of a method for processing game data according to an embodiment of the invention;
Fig. 2 is a flow chart of the steps of embodiment two of a method for processing game data according to an embodiment of the invention;
Fig. 3 is a structural block diagram of an embodiment of an apparatus for processing game data according to an embodiment of the invention.
Detailed description of the embodiments
To make the technical problems solved by the embodiments of the invention, the technical solutions, and the beneficial effects clearer, the embodiments are further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Referring to Fig. 1, a flow chart of the steps of embodiment one of a method for processing game data according to an embodiment of the invention is shown. The method may specifically include the following steps:
Step 101, obtain game log data;
In a specific implementation, the embodiment of the invention can be applied in a terminal, for example a mobile phone, a tablet computer, a personal digital assistant, a wearable device (such as glasses or a watch), or a desktop computer.
In the embodiment of the invention, the operating system of the mobile terminal may include Android, iOS, Windows Phone, Windows, and so on.
In another preferred embodiment, the embodiment of the invention can also be applied in a server. The server may include a PC (Personal Computer) server, a mainframe, or a minicomputer, and may also include a cloud server; the embodiment of the invention does not specifically limit the type or number of servers.
Specifically, the game log data may include data about game user behavior generated while the game application runs. The game log data may include various log fields, timestamps, match information, transaction information, game information, and so on; the embodiment of the invention places no restriction on this.
In the embodiment of the invention, the mobile terminal can obtain the game log data from a game server. That is, the game log data may be stored in one or more game servers, and the mobile terminal can connect to the game server over a network and obtain the game log data through that network.
When the embodiment of the invention is applied to the server, the server may be the game server itself; the game server can call a preset process to obtain the game log data stored in memory and then carry out the data processing described below. The following description takes the mobile terminal as an example.
Step 102, divide game users according to the game log data to obtain at least one piece of user group information;
In the embodiment of the invention, the mobile terminal can divide game users according to the game log data to obtain at least one piece of user group information. Specifically, the mobile terminal can first extract the accumulated online duration data from the game log data and then divide the game users according to that data, obtaining at least one piece of user group information.
It should be noted that the accumulated online duration data can be the time spent online within a preset time period after the user creates a game account; for example, it can be the time spent online within the 24 hours after the account is created.
For example, when a game user's accumulated online duration is not less than a preset duration threshold, the game user can be marked as a retained user; otherwise, the game user is marked as a churned user. The threshold can be 5 minutes, 10 minutes, or 30 minutes; correspondingly, the user group information can be divided into a 5-minute retained user group, a 5-minute churned user group, a 10-minute retained user group, a 10-minute churned user group, a 30-minute retained user group, a 30-minute churned user group, and so on.
Further, the mobile terminal can check whether a game user logs in on the day after creating the game account. If there is a login on the next day, the game user is marked as belonging to the next-day retained user group; otherwise, the game user is marked as belonging to the next-day churned user group.
In a preferred embodiment of the invention, the game log data may also include the number of props purchased, the amount spent on props, the time taken to raise the game user's level, and so on; the mobile terminal can divide game users into different user groups according to these game log data.
In another specific example of the embodiment of the invention, the number of kinds of props purchased in the game log data can be 0, not less than 2, not less than 5, not less than 8, not less than 10, not less than 15, and so on. Game users are divided into user groups according to the number of props purchased, yielding different retained user groups and churned user groups.
In another specific example of the embodiment of the invention, the time taken to raise the game user's level in the game log data can be not less than 1 hour, not less than 6 hours, not less than 12 hours, not less than 24 hours, not less than 48 hours, not less than 96 hours, and so on. Game users are divided into user groups according to the time taken to raise their level, yielding different retained user groups and churned user groups.
The above user group divisions are only a few examples of the embodiment of the invention. Game users can also be divided into user groups by other data in the game log data, yielding different retained user groups and churned user groups; the embodiment of the invention places no restriction on this.
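The threshold-based division described in step 102 can be sketched as follows. This is only a minimal illustration, assuming the accumulated online duration of each user has already been computed in minutes; the function name `divide_user_groups` and the group labels are hypothetical and do not come from the patent.

```python
from collections import defaultdict

def divide_user_groups(online_minutes, thresholds=(5, 10, 30)):
    """Split users into retained/churned groups for each duration threshold.

    online_minutes: dict mapping user ID -> accumulated online minutes within
    the preset time period (assumed to be precomputed from the game log data).
    """
    groups = defaultdict(list)
    for user_id, minutes in online_minutes.items():
        for t in thresholds:
            # "Not less than t minutes" marks a retained user, otherwise churned.
            label = "retained" if minutes >= t else "churned"
            groups[f"{t}min_{label}"].append(user_id)
    return dict(groups)

# Hypothetical sample data: three users with 3, 12, and 45 minutes online.
groups = divide_user_groups({"u1": 3, "u2": 12, "u3": 45})
```

Each user lands in exactly one group per threshold, so the same user can be churned for the 30-minute problem and retained for the 5-minute problem.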
Step 103, extract, from the game log data, the feature dimension data corresponding to the user group information;
In the embodiment of the invention, the mobile terminal can extract the feature dimension data corresponding to the user group information from the game log data. It should be noted that the game log data may include multiple pieces of original dimension data, from which the feature dimension data are extracted; for example, the feature dimension data may include the number of logins, online duration, total number of virtual items, and so on. The embodiment of the invention places no restriction on this.
It should be noted that each piece of user group information may include the identifiers (IDs) of multiple game users. A game user's ID is associated with the game log data, and the game log data may include multiple pieces of original dimension data. The original dimension data are statistics or temporal features of each game user; the feature dimension data are the statistics or temporal features of each game user that remain after screening.
Because a game user's ID is associated with the game log data, the user group information has a mapping relationship with the original dimension data, and therefore also with the feature dimension data.
For example, the user group information can be the 5-minute retained user group, whose corresponding feature dimension data may include the number of daily tasks completed, the highest experience value of the day, the duration of the newcomer tutorial, and so on.
The embodiment of the invention does not specifically limit the types of feature dimension data in the game log data.
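As an illustration of how per-user statistics such as login count, online duration, and virtual item total might be aggregated from log records keyed by user ID, consider the sketch below. The record layout (`user_id`, `event`, `value`) and the event names are assumptions made for this example; the patent does not specify a log schema.

```python
from collections import defaultdict

def extract_feature_dimensions(log_records):
    """Aggregate per-user statistics from raw log records.

    Each record is a dict with hypothetical keys "user_id", "event", "value".
    """
    features = defaultdict(
        lambda: {"login_count": 0, "online_seconds": 0, "items_total": 0}
    )
    for rec in log_records:
        row = features[rec["user_id"]]
        if rec["event"] == "login":
            row["login_count"] += 1
        elif rec["event"] == "session":
            row["online_seconds"] += rec["value"]
        elif rec["event"] == "item_gain":
            row["items_total"] += rec["value"]
    return dict(features)

# Hypothetical sample log.
records = [
    {"user_id": "u1", "event": "login", "value": 0},
    {"user_id": "u1", "event": "session", "value": 240},
    {"user_id": "u1", "event": "item_gain", "value": 3},
    {"user_id": "u1", "event": "login", "value": 0},
    {"user_id": "u2", "event": "session", "value": 90},
]
user_features = extract_feature_dimensions(records)
```

Because the result is keyed by user ID, it can be joined with the user group information through the ID mapping described above.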
Step 104, input the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file;
In the embodiment of the invention, after the user group information and the corresponding feature dimension data are obtained, the mobile terminal can input them into a decision tree model to obtain a trained decision tree model and a corresponding model file.
It should be noted that the type of the decision tree model may include an ID3 (Iterative Dichotomiser 3) model, a C4.5 model, a CART (Classification and Regression Tree) model, and so on; the embodiment of the invention places no restriction on this.
The user group information and the corresponding feature dimension data are used as samples to train the decision tree model, yielding the trained decision tree model.
For example, the mobile terminal can input the above 5-minute retained user group, 5-minute churned user group, 10-minute retained user group, 10-minute churned user group, 30-minute retained user group, and 30-minute churned user group, together with their corresponding feature dimension data, into the above CART model to obtain a trained CART model.
After the trained decision tree model is obtained, it is parsed and stored as a file in a specific format, which is the model file; for example, the file can be in JSON format.
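To make the training and model-file step concrete, the following is a simplified, standard-library-only sketch of a Gini-based binary tree that is serialized to JSON. It is not the patent's implementation: the nested-dict layout of the model file, the function names, and the sample data are all assumptions, and a production system would more likely rely on an established decision tree library.

```python
import json
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (weighted_gini, feature_index, threshold) of the best split, or None."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must send samples to both children
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(rows, labels, names, depth=0, max_depth=3):
    """Recursively grow a tree as nested dicts (a hypothetical model-file layout)."""
    counts = dict(Counter(labels))
    node = {"samples": len(labels), "counts": counts}
    split = best_split(rows, labels) if depth < max_depth and len(counts) > 1 else None
    if split is None:
        node["prediction"] = max(counts, key=counts.get)  # majority class at the leaf
        return node
    _, f, t = split
    node["feature"], node["threshold"] = names[f], t
    for side, keep in (("left", lambda r: r[f] <= t), ("right", lambda r: r[f] > t)):
        idx = [i for i, r in enumerate(rows) if keep(r)]
        node[side] = build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                names, depth + 1, max_depth)
    return node

# Hypothetical training samples: one row of feature dimension data per user.
names = ["login_count", "online_minutes"]
rows = [[1, 2], [1, 4], [3, 20], [4, 35]]
labels = ["churned", "churned", "retained", "retained"]
tree = build_tree(rows, labels, names)
model_json = json.dumps(tree, indent=2)  # contents of the JSON model file
```

Storing the tree as nested JSON objects keeps every node's split rule and sample counts available for the later path extraction and screening steps.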
Step 105, extract the decision path data from the model file.
In the embodiment of the invention, the mobile terminal can extract the decision path data from the model file. Specifically, the model file can be parsed to obtain a large number of decision paths, and screening conditions can be set to screen or prune them.
It should be noted that each piece of decision path data contains multiple nodes, and each node represents a decision rule; that is, each decision path consists of multiple decision rules.
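The root-to-leaf extraction described in step 105 can be sketched as a recursive traversal of the parsed model file. The JSON layout below (keys `feature`, `threshold`, `left`, `right`, `prediction`, `samples`) and the feature names are hypothetical; they are chosen only to show how each path becomes a list of decision rules.

```python
import json

def extract_decision_paths(node, rules=()):
    """Return one entry per leaf, each holding the rules from root to that leaf."""
    if "feature" not in node:  # leaf node: close the current path
        return [{"rules": list(rules),
                 "prediction": node["prediction"],
                 "samples": node["samples"]}]
    f, t = node["feature"], node["threshold"]
    return (extract_decision_paths(node["left"], rules + (f"{f} <= {t}",))
            + extract_decision_paths(node["right"], rules + (f"{f} > {t}",)))

# Hypothetical model file content.
model_file = json.loads("""
{"feature": "tutorial_minutes", "threshold": 3,
 "left": {"prediction": "churned", "samples": 40},
 "right": {"feature": "daily_tasks_done", "threshold": 1,
           "left": {"prediction": "churned", "samples": 10},
           "right": {"prediction": "retained", "samples": 50}}}
""")
paths = extract_decision_paths(model_file)
```

Rules that appear first in a path sit closest to the root, so their order reflects the feature ranking chosen by the tree.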
Preferably, the method may further comprise: outputting the decision path data.
In the embodiment of the invention, after the decision path data are obtained, the mobile terminal can output them as files in various formats; for example, the decision path data can be output in the form of a table. The embodiment of the invention places no restriction on the output format of the decision path data.
In the embodiment of the invention, game log data are obtained; game users are divided according to the game log data to obtain at least one piece of user group information; the feature dimension data corresponding to the user group information are extracted from the game log data; the user group information and the corresponding feature dimension data are input to a decision tree model to obtain a trained decision tree model and a corresponding model file; the decision path data in the model file are extracted; and the decision path data are output. The important behaviors associated with user churn are analyzed from the behavior logs of game users and ranked by importance, which reflects the efficiency and scalability of the scheme. Among multi-dimensional features, a decision tree algorithm preferentially selects the most discriminative features and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility.
Referring to Fig. 2, a flow chart of the steps of embodiment two of a method for processing game data according to an embodiment of the invention is shown. The method may specifically include the following steps:
Step 201, obtain game log data;
The embodiment of the invention can be applied in a mobile terminal or a server. The game log data may include various log fields, timestamps, match information, transaction information, game information, and so on. The mobile terminal can obtain the game log data from a game server; that is, the game log data may be stored in the game server, and the mobile terminal can obtain them over a network.
When the embodiment of the invention is applied to the server, the server may be the game server itself, and the game server can call a preset process to obtain the game log data stored in memory. The following description takes the mobile terminal as an example.
Step 202, extract the accumulated online duration within a preset time period from the game log data;
Specifically, the mobile terminal can identify the accumulated online duration within the preset time period of the game log data. For example, the preset time period can be 24 hours or 48 hours, and the accumulated online duration of the game user within that 24 or 48 hours is calculated.
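One way to compute the accumulated online duration within the preset time period is to clip each login session to the window that starts at account creation. The sketch below assumes sessions are available as `(login, logout)` timestamp pairs in seconds; that representation and the function name are assumptions for illustration only.

```python
def accumulated_online_seconds(sessions, created_at, window_hours=24):
    """Sum the online time that falls inside the window after account creation.

    sessions: list of (login_ts, logout_ts) pairs in seconds (assumed format).
    created_at: account creation timestamp in seconds.
    """
    window_end = created_at + window_hours * 3600
    total = 0
    for login, logout in sessions:
        start = max(login, created_at)   # ignore time before the window opens
        end = min(logout, window_end)    # ignore time after the window closes
        if end > start:
            total += end - start
    return total

# Hypothetical sessions; the last one runs past the 24-hour boundary (86400 s).
online = accumulated_online_seconds(
    [(0, 120), (1000, 1300), (86000, 87000)], created_at=0
)
```

Only the first 400 seconds of the last session count, because the remainder lies outside the 24-hour window.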
Step 203, divide game users according to the accumulated online duration to obtain the user group information;
Further, the mobile terminal can divide game users according to the accumulated online duration to obtain the user group information.
For example, the accumulated online duration can be not less than 5 minutes, not less than 10 minutes, not less than 30 minutes, and so on; the user group information can then be divided into a 5-minute retained user group, a 5-minute churned user group, a 10-minute retained user group, a 10-minute churned user group, a 30-minute retained user group, and a 30-minute churned user group.
Step 204, extract, from the game log data, the original dimension data corresponding to the user group information;
In the embodiment of the invention, the user group information has a mapping relationship with the original dimension data, and the mobile terminal can extract the original dimension data corresponding to the user group information from the game log data according to that mapping relationship.
Step 205, extract the feature dimension data from the original dimension data;
In the embodiment of the invention, the mobile terminal can extract the feature dimension data from the original dimension data. Specifically, the mobile terminal receives preset thresholds and extracts the feature dimension data from the original dimension data according to those thresholds.
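The patent does not say which quantity the preset threshold is applied to. Purely as an illustration, the sketch below assumes a variance threshold, so that dimensions whose values barely differ between users are dropped; the criterion, the function name, and the sample dimensions are all assumptions.

```python
from statistics import pvariance

def select_feature_dimensions(raw_dimensions, min_variance=0.01):
    """Keep only dimensions whose population variance reaches the threshold.

    raw_dimensions: dict mapping dimension name -> list of per-user values.
    The variance criterion is an assumed example, not the patent's rule.
    """
    return {name: values for name, values in raw_dimensions.items()
            if pvariance(values) >= min_variance}

# Hypothetical original dimension data for four users.
raw = {"login_count": [1, 3, 4, 1], "server_id": [7, 7, 7, 7]}
selected = select_feature_dimensions(raw)
```

Here `server_id` is constant across users and is removed, while `login_count` is kept as a feature dimension.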
Step 206, input the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model;
It should be noted that the type of the decision tree model may include an ID3 (Iterative Dichotomiser 3) model, a C4.5 model, a CART (Classification and Regression Tree) model, and so on; the embodiment of the invention places no restriction on this.
The user group information and the corresponding feature dimension data are used as samples to train the decision tree model, yielding the trained decision tree model.
Step 207, parse the trained decision tree model to obtain the model file;
In a specific example of the embodiment of the invention, after the trained decision tree model is obtained, it is parsed and stored as a file in a specific format, which is the model file; for example, the file can be in JSON format.
In a preferred embodiment of the invention, the method further comprises: visualizing the trained decision tree model to obtain a model visualization result.
Step 208, screen the decision path data in the model file to obtain screened decision path data;
Specifically, the mobile terminal can set screening conditions and screen the large number of decision paths accordingly. For example, the screening conditions may include: the proportion of churned users or the proportion of retained users at a node reaches a first threshold (such as 0.85); or the proportion of churned users at a node increases sharply compared with the previous node and the total number of users at that node is greater than a second threshold.
It should be noted that the above screening conditions are only a few examples of the embodiment of the invention, which does not specifically limit the screening conditions.
In a preferred embodiment of the invention, a pruning operation can also be performed on the decision path data.
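The two example screening conditions in step 208 can be sketched as a predicate over the nodes of a path. The first threshold of 0.85 comes from the text; the size of a "sharp increase" (`jump`), the second threshold (`min_total`), and the per-node fields `churned` and `total` are assumptions chosen for this example.

```python
def keep_path(path, ratio_threshold=0.85, jump=0.3, min_total=100):
    """Return True if any node on the path meets one of the screening conditions.

    path: list of nodes from root to leaf, each a dict with the hypothetical
    keys "churned" (number of churned users) and "total" (all users at the node).
    """
    prev_ratio = None
    for node in path:
        ratio = node["churned"] / node["total"]
        # Condition 1: churned or retained proportion reaches the first threshold.
        if ratio >= ratio_threshold or (1 - ratio) >= ratio_threshold:
            return True
        # Condition 2: churn proportion jumps versus the previous node and the
        # node still holds more users than the second threshold.
        if (prev_ratio is not None and ratio - prev_ratio >= jump
                and node["total"] > min_total):
            return True
        prev_ratio = ratio
    return False

# Hypothetical paths.
path_a = [{"churned": 500, "total": 1000}, {"churned": 270, "total": 300}]
path_b = [{"churned": 500, "total": 1000}, {"churned": 330, "total": 400}]
path_c = [{"churned": 500, "total": 1000}, {"churned": 300, "total": 500}]
path_d = [{"churned": 500, "total": 1000}, {"churned": 40, "total": 50}]
screened = [p for p in (path_a, path_b, path_c, path_d) if keep_path(p)]
```

`path_a` passes on the 0.85 rule and `path_b` on the sharp-increase rule; `path_c` shows no notable change, and `path_d` has too few users at the node to be trusted.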
Step 209, output the decision path data.
In the embodiment of the invention, after the decision path data are obtained, the mobile terminal can output them as files in various formats. For example, the decision path data can be output in the form of a table, such as a table in xls or xlsx format.
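As a minimal sketch of tabular output, the example below writes decision paths as CSV using only the Python standard library. CSV is substituted here for the xls or xlsx formats named in the text, since writing those would require a third-party spreadsheet library; the column names and sample paths are assumptions.

```python
import csv
import io

def export_paths(paths, stream):
    """Write one table row per decision path to a text stream."""
    writer = csv.writer(stream)
    writer.writerow(["path_id", "rules", "prediction", "samples"])
    for i, p in enumerate(paths, start=1):
        writer.writerow([i, " AND ".join(p["rules"]), p["prediction"], p["samples"]])

# Hypothetical decision path data.
sample_paths = [
    {"rules": ["tutorial_minutes <= 3"], "prediction": "churned", "samples": 40},
    {"rules": ["tutorial_minutes > 3", "daily_tasks_done > 1"],
     "prediction": "retained", "samples": 50},
]
buf = io.StringIO()
export_paths(sample_paths, buf)
table = buf.getvalue()
```

Passing an open file instead of `io.StringIO` writes the same table to disk, where it can be opened in spreadsheet software.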
In the embodiment of the invention, game log data are obtained; the accumulated online duration within a preset time period is extracted from the game log data; game users are divided according to the accumulated online duration to obtain the user group information; the original dimension data corresponding to the user group information are extracted from the game log data; the feature dimension data are extracted from the original dimension data; the user group information and the corresponding feature dimension data are input to a decision tree model to obtain a trained decision tree model; the trained decision tree model is parsed to obtain the model file; the decision path data in the model file are screened to obtain screened decision path data; and the decision path data are output. The important behaviors associated with user churn are analyzed from the behavior logs of game users and ranked by importance, which reflects the efficiency and scalability of the scheme. Among multi-dimensional features, a decision tree algorithm preferentially selects the most discriminative features and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility. Timely understanding and analysis of game user behavior, and the churn causes derived from it, let game developers learn of shortcomings in the game promptly and let the development team adjust the game content in time, improving the game's appeal to users and its commercial value.
To help those skilled in the art better understand the embodiment of the invention, a specific example is described below.
Step 1: Data acquisition
The game log data of each game user from the creation of the game account onward are extracted. The game log data include various log fields, timestamps, and related detailed log information, which facilitates further data processing.
Step 2: Group division
Different churn problems are taken as examples. For each churn problem, users can be divided into two classes: a churned group and a retained group. For the 5-minute churn problem, for example, the accumulated online duration within the 24 hours after account creation is computed from the user log; if it is not less than 5 minutes, the user is marked as a retained user, and otherwise as a churned user. The 10-minute churned user group, 10-minute retained user group, 30-minute churned user group, and 30-minute retained user group are obtained in the same way. For the next-day churn problem, if a newly created user logs in on the day after creating the game account, the user is assigned to the next-day retained group; otherwise, the user is assigned to the next-day churned group.
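The next-day rule in Step 2 reduces to checking whether any login falls on the calendar day after account creation. The sketch below assumes login activity is available as a set of dates; the function name and labels are hypothetical.

```python
from datetime import date, timedelta

def next_day_label(created_on, login_dates):
    """Label a user by whether a login occurred on the day after account creation.

    created_on: date the game account was created.
    login_dates: set of dates on which the user logged in (assumed format).
    """
    next_day = created_on + timedelta(days=1)
    return "next_day_retained" if next_day in login_dates else "next_day_churned"

# Hypothetical users created on the same day.
created = date(2019, 1, 15)
label_a = next_day_label(created, {date(2019, 1, 15), date(2019, 1, 16)})
label_b = next_day_label(created, {date(2019, 1, 15), date(2019, 1, 18)})
```

A user who returns only on a later day, as in the second case, still counts as next-day churned under this rule.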
Step 3: Feature engineering
For the 5-minute, 10-minute, and 30-minute churn problems, feature engineering can be performed on all game log data of the corresponding churned user group within the 24 hours after account creation, and on the game log data of the retained users within their first accumulated 5, 10, and 30 minutes respectively; that is, feature dimension data are extracted from the original dimension data in the game log data. For the next-day churn problem, all game log data of the churned user group and the retained user group on the day the game account was created are extracted. After the game log data are obtained, the feature dimension data are extracted from them. The feature dimension data may include statistical and temporal features such as the user's match information, transaction information, and game information, 60 dimensions in total, as shown in Table 1.
Table 1: Feature dimension data of the embodiment of the invention
Step 4: churn prediction
To analyze the causes of churn, churn prediction must be carried out first: a CART model analyzes the
information carried in each user's game log data and predicts whether the user will churn. CART is a
decision-tree algorithm, consisting of feature selection, tree generation and pruning, that can be used
for both classification and regression. When solving a classification problem, the algorithm selects the
optimal feature by computing the Gini index (GINI value), determines the optimal binary split point for
that feature, and recursively bisects on each feature. The feature space is thereby divided into a finite
number of cells, and the predicted probability distribution is determined on those cells, that is, the
conditional probability distribution of the output given the input. The Gini index reflects how mixed a
data set is: the larger the Gini index, the more disordered the current data and the less pure the node,
so CART chooses as its split the attribute that gives the child nodes the smallest GINI value. When the
algorithm is applied to a regression problem, the feature-selection criterion changes slightly: the split
is chosen to minimize the sum of the mean squared deviations of the samples after the split.
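The Gini-based split selection described above can be sketched as follows. For a binary label with churn proportion p, the Gini index of a node is 1 - p^2 - (1 - p)^2. The helper names `gini` and `best_split` are illustrative, not taken from the patent, and the sketch handles a single numeric feature only.

```python
def gini(labels):
    """Gini index of a list of 0/1 labels (1 = churned, 0 = retained)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 1.0 - p * p - (1.0 - p) * (1.0 - p)

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    best = (None, float("inf"))
    n = len(values)
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

match_counts = [1, 2, 3, 10, 11, 12]
churned = [1, 1, 1, 0, 0, 0]
threshold, score = best_split(match_counts, churned)
```

In this toy data the split at 3 separates the two classes completely, so its weighted Gini index is zero. A full CART implementation repeats this search over every feature and recurses on each child node.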
The data set, balanced between positive and negative samples, is divided into a training set and a test
set. The decision-tree model above is fitted on the training set, with precision, recall and F1 score as
the evaluation metrics, which completes the training of the churn model; that is, the user group
information and the corresponding feature-dimension data are input to the decision-tree model to
complete its training.
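The three evaluation metrics named above follow directly from the counts of true positives, false positives and false negatives. The sketch below assumes that label 1 means "churned"; the function name and the sample labels are illustrative only.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for the positive class (1 = churned)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Test-set labels and model predictions (made-up values for illustration).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Here two of the three predicted churners really churned and two of the three actual churners were found, so precision, recall and F1 are all two thirds.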
Step 5: model visualization
Two-class prediction with a decision-tree model in effect decomposes the prediction process, according
to the features or attributes of the data, into decisions on multiple sub-problems, a "divide and
conquer" style of inference. Because the rule-partitioning mechanism of the whole decision-tree model is
large, and to make it easier to analyze, understand and hand over, the internal structure of the
decision-tree model is visualized once the decision-tree-based churn prediction is complete. The
resulting model visualization is used to analyze the decisions the model makes during prediction.
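One simple way to picture such a visualization is an indented text rendering of the tree. The sketch below is an assumption-laden stand-in: the patent does not say which visualization tool is used, and the nested-dict tree layout (`feature`, `threshold`, `left`, `right`, `churn`, `retain`) is hypothetical.

```python
def render_tree(node, indent=0):
    """Return the tree as a list of indented text lines, root first."""
    pad = "  " * indent
    if "feature" not in node:
        # Leaf node: report how many churned and retained users ended up here.
        return [f"{pad}leaf: churn={node['churn']} retain={node['retain']}"]
    lines = [f"{pad}{node['feature']} <= {node['threshold']}?"]
    lines += render_tree(node["left"], indent + 1)    # branch where the test holds
    lines += render_tree(node["right"], indent + 1)   # branch where it does not
    return lines

tree = {
    "feature": "match_count",
    "threshold": 2,
    "left": {"churn": 40, "retain": 10},
    "right": {"churn": 10, "retain": 40},
}
lines = render_tree(tree)
print("\n".join(lines))
```

Reading the output top to bottom follows the same sequence of binary decisions that the model applies to a user.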
Step 6: churn cause generation
After the decision-tree model is obtained, it is parsed starting from its leaf nodes to obtain every
decision path from the root node to a leaf node, together with the associated decision information.
However, as the decision tree gets deeper the number of paths grows exponentially, and the whole tree
gradually becomes too large and complex for further analysis. The decision-tree model therefore needs to
be parsed with screening conditions applied to the decision paths, limiting the number of paths that are
output so that efficient, practical classification features can be extracted.
After the churn prediction model has been trained, the maximum depth of the decision-tree model needs to
be limited, and the model file is saved in .json format. The model file stores the information of each
node in the decision-tree model, such as the node number, the numbers of positive and negative samples,
and the Gini index. A set of data structures holds the information of each node together with its
parent-node and child-node information; the text in the model file is extracted and saved into these
structures, restoring the decision-tree model from its textual form and thereby completing the reading
and parsing of the decision-tree model.
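Reading such a model file back into linked node records might look like the sketch below. The JSON layout (a flat `nodes` list with `id`, `parent`, `rule`, `churn`, `retain` and `gini` fields) is a hypothetical format chosen for illustration; the patent does not specify the schema of its .json file.

```python
import json

# A tiny stand-in for the saved model file (hypothetical schema).
MODEL_JSON = """
{"nodes": [
  {"id": 0, "parent": null, "rule": null, "churn": 50, "retain": 50, "gini": 0.5},
  {"id": 1, "parent": 0, "rule": "match_count <= 2", "churn": 40, "retain": 10, "gini": 0.32},
  {"id": 2, "parent": 0, "rule": "match_count > 2", "churn": 10, "retain": 40, "gini": 0.32}
]}
"""

def load_tree(text):
    """Parse the model text and attach a list of child ids to every node."""
    nodes = {n["id"]: dict(n, children=[]) for n in json.loads(text)["nodes"]}
    for n in nodes.values():
        if n["parent"] is not None:
            nodes[n["parent"]]["children"].append(n["id"])
    return nodes

model = load_tree(MODEL_JSON)
```

With both parent and child links available, the tree can be walked downward for visualization or upward from a leaf to recover a decision path.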
Once the decision-tree model has been parsed, the large amount of decision path data and the complexity
of the rule system can be addressed by screening and pruning the decision paths with one or more preset
screening conditions. The screening conditions found so far to be concise and effective are:
1. The share of churned users or of retained users at a node reaches a first threshold (for example
0.85). The series of decision conditions from the root node up to and including that node is extracted
and recorded as a decision path. This shows that after the user group passes through the series of
binary choices from the start and reaches the node, churned users and retained users are well separated.
2. The share of churned users at a node rises sharply compared with its parent node, and the total
number of users at the node is greater than a second threshold. The series of decision conditions from
the root node up to and including that node is extracted and recorded as a decision path. This shows
that the group satisfying these conditions is classified well at this node, that is, the node strongly
affects whether this kind of user churns or is retained.
3. The share of churned samples at a node is greater than its share of retained samples, while at its
parent node the share of retained samples is greater than the share of churned samples. The series of
decision conditions from the root node up to and including that node is extracted and recorded as a
decision path. This shows that the node's decision condition has a large effect on churn and retention
for the user group satisfying these conditions.
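The three screening conditions can be expressed as a single predicate over a node and its parent. This is a sketch under assumptions: 0.85 is the example first threshold from the text, while the size of a "sharp rise" (`jump`) and the second threshold (`min_users`) are not given in the text, so the values below are placeholders.

```python
def churn_ratio(node):
    """Share of churned users among all users at a node."""
    return node["churn"] / (node["churn"] + node["retain"])

def passes_screen(node, parent, first_threshold=0.85, jump=0.2, min_users=30):
    """True if the node satisfies at least one of the three screening conditions."""
    r = churn_ratio(node)
    # Condition 1: churned share or retained share reaches the first threshold.
    if r >= first_threshold or (1 - r) >= first_threshold:
        return True
    if parent is None:
        return False
    pr = churn_ratio(parent)
    total = node["churn"] + node["retain"]
    # Condition 2: churned share jumps relative to the parent and the node is large enough.
    if r - pr >= jump and total > min_users:
        return True
    # Condition 3: the majority flips from retained (parent) to churned (node).
    if r > 0.5 and pr < 0.5:
        return True
    return False

keep_1 = passes_screen({"churn": 90, "retain": 10}, None)
keep_2 = passes_screen({"churn": 60, "retain": 40}, {"churn": 30, "retain": 70})
keep_3 = passes_screen({"churn": 45, "retain": 55}, {"churn": 40, "retain": 60})
```

A node that passes marks the end of a decision path worth reporting; nodes that fail every condition are candidates for pruning.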
After the decision-tree model has been pruned using the screening conditions, a new decision-tree model
is reconstructed. Starting from the root node r of the new model, a new set of leaf nodes is found,
denoted L = {l1, l2, ..., li, ..., ln}. Backtracking from each leaf node li to the root node r yields a
path from li to r. These paths are saved, and the nodes in each path are ordered from shallow to deep
according to their depth, finally giving a set of decision paths W = {w1, w2, ..., wi, ..., wn}, which
is the decision path data of the pruned decision tree.
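The backtracking from each leaf to the root can be sketched with parent pointers. The node table below is a hypothetical example; only the `parent` field is needed for this step.

```python
def decision_paths(nodes):
    """Return one root-to-leaf list of node ids per leaf, ordered shallow to deep."""
    has_child = {n["parent"] for n in nodes.values() if n["parent"] is not None}
    leaves = [node_id for node_id in nodes if node_id not in has_child]
    paths = []
    for leaf in leaves:
        path, current = [], leaf
        while current is not None:       # walk upward until the root is passed
            path.append(current)
            current = nodes[current]["parent"]
        path.reverse()                   # root first, leaf last
        paths.append(path)
    return paths

# Node 0 is the root; nodes 1 and 2 are its children; 3 and 4 are children of 1.
pruned = {
    0: {"parent": None},
    1: {"parent": 0},
    2: {"parent": 0},
    3: {"parent": 1},
    4: {"parent": 1},
}
paths = decision_paths(pruned)
```

Each returned list corresponds to one path wi in W, and the leaves found along the way form the set L.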
The reconstructed model information, including the decision rule represented by each node and the
number of users at each node, is output as .xlsx files. Each file contains several of the decision paths
described above, and each path appears in the .xlsx file in the form shown in Table 2.
Table 2: the example of the decision path data in the embodiment of the present invention
Table 2 shows the decision rules of one decision path in the decision-tree model. Each row corresponds
to one decision rule, that is, one node of the decision-tree model other than a leaf node. The first
column of each row is the decision condition, followed in order by the retained count, the churned
count, the retained share, the churned share, the retained share of the overall sample, the churned
share of the overall sample, the share of the full retained sample having this feature, and the share
of the full churned sample having this feature. By automatically computing and comparing these screened
and simplified split conditions and statistics, churn causes can be generated automatically from the
decision tree, which greatly simplifies the workflow and reduces the effort of manual analysis, and is
efficient and practical.
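A row of such a table can be derived from the node counts alone. The sketch below covers only some of the columns and reflects one reading of their definitions, which the text does not spell out; it also writes CSV with the standard library instead of .xlsx, purely to stay self-contained. All field names are illustrative.

```python
import csv
import io

def path_rows(path_nodes, total_retain, total_churn):
    """Build one table row per node on a decision path."""
    rows = []
    for n in path_nodes:
        size = n["retain"] + n["churn"]
        rows.append({
            "condition": n["rule"],
            "retain": n["retain"],
            "churn": n["churn"],
            "retain_share": round(n["retain"] / size, 4),        # within this node
            "churn_share": round(n["churn"] / size, 4),          # within this node
            "retain_of_all_retained": round(n["retain"] / total_retain, 4),
            "churn_of_all_churned": round(n["churn"] / total_churn, 4),
        })
    return rows

def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = path_rows([{"rule": "match_count <= 2", "retain": 10, "churn": 40}], 50, 50)
csv_text = to_csv(rows)
```

Comparing the within-node shares with the shares of the overall population is what lets an analyst judge how strongly a condition is tied to churn.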
It should be noted that, for simplicity of description, the method embodiments are expressed as a series
of combined actions. Those skilled in the art should understand, however, that the embodiments of the
present invention are not limited by the described order of actions, because according to the
embodiments of the present invention some steps may be performed in another order or simultaneously.
Those skilled in the art should also understand that the embodiments described in this specification
are all preferred embodiments, and that the actions involved are not necessarily required by the
embodiments of the present invention.
Referring to Fig. 3, a structural block diagram of an embodiment of a game data processing apparatus
according to the present invention is shown. The apparatus may specifically include the following
modules:
A game log data acquisition module 301, configured to obtain game log data;
A user group information acquisition module 302, configured to divide game users according to the game
log data to obtain at least one piece of user group information;
A feature-dimension data extraction module 303, configured to extract, from the game log data, the
feature-dimension data corresponding to the user group information;
A training module 304, configured to input the user group information and the corresponding
feature-dimension data into a decision-tree model to obtain a trained decision-tree model and a
corresponding model file;
A decision path data extraction module 305, configured to extract the decision path data from the model
file.
Preferably, the apparatus further includes:
A model visualization result obtaining module, configured to visualize the trained decision-tree model
to obtain a model visualization result.
Preferably, the user group information acquisition module includes:
A cumulative online duration extraction submodule, configured to extract the cumulative online duration
within a preset time period from the game log data;
A user group information acquisition submodule, configured to divide game users according to the
cumulative online duration to obtain user group information.
Preferably, the feature-dimension data extraction module includes:
A raw-dimension data extraction submodule, configured to extract, from the game log data, the
raw-dimension data corresponding to the user group information;
A feature-dimension data extraction submodule, configured to extract the feature-dimension data from
the raw-dimension data.
Preferably, the training module includes:
A decision-tree model obtaining submodule, configured to input the user group information and the
corresponding feature-dimension data into a decision-tree model to obtain the trained decision-tree
model;
A model file obtaining submodule, configured to parse the trained decision-tree model to obtain the
model file.
Preferably, a module connected to the decision path data extraction module is further included:
An obtaining module, configured to screen the decision path data in the model file to obtain screened
decision path data.
Preferably, the decision path data extraction module includes:
An extraction submodule, configured to extract the screened decision path data.
An embodiment of the present invention also discloses an electronic device, including a memory, a processor
and a computer program stored in the memory and executable on the processor, where the processor
implements the steps of the game data processing when executing the program.
An embodiment of the present invention also discloses a computer-readable storage medium storing a
computer program, where the computer program implements the steps of the game data processing when
executed by a processor.
Because the apparatus embodiment is substantially similar to the method embodiment, it is described
relatively briefly; for the relevant details, refer to the description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on
its differences from the other embodiments, and for the parts that are the same or similar the
embodiments may be referred to one another.
Those skilled in the art should understand that the embodiments of the present invention may be provided
as a method, an apparatus or a computer program product. Accordingly, the embodiments of the present
invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an
embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention
may take the form of a computer program product implemented on one or more computer-usable storage media
(including but not limited to magnetic disk storage, CD-ROM and optical storage) containing
computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams
of methods, terminal devices (systems) and computer program products according to the embodiments of the
present invention. It should be understood that each flow and/or block in the flowcharts and/or block
diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be
implemented by computer program instructions. These computer program instructions may be provided to a
processor of a general-purpose computer, a special-purpose computer, an embedded processor or another
programmable data processing terminal device to produce a machine, such that the instructions executed
by the processor of the computer or other programmable data processing terminal device produce means
for implementing the functions specified in one or more flows of the flowcharts and/or one or more
blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of
directing a computer or other programmable data processing terminal device to operate in a specific
manner, such that the instructions stored in the computer-readable memory produce an article of
manufacture including instruction means, the instruction means implementing the functions specified in
one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data
processing terminal device, so that a series of operational steps are performed on the computer or other
programmable terminal device to produce computer-implemented processing, whereby the instructions
executed on the computer or other programmable terminal device provide steps for implementing the
functions specified in one or more flows of the flowcharts and/or one or more blocks of the block
diagrams.
Although the preferred embodiment of the embodiment of the present invention has been described, once a person skilled in the art knows bases
This creative concept, then additional changes and modifications can be made to these embodiments.So the following claims are intended to be interpreted as
Including preferred embodiment and fall into all change and modification of range of embodiment of the invention.
Finally, it is to be noted that, herein, relational terms such as first and second and the like be used merely to by
One entity or operation are distinguished with another entity or operation, without necessarily requiring or implying these entities or operation
Between there are any actual relationship or orders.Moreover, the terms "include", "comprise" or its any other variant meaning
Covering non-exclusive inclusion, so that process, method, article or terminal device including a series of elements not only wrap
Those elements are included, but also including other elements that are not explicitly listed, or further includes for this process, method, article
Or the element that terminal device is intrinsic.In the absence of more restrictions, being wanted by what sentence "including a ..." limited
Element, it is not excluded that there is also other identical elements in process, method, article or the terminal device for including the element.
The game data processing method and the game data processing apparatus provided by the present
invention have been described in detail above. Specific examples are used herein to explain the
principles and implementations of the present invention, and the description of the above embodiments is
intended only to help in understanding the method of the present invention and its core idea. At the
same time, for those of ordinary skill in the art, changes may be made to the specific implementations
and the scope of application in accordance with the idea of the present invention. In summary, the
content of this specification should not be construed as limiting the present invention.
Claims (10)
1. A game data processing method, characterized by comprising:
obtaining game log data;
dividing game users according to the game log data to obtain at least one piece of user group
information;
extracting, from the game log data, the feature-dimension data corresponding to the user group
information;
inputting the user group information and the corresponding feature-dimension data into a decision-tree
model to obtain a trained decision-tree model and a corresponding model file;
extracting the decision path data from the model file.
2. The method according to claim 1, characterized by further comprising:
visualizing the trained decision-tree model to obtain a model visualization result.
3. The method according to claim 1 or 2, characterized in that the step of dividing game users
according to the game log data to obtain at least one piece of user group information comprises:
extracting the cumulative online duration within a preset time period from the game log data;
dividing game users according to the cumulative online duration to obtain user group information.
4. The method according to claim 1 or 2, characterized in that the step of extracting, from the game
log data, the feature-dimension data corresponding to the user group information comprises:
extracting, from the game log data, the raw-dimension data corresponding to the user group information;
extracting the feature-dimension data from the raw-dimension data.
5. The method according to claim 1 or 2, characterized in that the step of inputting the user group
information and the corresponding feature-dimension data into a decision-tree model to obtain a trained
decision-tree model and a corresponding model file comprises:
inputting the user group information and the corresponding feature-dimension data into a decision-tree
model to obtain the trained decision-tree model;
parsing the trained decision-tree model to obtain the model file.
6. The method according to claim 5, characterized in that, before the step of extracting the decision
path data from the model file, the method further comprises:
screening the decision path data in the model file to obtain screened decision path data.
7. The method according to claim 6, characterized in that the step of extracting the decision path
data from the model file comprises:
extracting the screened decision path data.
8. A game data processing apparatus, characterized by comprising:
a game log data acquisition module, configured to obtain game log data;
a user group information acquisition module, configured to divide game users according to the game log
data to obtain at least one piece of user group information;
a feature-dimension data extraction module, configured to extract, from the game log data, the
feature-dimension data corresponding to the user group information;
a training module, configured to input the user group information and the corresponding
feature-dimension data into a decision-tree model to obtain a trained decision-tree model and a
corresponding model file;
a decision path data extraction module, configured to extract the decision path data from the model
file.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and
executable on the processor, characterized in that the processor, when executing the program,
implements the steps of the game data processing according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the
computer-readable storage medium, and the computer program, when executed by a processor, implements
the steps of the game data processing according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910037504.3A CN109767269B (en) | 2019-01-15 | 2019-01-15 | Game data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109767269A true CN109767269A (en) | 2019-05-17 |
CN109767269B CN109767269B (en) | 2022-02-22 |
Family
ID=66452946
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910037504.3A Active CN109767269B (en) | 2019-01-15 | 2019-01-15 | Game data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109767269B (en) |
-
2019
- 2019-01-15 CN CN201910037504.3A patent/CN109767269B/en active Active
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130226856A1 (en) * | 2012-02-23 | 2013-08-29 | Palo Alto Research Center Incorporated | Performance-efficient system for predicting user activities based on time-related features |
CN104111920A (en) * | 2013-04-16 | 2014-10-22 | 华为技术有限公司 | Decision-making tree based prediction method and device |
CN104679777A (en) * | 2013-12-02 | 2015-06-03 | 中国银联股份有限公司 | Method and system for detecting fraudulent trading |
CN105930934A (en) * | 2016-04-27 | 2016-09-07 | 北京物思创想科技有限公司 | Prediction model demonstration method and device and prediction model adjustment method and device |
WO2017186048A1 (en) * | 2016-04-27 | 2017-11-02 | 第四范式(北京)技术有限公司 | Method and device for presenting prediction model, and method and device for adjusting prediction model |
CN108229986A (en) * | 2016-12-14 | 2018-06-29 | 腾讯科技(深圳)有限公司 | Feature construction method, information distribution method and device in Information prediction |
CN108665277A (en) * | 2017-03-27 | 2018-10-16 | 阿里巴巴集团控股有限公司 | A kind of information processing method and device |
CN107230133A (en) * | 2017-05-26 | 2017-10-03 | 努比亚技术有限公司 | A kind of data processing method, equipment and computer-readable storage medium |
CN107545360A (en) * | 2017-07-28 | 2018-01-05 | 浙江邦盛科技有限公司 | A kind of air control intelligent rules deriving method and system based on decision tree |
CN107609708A (en) * | 2017-09-25 | 2018-01-19 | 广州赫炎大数据科技有限公司 | A kind of customer loss Forecasting Methodology and system based on mobile phone games shop |
CN108268624A (en) * | 2018-01-10 | 2018-07-10 | 清华大学 | User data method for visualizing and system |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111111193A (en) * | 2019-12-25 | 2020-05-08 | 北京奇艺世纪科技有限公司 | Game control method and device and electronic equipment |
CN111111193B (en) * | 2019-12-25 | 2023-09-22 | 北京奇艺世纪科技有限公司 | Game control method and device and electronic equipment |
CN111632384A (en) * | 2020-05-29 | 2020-09-08 | 网易(杭州)网络有限公司 | Game online number detection method, device, equipment and storage medium |
CN111632384B (en) * | 2020-05-29 | 2023-04-28 | 网易(杭州)网络有限公司 | Game online number detection method, device, equipment and storage medium |
CN111722720A (en) * | 2020-06-22 | 2020-09-29 | 芯盟科技有限公司 | Man-machine interaction method, device and terminal |
CN111803957A (en) * | 2020-07-17 | 2020-10-23 | 网易(杭州)网络有限公司 | Player prediction method and device for online game, computer equipment and medium |
CN111803957B (en) * | 2020-07-17 | 2024-02-09 | 网易(杭州)网络有限公司 | Method, device, computer equipment and medium for predicting players of online games |
CN111861588A (en) * | 2020-08-06 | 2020-10-30 | 网易(杭州)网络有限公司 | Training method of loss prediction model, player loss reason analysis method and player loss reason analysis device |
CN111861588B (en) * | 2020-08-06 | 2023-10-31 | 网易(杭州)网络有限公司 | Training method of loss prediction model, player loss reason analysis method and player loss reason analysis device |
CN113457166A (en) * | 2021-07-20 | 2021-10-01 | 网易(杭州)网络有限公司 | Game player churn information processing method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109767269B (en) | 2022-02-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||