CN109767269A - Method and apparatus for processing game data - Google Patents

Publication number
CN109767269A
Authority
CN
China
Prior art keywords
data
decision
information
user group
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910037504.3A
Other languages
Chinese (zh)
Other versions
CN109767269B (en)
Inventor
陶建容
钟倩
巩琳霞
冯潞潞
沈乔治
范长杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Netease Hangzhou Network Co Ltd
Original Assignee
Netease Hangzhou Network Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Netease Hangzhou Network Co Ltd
Priority to CN201910037504.3A
Publication of CN109767269A
Application granted
Publication of CN109767269B
Legal status: Active
Anticipated expiration

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the invention provides a method for processing game data, comprising: obtaining game log data; dividing game users according to the game log data to obtain at least one piece of user group information; extracting, from the game log data, the feature dimension data corresponding to the user group information; inputting the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file; and extracting the decision path data from the model file. Among multi-dimensional features, a decision tree algorithm preferentially selects the features with the strongest discriminative power and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility.

Description

Method and apparatus for processing game data
Technical field
The present invention relates to the field of game technology, and in particular to a method for processing game data and an apparatus for processing game data.
Background
In a game company, user churn is one of the issues that the production, planning, and operations departments care about most. The number of users and their spending level are important inputs for decisions on a game's development direction, operating strategy, and subsequent promotion budget. For today's mainstream online games, which depend heavily on charging for additional content, retaining an existing user costs roughly 1/5 of what acquiring a new user costs; once the possible loss of high-spending users and the cost of developing a new user into an advanced user are taken into account, the profit gap widens further. Analyzing the causes of user churn, understanding the users' game experience, and proposing targeted improvements can therefore increase user retention, make the game more playable, and raise its commercial value.
Existing methods for analyzing the causes of user churn mainly comprise user surveys and statistics-based numerical analysis. A user survey samples the churned users: some of them are selected at random and surveyed. Surveys take various forms, commonly questionnaires and telephone interviews, and they yield the causes of churn directly. In statistics-based numerical analysis, the operations department statistically analyzes the users' game log data, extracts information such as churn rate, retention rate, online time, and number of tasks performed from the database, and then hypothesizes about and analyzes the causes of churn. Common methods include regression analysis, funnel analysis, and feedback surveys.
However, user surveys are inefficient and labor-intensive, and their findings lack generality. The results of statistics-based numerical analysis are highly subjective, demand considerable relevant experience from the operations staff doing the analysis, and cannot distinguish the relative importance of multiple features; this approach is likewise inefficient and labor-intensive.
Summary of the invention
Embodiments of the present invention provide a method for processing game data and a corresponding apparatus for processing game data.
To solve the above problems, an embodiment of the invention discloses a method for processing game data, comprising:
obtaining game log data;
dividing game users according to the game log data to obtain at least one piece of user group information;
extracting, from the game log data, the feature dimension data corresponding to the user group information;
inputting the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file;
extracting the decision path data from the model file.
Preferably, the method further comprises:
visualizing the trained decision tree model to obtain a model visualization result.
Preferably, the step of dividing game users according to the game log data to obtain at least one piece of user group information comprises:
extracting the cumulative online time within a preset time period from the game log data;
dividing the game users according to the cumulative online time to obtain the user group information.
Preferably, the step of extracting, from the game log data, the feature dimension data corresponding to the user group information comprises:
extracting, from the game log data, the original dimension data corresponding to the user group information;
extracting the feature dimension data from the original dimension data.
Preferably, the step of inputting the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file comprises:
inputting the user group information and the corresponding feature dimension data into the decision tree model to obtain the trained decision tree model;
parsing the trained decision tree model to obtain the model file.
Preferably, before the step of extracting the decision path data from the model file, the method further comprises:
screening the decision path data in the model file to obtain screened decision path data.
Preferably, the step of extracting the decision path data from the model file comprises:
extracting the screened decision path data.
An embodiment of the invention also discloses an apparatus for processing game data, comprising:
a game log data acquisition module, configured to obtain game log data;
a user group information acquisition module, configured to divide game users according to the game log data to obtain at least one piece of user group information;
a feature dimension data extraction module, configured to extract, from the game log data, the feature dimension data corresponding to the user group information;
a training module, configured to input the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file;
a decision path data extraction module, configured to extract the decision path data from the model file.
An embodiment of the invention also discloses an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above game data processing method when executing the program.
An embodiment of the invention also discloses a computer-readable storage medium on which a computer program is stored, wherein the computer program implements the steps of the above game data processing method when executed by a processor.
Embodiments of the present invention have the following advantages:
In the embodiments of the present invention, game log data is obtained; game users are divided according to the game log data to obtain at least one piece of user group information; the feature dimension data corresponding to the user group information is extracted from the game log data; the user group information and the corresponding feature dimension data are input into a decision tree model to obtain a trained decision tree model and a corresponding model file; the decision path data is extracted from the model file; and the decision path data is output. The behaviors that matter most for user churn are identified from the game users' behavior logs and ranked by importance, which reflects the efficiency and scalability of the scheme. Among multi-dimensional features, a decision tree algorithm preferentially selects the features with the strongest discriminative power and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of the steps of embodiment one of a method for processing game data according to an embodiment of the present invention;
Fig. 2 is a flow chart of the steps of embodiment two of a method for processing game data according to an embodiment of the present invention;
Fig. 3 is a structural block diagram of an embodiment of an apparatus for processing game data according to an embodiment of the present invention.
Detailed description
To make the technical problems solved by the embodiments of the present invention, the technical solutions, and the beneficial effects clearer, the embodiments are described in further detail below with reference to the drawings. The specific embodiments described here only explain the invention and do not limit it.
Referring to Fig. 1, a flow chart of the steps of embodiment one of a method for processing game data according to an embodiment of the present invention is shown. The method may include the following steps:
Step 101, obtain game log data;
In a specific implementation, the embodiment may be applied in a terminal, for example a mobile phone, a tablet computer, a personal digital assistant, a wearable device (such as glasses or a watch), or a desktop computer.
In the embodiments of the present invention, the operating system of the mobile terminal may include Android, iOS, Windows Phone, Windows, and so on.
In another preferred embodiment, the embodiment may also be applied in a server. The server may include a PC (Personal Computer) server, a mainframe, or a minicomputer, and may also include a cloud server; the embodiments do not specifically limit the type or number of servers.
Specifically, the game log data may include data about game user behavior generated while the game application runs. The game log data may include various log fields, timestamps, match information, transaction information, game information, and so on; the embodiments place no restriction on this.
In the embodiment, the mobile terminal may obtain the game log data from a game server; that is, the game log data may be stored on one or more game servers, and the mobile terminal may connect to the game server over a network and retrieve the game log data through it.
When the embodiment is applied in a server, the server may be the game server itself; the game server may invoke a preset process to obtain the game log data stored in its memory and then perform the data processing described below. The following description takes a mobile terminal as an example.
Step 102, divide game users according to the game log data to obtain at least one piece of user group information;
In the embodiment, the mobile terminal may divide game users according to the game log data to obtain at least one piece of user group information. Specifically, the mobile terminal may first extract the cumulative online time data from the game log data and then divide the game users according to that data.
It should be noted that the cumulative online time data may be the online time within a preset time period after the user creates a game account; for example, it may be the online time within the 24 hours after the user creates the game account.
For example, when a game user's cumulative online time is not less than a preset duration, the user may be marked as a retained user; otherwise, the user is marked as a churned user. The preset duration may be 5 minutes, 10 minutes, or 30 minutes. Correspondingly, the user group information may be divided into a 5-minute retained user group, a 5-minute churned user group, a 10-minute retained user group, a 10-minute churned user group, a 30-minute retained user group, a 30-minute churned user group, and so on.
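The threshold-based division can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `divide_by_online_time`, the group keys, and the input layout (user ID mapped to cumulative online minutes) are all hypothetical.

```python
from typing import Dict, List

# Thresholds in minutes, taken from the examples in the text.
THRESHOLDS_MIN = (5, 10, 30)


def divide_by_online_time(online_minutes: Dict[str, float]) -> Dict[str, List[str]]:
    """Split users into retained/churned groups for each threshold.

    online_minutes maps a user ID to the cumulative online time (minutes)
    within the preset period after account creation, e.g. the first 24 hours.
    """
    groups: Dict[str, List[str]] = {}
    for t in THRESHOLDS_MIN:
        groups[f"{t}min_retained"] = []
        groups[f"{t}min_churned"] = []
        for user_id, minutes in sorted(online_minutes.items()):
            # "Not less than the threshold" counts as retained.
            status = "retained" if minutes >= t else "churned"
            groups[f"{t}min_{status}"].append(user_id)
    return groups


groups = divide_by_online_time({"u1": 3.0, "u2": 7.5, "u3": 45.0})
print(groups["5min_churned"])    # ['u1']
print(groups["10min_retained"])  # ['u3']
```

Each user appears once per threshold, so the same user can be retained for the 5-minute problem and churned for the 30-minute problem.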
Further, the mobile terminal may divide users according to whether a game user logs in on the day after creating the game account: if there is a login on the next day, the user is assigned to the next-day retained user group; otherwise, the user is assigned to the next-day churned user group.
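A minimal sketch of the next-day rule is given below. It assumes that "next day" means the calendar day following account creation and that login timestamps are already available as `datetime` objects; the function name and labels are hypothetical.

```python
from datetime import date, datetime, timedelta
from typing import Iterable


def next_day_label(created_at: datetime, logins: Iterable[datetime]) -> str:
    """Return 'next_day_retained' if any login falls on the calendar day
    after account creation, otherwise 'next_day_churned'."""
    next_day: date = created_at.date() + timedelta(days=1)
    if any(ts.date() == next_day for ts in logins):
        return "next_day_retained"
    return "next_day_churned"


created = datetime(2019, 1, 15, 20, 0)
print(next_day_label(created, [datetime(2019, 1, 16, 9, 30)]))
# next_day_retained
print(next_day_label(created, [datetime(2019, 1, 15, 21, 0),
                               datetime(2019, 1, 17, 8, 0)]))
# next_day_churned
```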
In a preferred embodiment, the game log data may also include the number of items purchased, the amount spent on items, the time taken to raise the game user's level, and so on; the mobile terminal may divide game users into different user groups according to these data.
In another specific example, the number of item types purchased recorded in the game log data may be 0, not less than 2, not less than 5, not less than 8, not less than 10, not less than 15, and so on; game users are divided into user groups according to the number of items purchased, yielding different retained user groups and churned user groups.
In another specific example, the time taken to raise the game user's level recorded in the game log data may be not less than 1 hour, not less than 6 hours, not less than 12 hours, not less than 24 hours, not less than 48 hours, not less than 96 hours, and so on; game users are divided into user groups according to this time, yielding different retained user groups and churned user groups.
The above divisions are only a few examples. Game users may also be divided into user groups according to other data in the game log data to obtain different retained and churned user groups; the embodiments place no restriction on this.
Step 103, extract, from the game log data, the feature dimension data corresponding to the user group information;
In the embodiment, the mobile terminal may extract the feature dimension data corresponding to the user group information from the game log data. It should be noted that the game log data may include multiple pieces of original dimension data, and the feature dimension data is extracted from the original dimension data. For example, the feature dimension data may include the number of logins, online time, total amount of virtual items, and so on; the embodiments place no restriction on this.
It should be noted that each piece of user group information may include the identifiers (IDs) of multiple game users. The game user IDs are associated with the game log data, which may include multiple pieces of original dimension data. The original dimension data consists of statistics or time-based features of each game user, and the feature dimension data consists of those statistics or time-based features that remain after screening.
Because the game user IDs are associated with the game log data, the user group information has a mapping relationship with the original dimension data and therefore also with the feature dimension data.
For example, the user group information may be the 5-minute retained user group, and the feature dimension data corresponding to that group may include the number of daily tasks completed, the highest experience value on the day, the duration of the newcomer tutorial, and so on.
The embodiments do not specifically limit the types of feature dimension data in the game log data.
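The mapping from a user group to its feature dimension data through the user IDs can be illustrated with a simple lookup. This is a sketch under assumptions: the feature names (`daily_tasks_done`, `tutorial_minutes`) and the function `features_for_group` are hypothetical, and the per-user features are assumed to have been computed from the logs already.

```python
from typing import Dict, List

FeatureRow = Dict[str, float]


def features_for_group(
    group_user_ids: List[str],
    features_by_user: Dict[str, FeatureRow],
) -> Dict[str, FeatureRow]:
    """Look up the feature row of every user in a group via the user ID."""
    # Users without any log-derived features are skipped rather than guessed.
    return {
        uid: features_by_user[uid]
        for uid in group_user_ids
        if uid in features_by_user
    }


features_by_user = {
    "u1": {"daily_tasks_done": 0.0, "tutorial_minutes": 2.5},
    "u2": {"daily_tasks_done": 3.0, "tutorial_minutes": 6.0},
}
retained_5min = ["u2", "u9"]  # "u9" has no log-derived features
group_features = features_for_group(retained_5min, features_by_user)
print(group_features)
# {'u2': {'daily_tasks_done': 3.0, 'tutorial_minutes': 6.0}}
```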
Step 104, input the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model and a corresponding model file;
In the embodiment, after obtaining the user group information and the corresponding feature dimension data, the mobile terminal may input them into a decision tree model to obtain a trained decision tree model and a corresponding model file.
It should be noted that the decision tree model may be an ID3 (Iterative Dichotomiser 3) model, a C4.5 model, a CART (Classification and Regression Tree) model, and so on; the embodiments place no restriction on this.
The user group information and the corresponding feature dimension data are used as samples to train the decision tree model, yielding the trained decision tree model.
For example, the mobile terminal may input the 5-minute retained user group, the 5-minute churned user group, the 10-minute retained user group, the 10-minute churned user group, the 30-minute retained user group, and the 30-minute churned user group, together with their corresponding feature dimension data, into the CART model to obtain a trained CART model.
After the trained decision tree model is obtained, it is parsed and stored as a file in a specific format, which serves as the model file; for example, the file may be in JSON format.
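The text does not specify the layout of the JSON model file. The sketch below assumes one possible layout: each node stores the churned and retained user counts that reach it, and internal nodes additionally store a feature, a threshold, and two children. The `Node` class and all field names are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    churned: int                      # churned users reaching this node
    retained: int                     # retained users reaching this node
    feature: Optional[str] = None     # None for a leaf
    threshold: Optional[float] = None
    left: Optional["Node"] = None     # branch where feature <= threshold
    right: Optional["Node"] = None    # branch where feature > threshold


def node_to_dict(node: Node) -> dict:
    """Recursively convert a tree node into a JSON-serializable dict."""
    out = {"churned": node.churned, "retained": node.retained}
    if node.feature is not None:
        out["feature"] = node.feature
        out["threshold"] = node.threshold
        out["left"] = node_to_dict(node.left)
        out["right"] = node_to_dict(node.right)
    return out


# A tiny hand-built tree standing in for a trained model.
tree = Node(60, 40, "daily_tasks_done", 2.0,
            left=Node(50, 5), right=Node(10, 35))
model_json = json.dumps(node_to_dict(tree), indent=2)
restored = json.loads(model_json)
print(restored["left"])  # {'churned': 50, 'retained': 5}
```

Storing the per-node class counts in the file is a design choice of this sketch; it lets later steps compute churn shares along a path without access to the training data.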
Step 105, extract the decision path data from the model file.
In the embodiment, the mobile terminal may extract the decision path data from the model file. Specifically, the model file may be parsed to obtain a large amount of decision path data, and screening conditions may be set to screen or prune that data.
It should be noted that each decision path contains multiple nodes, and each node represents one decision rule; that is, each decision path is composed of multiple decision rules.
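Enumerating decision paths from a parsed model file amounts to a depth-first walk from the root to every leaf, collecting one rule per node. The sketch below assumes a hypothetical JSON schema (`feature`, `threshold`, `left`, `right`, plus `churned` and `retained` counts); the patent does not define the schema.

```python
from typing import Dict, List, Optional


def extract_paths(node: Dict, rules: Optional[List[str]] = None) -> List[Dict]:
    """Return one record per root-to-leaf path with its list of rules."""
    rules = rules or []
    if "feature" not in node:  # leaf: close the path
        return [{"rules": rules,
                 "churned": node["churned"],
                 "retained": node["retained"]}]
    feat, thr = node["feature"], node["threshold"]
    return (extract_paths(node["left"], rules + [f"{feat} <= {thr}"])
            + extract_paths(node["right"], rules + [f"{feat} > {thr}"]))


model = {
    "feature": "daily_tasks_done", "threshold": 2.0,
    "churned": 60, "retained": 40,
    "left": {"churned": 50, "retained": 5},
    "right": {
        "feature": "tutorial_minutes", "threshold": 3.5,
        "churned": 10, "retained": 35,
        "left": {"churned": 8, "retained": 5},
        "right": {"churned": 2, "retained": 30},
    },
}
paths = extract_paths(model)
for p in paths:
    print(" AND ".join(p["rules"]), p["churned"], p["retained"])
```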
Preferably, the method may further include: outputting the decision path data.
In the embodiment, after obtaining the decision path data, the mobile terminal may output it as a file in any of several formats; for example, the decision path data may be output as a table. The embodiments place no restriction on the output format of the decision path data.
In the embodiments of the present invention, game log data is obtained; game users are divided according to the game log data to obtain at least one piece of user group information; the feature dimension data corresponding to the user group information is extracted from the game log data; the user group information and the corresponding feature dimension data are input into a decision tree model to obtain a trained decision tree model and a corresponding model file; the decision path data is extracted from the model file; and the decision path data is output. The behaviors that matter most for user churn are identified from the game users' behavior logs and ranked by importance, which reflects the efficiency and scalability of the scheme. Among multi-dimensional features, a decision tree algorithm preferentially selects the features with the strongest discriminative power and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility.
Referring to Fig. 2, a flow chart of the steps of embodiment two of a method for processing game data according to an embodiment of the present invention is shown. The method may include the following steps:
Step 201, obtain game log data;
The embodiment may be applied in a mobile terminal or a server. The game log data may include various log fields, timestamps, match information, transaction information, game information, and so on. The mobile terminal may obtain the game log data from a game server; that is, the game log data may be stored on the game server, and the mobile terminal may retrieve it over a network.
When the embodiment is applied in a server, the server may be the game server itself; the game server may invoke a preset process to obtain the game log data stored in its memory. The following description takes a mobile terminal as an example.
Step 202, extract the cumulative online time within a preset time period from the game log data;
Specifically, the mobile terminal may identify the cumulative online time within the preset time period in the game log data. For example, the preset time period may be 24 hours or 48 hours, and the game user's cumulative online time within those 24 or 48 hours is calculated.
Step 203, divide game users according to the cumulative online time to obtain the user group information;
Further, the mobile terminal may divide game users according to the cumulative online time to obtain the user group information;
For example, the cumulative online time may be not less than 5 minutes, not less than 10 minutes, not less than 30 minutes, and so on; the user group information may then be divided into a 5-minute retained user group, a 5-minute churned user group, a 10-minute retained user group, a 10-minute churned user group, a 30-minute retained user group, and a 30-minute churned user group.
Step 204, extract, from the game log data, the original dimension data corresponding to the user group information;
In the embodiment, the user group information has a mapping relationship with the original dimension data, and the mobile terminal may use that mapping to extract the original dimension data corresponding to the user group information from the game log data.
Step 205, extract the feature dimension data from the original dimension data;
In the embodiment, the mobile terminal may extract the feature dimension data from the original dimension data. Specifically, the mobile terminal receives preset thresholds and extracts the feature dimension data from the original dimension data according to those thresholds.
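The text does not say what quantity the preset threshold applies to. As one possible reading, the sketch below keeps only the dimensions whose variance across users reaches a minimum value, since a dimension that barely varies cannot separate churned users from retained ones. The variance criterion, the function name `select_features`, and the feature names are assumptions made for illustration.

```python
from statistics import pvariance
from typing import Dict, List


def select_features(rows: List[Dict[str, float]], min_variance: float) -> List[str]:
    """Keep the dimensions whose population variance is >= min_variance."""
    names = sorted(rows[0])
    return [
        name for name in names
        if pvariance([row[name] for row in rows]) >= min_variance
    ]


rows = [
    {"login_count": 1.0, "online_minutes": 3.0, "server_id": 1.0},
    {"login_count": 5.0, "online_minutes": 30.0, "server_id": 1.0},
    {"login_count": 9.0, "online_minutes": 60.0, "server_id": 1.0},
]
selected = select_features(rows, min_variance=0.5)
print(selected)  # ['login_count', 'online_minutes']
```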
Step 206, input the user group information and the corresponding feature dimension data into a decision tree model to obtain a trained decision tree model;
It should be noted that the decision tree model may be an ID3 (Iterative Dichotomiser 3) model, a C4.5 model, a CART (Classification and Regression Tree) model, and so on; the embodiments place no restriction on this.
The user group information and the corresponding feature dimension data are used as samples to train the decision tree model, yielding the trained decision tree model.
Step 207, parse the trained decision tree model to obtain the model file;
In a specific example, after the trained decision tree model is obtained, it is parsed and stored as a file in a specific format, which serves as the model file; for example, the file may be in JSON format.
In a preferred embodiment, the method further includes: visualizing the trained decision tree model to obtain a model visualization result.
Step 208, screen the decision path data in the model file to obtain screened decision path data;
Specifically, the mobile terminal may set screening conditions and screen the large amount of decision path data against them. For example, a screening condition may be that the share of churned users or the share of retained users at a node reaches a first threshold (such as 0.85), or that the share of churned users at a node rises steeply relative to the preceding node and the total number of users at the node is greater than a second threshold.
It should be noted that the above screening conditions are only examples; the embodiments do not specifically limit the screening conditions.
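The two example screening conditions can be sketched as a predicate over the nodes of one path. The first threshold of 0.85 comes from the text; the size of a "steep" rise (`min_jump`), the second threshold value, and the node layout (`churned` and `retained` counts per node) are assumptions of this sketch.

```python
from typing import Dict, List


def churn_share(node: Dict[str, int]) -> float:
    """Share of churned users among all users reaching the node."""
    total = node["churned"] + node["retained"]
    return node["churned"] / total if total else 0.0


def keep_path(nodes: List[Dict[str, int]],
              first_threshold: float = 0.85,
              min_jump: float = 0.2,
              second_threshold: int = 20) -> bool:
    """nodes holds the counts of each node on a path, from root to leaf."""
    for i, node in enumerate(nodes):
        total = node["churned"] + node["retained"]
        if total == 0:
            continue  # an empty node carries no evidence
        share = churn_share(node)
        # Condition 1: churned or retained share reaches the first threshold.
        if share >= first_threshold or (1.0 - share) >= first_threshold:
            return True
        # Condition 2: steep rise over the preceding node, with enough users.
        if i > 0:
            jump = share - churn_share(nodes[i - 1])
            if jump >= min_jump and total > second_threshold:
                return True
    return False


path_a = [{"churned": 60, "retained": 40}, {"churned": 50, "retained": 5}]
path_b = [{"churned": 60, "retained": 40}, {"churned": 10, "retained": 35},
          {"churned": 8, "retained": 5}]
path_c = [{"churned": 50, "retained": 50}, {"churned": 40, "retained": 10}]
print(keep_path(path_a), keep_path(path_b), keep_path(path_c))
# True False True
```

Here `path_b` is rejected even though its last node shows a sharp rise in churn share, because only 13 users reach that node.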
In a preferred embodiment, a pruning operation may also be performed on the decision path data.
Step 209, output the decision path data.
In the embodiment, after obtaining the decision path data, the mobile terminal may output it as a file in any of several formats; for example, the decision path data may be output as a table in xls or xlsx format.
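Writing xls or xlsx files requires a third-party library, so the sketch below uses CSV from the Python standard library as a stand-in tabular format that spreadsheet tools can open. The column names and the path record layout (`rules`, `churned`, `retained`) are hypothetical.

```python
import csv
import io
from typing import Dict, List


def paths_to_csv(paths: List[Dict]) -> str:
    """Render decision paths as CSV text, one row per path."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["path", "churned", "retained", "churn_share"])
    for p in paths:
        total = p["churned"] + p["retained"]
        share = p["churned"] / total if total else 0.0
        writer.writerow([" AND ".join(p["rules"]),
                         p["churned"], p["retained"], f"{share:.2f}"])
    return buf.getvalue()


table = paths_to_csv([
    {"rules": ["daily_tasks_done <= 2.0"], "churned": 50, "retained": 5},
])
print(table)
# path,churned,retained,churn_share
# daily_tasks_done <= 2.0,50,5,0.91
```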
In the embodiments of the present invention, game log data is obtained; the cumulative online time within a preset time period is extracted from the game log data; game users are divided according to the cumulative online time to obtain the user group information; the original dimension data corresponding to the user group information is extracted from the game log data; the feature dimension data is extracted from the original dimension data; the user group information and the corresponding feature dimension data are input into a decision tree model to obtain a trained decision tree model; the trained decision tree model is parsed to obtain the model file; the decision path data in the model file is screened to obtain screened decision path data; and the decision path data is output. The behaviors that matter most for user churn are identified from the game users' behavior logs and ranked by importance, which reflects the efficiency and scalability of the scheme. Among multi-dimensional features, a decision tree algorithm preferentially selects the features with the strongest discriminative power and places them near the root node, which helps researchers analyze the relative importance of churn causes. Compared with existing methods, this greatly reduces labor cost, improves efficiency, and increases credibility. Timely understanding and analysis of game user behavior, and the churn causes derived from it, let game developers learn of shortcomings in the game promptly and let the development team adjust the game's content in time, increasing the game's appeal to users and its commercial value.
To help those skilled in the art better understand the embodiments of the present invention, a specific example is described below.
Step 1: data acquisition
Extract each game user's game log data from the time the game account was created. The game log data includes various log fields, timestamps, and related detailed log information, which facilitates further data processing.
Step 2: group division
Taking different churn problems as examples, users can be divided into a churned group and a retained group for each problem. For the 5-minute churn problem, for instance, the cumulative online time within the 24 hours after account creation is computed from the user's log; if it is not less than 5 minutes, the user is marked as a retained user, and otherwise as a churned user. The 10-minute churned user group, 10-minute retained user group, 30-minute churned user group, and 30-minute retained user group are obtained in the same way. For the next-day churn problem, if a newly created user logs in on the day after creating the game account, the user is placed in the next-day retained group; otherwise, the user is placed in the next-day churned group.
Step 3: feature engineering
For the 5-minute, 10-minute, and 30-minute churn problems, feature engineering can be performed on all game log data of the corresponding churned user group within the 24 hours after account creation, and on the game log data of the retained users within their first cumulative 5, 10, and 30 minutes, respectively; that is, feature dimension data is extracted from the original dimension data in the game log data. For the next-day churn problem, all game log data of the churned and retained user groups on the day the game account was created is extracted. From the game log data obtained, the feature dimension data is extracted; it may include statistical and time-based features of the user's match information, transaction information, game information, and so on, 60 dimensions in total, as listed in Table 1.
Table 1: feature dimension data of the embodiment of the present invention
Step 4: Churn prediction
To analyze the causes of churn, churn prediction must be performed first: a CART model analyzes the information carried in the user's game log data and predicts whether the user will churn. CART is a decision tree algorithm, consisting of feature selection, tree generation, and pruning, that can be used for both classification and regression. When solving a classification problem, the algorithm selects the optimal feature by computing the Gini index (GINI value), determines the optimal binary split point of that feature, and recursively bisects on each feature, dividing the feature space into a finite number of cells and determining the predicted probability distribution on those cells, i.e., the conditional probability distribution of the output given the input. The Gini index reflects the degree of disorder of a data set: the larger the Gini index, the more disordered the current data and the more impure the node, so CART chooses the split that makes the GINI value of the child nodes smallest. When the algorithm is applied to a regression problem, the feature selection criterion is slightly modified to minimize the sum of the mean squared deviations of the samples after the split.
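For two classes the Gini index of a node is 1 - p^2 - (1 - p)^2, where p is the share of positive samples. The sketch below shows how a single binary split point could be chosen for one numeric feature by minimizing the size-weighted Gini index of the two child nodes. It is an illustration of the general CART criterion with assumed function names, not code from the patent.

```python
def gini(labels):
    """Gini index of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 1.0 - p1 ** 2 - (1.0 - p1) ** 2


def best_binary_split(values, labels):
    """Return (threshold, weighted_gini) for the best split of one numeric feature.

    If no split improves on the unsplit node, the threshold is None.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best
```

A full CART implementation repeats this search over all features at every node and recurses on the two children until a stopping rule such as a maximum depth is reached.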
The data set, balanced between positive and negative samples, is split into a training set and a test set. The decision-tree model described above is fitted on the training set, with precision, recall, and F1 score as the evaluation metrics, which completes the training of the churn model. That is, the user group information and the corresponding feature dimension data are input to the decision-tree model to complete its training.
Step 5: Model visualization
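The three evaluation metrics can be computed from the test-set predictions as in the sketch below, where precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 is their harmonic mean. The function name and the convention that label 1 marks the positive class are assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```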
Binary classification with a decision-tree model in effect decomposes the prediction process into decisions on multiple sub-problems according to the features or attributes of the data, carrying out a "divide and conquer" inference task. The rule-partitioning mechanism of the whole decision-tree model is fairly large. To make it easier to analyze, understand, and deliver, after the churn prediction based on the decision-tree model is complete, the internal structure of the decision-tree model is visualized to obtain a model visualization result, so that the decision process of the model during prediction can be analyzed.
Step 6: Churn cause generation
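The source does not say which visualization tool is used. As a stand-in, the sketch below renders a tree as indented text; the nested-dict layout with `condition`, `retained`, `churned`, and `children` keys is a hypothetical representation chosen for this example.

```python
def render_tree(node, depth=0):
    """Return the tree as a list of indented text lines, one per node."""
    label = node.get("condition", "leaf")
    line = f"{'  ' * depth}{label} [retained={node['retained']}, churned={node['churned']}]"
    lines = [line]
    for child in node.get("children", []):
        lines.extend(render_tree(child, depth + 1))
    return lines
```

Printing `"\n".join(render_tree(tree))` shows each split condition with the sample counts that reach it, which is the information needed to follow the decision process by eye.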
After the decision-tree model is obtained, it is parsed via its leaf nodes to obtain each decision path from the root node to a leaf node, together with the related decision information. However, as the depth of the decision-tree model increases, the number of paths grows exponentially and the whole decision tree becomes large and complex, which hinders further analysis. The decision-tree model therefore needs to be parsed with screening conditions applied to the decision paths, controlling the number of decision paths that are output so as to extract effective and practical classification feature information.
After the churn prediction model is trained, the maximum depth of the decision-tree model needs to be controlled, and the model file is saved in a .json structure. The model file stores the information of each node in the decision-tree model, such as the node number, the numbers of positive and negative samples, and the Gini index. A set of data structures is used to hold the information of each node and of its parent and child nodes. The text information in the model file is extracted and saved, restoring the decision-tree model from its text form and thereby completing the tasks of reading and parsing the decision-tree model.
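Reading such a model file and enumerating its decision paths might look like the sketch below. The JSON schema (`id`, `condition`, `retained`, `churned`, `children`) is invented for illustration; the patent does not specify the actual layout of the .json file.

```python
import json

# Hypothetical model file content; the real schema is not given in the source.
MODEL_JSON = """
{"id": 0, "condition": "level <= 3", "retained": 50, "churned": 50, "children": [
  {"id": 1, "retained": 10, "churned": 40, "children": []},
  {"id": 2, "condition": "matches <= 1", "retained": 40, "churned": 10, "children": [
    {"id": 3, "retained": 5, "churned": 8, "children": []},
    {"id": 4, "retained": 35, "churned": 2, "children": []}
  ]}
]}
"""


def root_to_leaf_paths(node, prefix=()):
    """Yield every root-to-leaf path as a tuple of node dicts."""
    path = prefix + (node,)
    children = node.get("children", [])
    if not children:
        yield path
        return
    for child in children:
        yield from root_to_leaf_paths(child, path)


tree = json.loads(MODEL_JSON)
path_ids = [[n["id"] for n in path] for path in root_to_leaf_paths(tree)]
```

With a binary tree of depth d there can be up to 2^d such paths, which is why the maximum depth is limited before the model is saved.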
After the decision-tree model has been parsed, the decision path data is large in volume and the rule system is complex. To address this, the decision paths can be screened and pruned using one or more preset screening conditions. Screening conditions that are currently concise and effective include:
1. The share of churned users or the share of retained users at a node reaches a first threshold (e.g., 0.85). The sequence of decision conditions from the root node up to and including this node is extracted and recorded as a decision path. This indicates that, starting from an initial share of 0.5 and passing through a series of binary choices to reach this node, the user group can be well separated into churned users and retained users.
2. The share of churned users at a node rises sharply compared with the preceding (parent) node, and the total number of users at the node is greater than a second threshold. The sequence of decision conditions from the root node up to and including this node is extracted and recorded as a decision path. This indicates that the group satisfying these conditions is classified well at this node, i.e., the node has a large effect on the churn or retention of this kind of user.
3. The churned sample share at a node is greater than its retained sample share, while at the preceding (parent) node the retained sample share is greater than the churned sample share. The sequence of decision conditions from the root node up to and including this node is extracted and recorded as a decision path. This indicates that the decision condition of this node has a large effect on the churn and retention of the user group satisfying these conditions.
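The three screening conditions can be expressed as predicates over a node and its parent, as sketched below. The node layout (`churned` and `retained` counts, churned meaning lost) and the default values of `min_jump` and `second_threshold` are assumptions; the source only gives 0.85 as an example of the first threshold and does not quantify a "sharp" rise.

```python
def shares(node):
    """Return (churned share, retained share) of a node."""
    total = node["churned"] + node["retained"]
    if total == 0:
        return 0.0, 0.0
    return node["churned"] / total, node["retained"] / total


def condition_1(node, first_threshold=0.85):
    """Either class share reaches the first threshold."""
    churn, keep = shares(node)
    return max(churn, keep) >= first_threshold


def condition_2(parent, node, min_jump=0.2, second_threshold=100):
    """Churned share jumps versus the parent and the node is large enough."""
    total = node["churned"] + node["retained"]
    jump = shares(node)[0] - shares(parent)[0]
    return jump >= min_jump and total > second_threshold


def condition_3(parent, node):
    """Majority flips from retained at the parent to churned at the node."""
    churn, keep = shares(node)
    parent_churn, parent_keep = shares(parent)
    return churn > keep and parent_keep > parent_churn


def screened_prefixes(path, first_threshold=0.85, min_jump=0.2, second_threshold=100):
    """Return each root-to-node prefix of `path` whose last node meets any condition."""
    kept = []
    for i in range(1, len(path)):
        parent, node = path[i - 1], path[i]
        if (condition_1(node, first_threshold)
                or condition_2(parent, node, min_jump, second_threshold)
                or condition_3(parent, node)):
            kept.append(path[: i + 1])
    return kept
```

Each kept prefix corresponds to one decision path in the sense used above: the chain of decision conditions from the root up to and including the qualifying node.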
After the decision-tree model has been pruned by the screening conditions, a new decision-tree model is reconstructed. Starting from the root node r of the new model, a new set of leaf nodes is found and denoted L = {l_1, l_2, ..., l_i, ..., l_n}. Backtracking from each leaf node l_i to the root node r yields a path from l_i to r. These paths are saved, and the nodes in each path are ordered from shallow to deep according to node depth, finally yielding a set of decision paths W = {w_1, w_2, ..., w_i, ..., w_n}, which is the decision path data of the pruned decision tree.
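The backtracking step can be sketched with a parent-pointer table. The `parent_of` mapping (node id to parent id, with the root mapped to `None`) is a hypothetical representation of the reconstructed tree.

```python
def paths_from_leaves(parent_of):
    """Backtrack from every leaf to the root and return root-first paths."""
    internal = {p for p in parent_of.values() if p is not None}
    leaves = sorted(n for n in parent_of if n not in internal)
    paths = []
    for leaf in leaves:
        path = []
        node = leaf
        while node is not None:
            path.append(node)
            node = parent_of[node]
        path.reverse()  # order nodes from shallow to deep
        paths.append(path)
    return paths
```

The list returned here plays the role of W: one entry per leaf of the pruned tree, each ordered from the root down to that leaf.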
The reconstructed model information, including the decision rule represented by each node and the number of users at each node, is output to .xlsx files. Each file contains multiple decision paths, and the form in which each path is output in the .xlsx file is shown in Table 2.
Table 2: Example of decision path data in the embodiment of the present invention
Table 2 shows the decision rules of one decision path in the decision-tree model. Each row corresponds to one decision rule, i.e., one node of the decision-tree model other than a leaf node. The first column of each row is the decision condition, followed in order by the retained count, the churned count, the retained share, the churned share, the overall retained sample share, the overall churned sample share, the full-sample retained share for this feature, and the full-sample churned share for this feature. Based on the automatic computation and comparative analysis of these screened and simplified split conditions and statistics, churn causes can be generated automatically from the decision tree, greatly simplifying the workflow and reducing the effort of manual analysis, which demonstrates efficiency and practicality.
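A possible way to build such rows is sketched below. The column names and their exact definitions are assumptions, since Table 2 itself is not reproduced in the text: within-node shares divide by the node's own user count, and the overall shares divide by the total retained or churned sample counts. The two per-feature full-sample columns are omitted because their definition is unclear from the text. CSV output from the standard library is used as a stand-in for the .xlsx format.

```python
import csv
import io


def path_rows(path, total_retained, total_churned):
    """Build one table row per node on a decision path."""
    rows = []
    for node in path:
        n = node["retained"] + node["churned"]
        rows.append({
            "condition": node["condition"],
            "retained": node["retained"],
            "churned": node["churned"],
            "retained_share": node["retained"] / n if n else 0.0,
            "churned_share": node["churned"] / n if n else 0.0,
            "retained_of_all": node["retained"] / total_retained if total_retained else 0.0,
            "churned_of_all": node["churned"] / total_churned if total_churned else 0.0,
        })
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text (a stand-in for the .xlsx output)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```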
It should be noted that, for simplicity of description, the method embodiments are described as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 3, a structural block diagram of an embodiment of a game data processing device of the present invention is shown, which may specifically include the following modules:
Game log data acquisition module 301, configured to obtain game log data;
User group information acquisition module 302, configured to divide game users according to the game log data to obtain at least one piece of user group information;
Feature dimension data extraction module 303, configured to extract, from the game log data, the feature dimension data corresponding to the user group information;
Training module 304, configured to input the user group information and the corresponding feature dimension data into a decision-tree model to obtain a trained decision-tree model and a corresponding model file;
Decision path data extraction module 305, configured to extract the decision path data in the model file.
Preferably, the device further includes:
Model visualization result acquisition module, configured to visualize the trained decision-tree model to obtain a model visualization result.
Preferably, the user group information acquisition module includes:
Accumulated online duration extraction sub-module, configured to extract the accumulated online duration within a preset time period from the game log data;
User group information acquisition sub-module, configured to divide game users according to the accumulated online duration to obtain user group information.
Preferably, the feature dimension data extraction module includes:
Original dimension data extraction sub-module, configured to extract, from the game log data, the original dimension data corresponding to the user group information;
Feature dimension data extraction sub-module, configured to extract the feature dimension data from the original dimension data.
Preferably, the training module includes:
Decision-tree model acquisition sub-module, configured to input the user group information and the corresponding feature dimension data into a decision-tree model to obtain a trained decision-tree model;
Model file acquisition sub-module, configured to parse the trained decision-tree model to obtain the model file.
Preferably, the device further includes a module connected to the decision path data extraction module:
Acquisition module, configured to screen the decision path data in the model file to obtain screened decision path data.
Preferably, the decision path data extraction module includes:
Extraction sub-module, configured to extract the screened decision path data.
An embodiment of the present invention also discloses an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the game data processing method when executing the program.
An embodiment of the present invention also discloses a computer-readable storage medium on which a computer program is stored, wherein the computer program implements the steps of the game data processing method when executed by a processor.
Since the device embodiments are basically similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a device, or a computer program product. Accordingly, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, which implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operation steps is executed on the computer or other programmable terminal device to produce computer-implemented processing. The instructions executed on the computer or other programmable terminal device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", and any other variants are intended to cover non-exclusive inclusion, so that a process, method, article, or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or terminal device. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or terminal device that includes the element.
The game data processing method and the game data processing device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, the specific implementation and the scope of application may change according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (10)

1. A game data processing method, characterized by comprising:
obtaining game log data;
dividing game users according to the game log data to obtain at least one piece of user group information;
extracting, from the game log data, the feature dimension data corresponding to the user group information;
inputting the user group information and the corresponding feature dimension data into a decision-tree model to obtain a trained decision-tree model and a corresponding model file;
extracting the decision path data in the model file.
2. The method according to claim 1, characterized by further comprising:
visualizing the trained decision-tree model to obtain a model visualization result.
3. The method according to claim 1 or 2, characterized in that the step of dividing game users according to the game log data to obtain at least one piece of user group information comprises:
extracting the accumulated online duration within a preset time period from the game log data;
dividing game users according to the accumulated online duration to obtain user group information.
4. The method according to claim 1 or 2, characterized in that the step of extracting, from the game log data, the feature dimension data corresponding to the user group information comprises:
extracting, from the game log data, the original dimension data corresponding to the user group information;
extracting the feature dimension data from the original dimension data.
5. The method according to claim 1 or 2, characterized in that the step of inputting the user group information and the corresponding feature dimension data into a decision-tree model to obtain a trained decision-tree model and a corresponding model file comprises:
inputting the user group information and the corresponding feature dimension data into a decision-tree model to obtain a trained decision-tree model;
parsing the trained decision-tree model to obtain the model file.
6. The method according to claim 5, characterized in that, before the step of extracting the decision path data in the model file, the method further comprises:
screening the decision path data in the model file to obtain screened decision path data.
7. The method according to claim 6, characterized in that the step of extracting the decision path data in the model file comprises:
extracting the screened decision path data.
8. A game data processing device, characterized by comprising:
a game log data acquisition module, configured to obtain game log data;
a user group information acquisition module, configured to divide game users according to the game log data to obtain at least one piece of user group information;
a feature dimension data extraction module, configured to extract, from the game log data, the feature dimension data corresponding to the user group information;
a training module, configured to input the user group information and the corresponding feature dimension data into a decision-tree model to obtain a trained decision-tree model and a corresponding model file;
a decision path data extraction module, configured to extract the decision path data in the model file.
9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the game data processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the game data processing method according to any one of claims 1 to 7.
CN201910037504.3A 2019-01-15 2019-01-15 Game data processing method and device Active CN109767269B (en)

Publications (2)

Publication Number Publication Date
CN109767269A true CN109767269A (en) 2019-05-17
CN109767269B CN109767269B (en) 2022-02-22






Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant