CN107423442A - Application recommendation method and system based on user-profile behavior analysis, storage medium, and computer device - Google Patents

Application recommendation method and system based on user-profile behavior analysis, storage medium, and computer device

Info

Publication number
CN107423442A
Authority
CN
China
Prior art keywords
user
data
recommended models
basic
characteristic vector
Prior art date
Application number
CN201710666989.3A
Other languages
Chinese (zh)
Inventor
刘冶
李宏浩
桂进军
傅自豪
彭楠
印鉴
Original Assignee
火烈鸟网络(广州)股份有限公司
中山大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 火烈鸟网络(广州)股份有限公司, 中山大学 filed Critical 火烈鸟网络(广州)股份有限公司
Priority to CN201710666989.3A priority Critical patent/CN107423442A/en
Publication of CN107423442A publication Critical patent/CN107423442A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The present invention provides a game recommendation method and system based on user-profile behavior analysis. A feature collector is built to process user profile data, application list data, and client-reported data into standardized feature vectors that meet the requirements of mathematical modeling. Multiple base recommendation models make predictions, generating a preliminary application recommendation list for each user together with the corresponding download probabilities. A fusion model is trained on the download probabilities and the actual labels, and generates the final application recommendation list. A user profile data warehouse is built through multi-dimensional analysis of users' historical behavior logs and feature extraction from them. Among the base recommendation models, a long short-term memory network is introduced to learn the temporal relationships in user behavior, which better characterizes a user's degree of preference for an item, so that the recommended game applications closely match the user's needs. Ensemble learning is added for model fusion, integrating the learning results of the individual models and improving the stability and generalization ability of the recommendation algorithm.

Description

Application recommendation method and system based on user-profile behavior analysis, storage medium, and computer device

Technical field

The invention belongs to the technical field of network information, and in particular relates to an application recommendation method based on user-profile behavior analysis, an application recommendation system based on user-profile behavior analysis, a computer-readable storage medium, and a computer device.

Background

In recent years, with the rapid development of the mobile Internet industry, the amount of information carried by the Internet has grown explosively. Mobile Internet information carriers of all kinds offer users diverse ways to obtain content, but they also bring the problem of information overload: users have shifted from actively searching for Internet content to passively receiving large volumes of subscriptions and push notifications, and the cost of obtaining information has risen accordingly. In specific business scenarios, analyzing users' behavioral habits and providing content and services that match their preferences has become a core requirement. Recommendation methods and systems for the Internet field emerged to meet this need.

At present, the number of applications on each of the two major mobile platforms, iOS and Android, exceeds one million. This huge volume of applications has made the application market prosper, and game applications account for most of the market's revenue. While the application market offers users a rich choice of applications, it also makes choosing among them difficult. Providing users with an application recommendation service that meets their individual needs can therefore both improve the user experience and increase platform revenue.

The application recommendation methods in wide use today mainly include collaborative filtering, content-based recommendation, and latent factor model recommendation. Collaborative filtering is used mainly in e-commerce. It typically uses a nearest-neighbor technique: the distance between users is computed from their historical preference information, the target user's degree of preference for a particular item is predicted from the weighted ratings that the target user's nearest neighbors gave the item, and the system makes recommendations to the target user according to this predicted preference. However, this method applies only to users who have historical records. In particular, a new user has no historical operation data, which makes it difficult to recommend products to that user. Content-based recommendation describes users and items through their relevant feature attributes; it learns user interests from user and item features and recommends according to how well an item matches the user's interests. However, it requires meaningful features to be extracted manually, and those features must capture the user's degree of preference for items as fully as possible. The core idea of latent factor model recommendation is to connect user interests and items through latent features. Implementing the model generally involves three parts: first, items are mapped to latent categories; second, the user's interest in each latent category is determined; finally, items in the categories the user is interested in are selected and recommended to the user. This kind of method is based on statistics of user behavior data, which are clustered automatically to discover latent topics or categories.

However, the data sources of the above models are all relatively limited. Across different contexts they suffer from poor stability and poor generalization ability, which lowers the accuracy of the application recommendation results.

Summary of the invention

Accordingly, an object of the present invention is to provide an application recommendation method based on user-profile behavior analysis, an application recommendation system based on user-profile behavior analysis, a computer-readable storage medium, and a computer device that improve the stability and generalization ability of application recommendation and provide users with personalized application recommendations.

The present invention is implemented through the following scheme:

An application recommendation method based on user-profile behavior analysis comprises the following steps: obtaining the user behavior logs reported by the client and storing them in a server-side base database; building a feature collector that performs data collection, cleaning, standardization, and feature combination and extraction on the user profile data, the original application list data, and the user behavior log data, yielding feature vectors that follow a unified specification and meet the requirements of mathematical modeling; calling multiple preset base recommendation models to each operate on the feature vectors, obtaining, under each base recommendation model, a preliminary application recommendation list for the corresponding user and that user's download probability for each application in the list; using the download probabilities obtained by each base recommendation model as new feature-vector input, with whether the application was actually downloaded as the label of the fusion recommendation model, to train a preset fusion recommendation model; and calling the fusion recommendation model to process a user's newly added feature vector, obtaining the final application recommendation list for that user.

The application recommendation method of the present invention builds a user profile data warehouse through multi-dimensional analysis of users' historical behavior logs and feature extraction from those logs. The added fusion recommendation model integrates the learning results of the individual base recommendation models, which improves the stability and generalization ability of the recommendation algorithm, so that the recommended applications closely match the user's needs.

In one embodiment of the invention, the user behavior log data include the user's installed-application list, device information, game login times, and game top-up and spending information.

Collecting user behavior log data comprehensively allows the user's interests to be captured more accurately and improves the accuracy of the applications recommended to the user.

In one embodiment of the invention, the step of performing data collection, cleaning, standardization, and feature combination and extraction on the user profile data, the original application list data, and the user behavior log data to obtain feature vectors that follow a unified specification and meet the requirements of mathematical modeling includes:

Sampling the user profile data, the original application list data, and the user behavior log data to obtain sampled data of several types, where the sampled data include numeric data, text data, time-series data, and enumerated (categorical) data;

Splitting the time-series data by different time units into the dimensions required by each base model; applying Z-score standardization to the numeric data; performing semantic analysis on the text data; and processing the enumerated categorical data with one-hot encoding;

Performing feature extraction on the processed data and applying dimensionality reduction to the extracted feature vectors;

Generating feature vectors that follow a unified specification and meet the requirements of mathematical modeling, including user feature vectors, application feature vectors, behavior feature vectors, and interaction feature vectors.

The feature collector samples the user profile data, the original application list data, and the user behavior log data to obtain numeric, text, time-series, and enumerated data, and expands each type of data along different dimensions, which can improve the stability and generalization ability of the recommendation algorithm.

In one embodiment of the invention, the step of calling multiple preset base recommendation models to each operate on the feature vectors, obtaining under each base recommendation model the preliminary application recommendation list of the corresponding user and that user's download probability for each application in the list, includes:

Training each base recommendation model with k-fold cross-validation. In the training stage, the parameters of each base recommendation model are tuned by grid search to obtain the optimal parameters, and each base recommendation model then generates a preliminary application recommendation list for each user together with the user's download probability for each application in the list.
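The grid-search tuning mentioned above can be illustrated with a minimal sketch. The patent does not specify the parameter grid or the scoring criterion, so the parameter names (`depth`, `rate`) and the `toy_score` function are hypothetical stand-ins for a real cross-validated score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical validation score that peaks at depth=4, rate=0.1."""
    return -abs(params["depth"] - 4) - abs(params["rate"] - 0.1)


best_params, best_score = grid_search(
    {"depth": [2, 4, 8], "rate": [0.01, 0.1, 1.0]}, toy_score
)
```

In practice `score_fn` would train a base recommendation model with the given parameters and return its average score over the k cross-validation folds.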

Having the base recommendation models each operate on the user's feature vectors yields the preliminary application recommendation lists and the download probability of each application in them, which allows the second-layer fusion recommendation model to be trained effectively. Combining the results of all base recommendation models means that multiple base models all take part in the recommendation, so the recommendation results are more accurate.

In one embodiment of the invention, training each base recommendation model with k-fold cross-validation includes the following steps:

Dividing the training sample set of each base recommendation model into k mutually exclusive subsets of equal size;

Performing k iterations, each using the union of k-1 subsets as the training set and the remaining subset as the test set, and training the base recommendation model on the resulting k pairs of training and test sets.
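The two steps above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `k_fold_splits` and the shuffling seed are assumptions, and the sketch requires the sample count to be divisible by k so that the folds are exactly equal in size.

```python
import random


def k_fold_splits(samples, k, seed=0):
    """Yield (train, test) pairs built from k equal-sized, disjoint folds."""
    if k < 2 or len(samples) % k != 0:
        raise ValueError("k must be >= 2 and divide the number of samples")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    fold_size = len(shuffled) // k
    folds = [shuffled[i * fold_size:(i + 1) * fold_size] for i in range(k)]
    for i in range(k):
        # The i-th fold is held out; the union of the other k-1 folds trains.
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


splits = list(k_fold_splits(range(10), k=5))
```

Each sample appears in exactly one test fold, so every sample is used for evaluation once and for training k-1 times.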

Training each base recommendation model with k-fold cross-validation improves the generalization ability of each individual base recommendation model.

In one embodiment of the invention, the user profile data, the original application list data, and the user behavior log data are sampled with a sliding-window method to obtain the sampled data. In the step of calling multiple preset base recommendation models to each operate on the feature vectors, the machine learning methods used by the base recommendation models include logistic regression, adaptive boosting (AdaBoost), support vector machines, and random forests, and also include a long short-term memory (LSTM) network. When training the LSTM network, each time the sampling window slides once, the resulting data set is used as the input of the corresponding base recommendation model, and that model is trained with the back-propagation algorithm.

In generating the preliminary application recommendation lists, the base recommendation models take the time factor into account. Because the sample features collected by the sliding-window method span a certain time period, temporal dependencies can be learned from them. The present invention introduces a long short-term memory (LSTM) network to learn the temporal relationships in user behavior and obtain a personalized recommendation model. This model better characterizes the user's degree of preference for items, so the recommended game applications closely match the user's needs.

In one embodiment of the invention, the back-propagation algorithm includes:

Computing the output value of each neuron in the forward pass;

Computing the error term of each neuron in the backward pass, in two directions: one propagates the error term backward through time, and the other propagates the error term to the previous layer;

Computing the gradient of each weight from the corresponding error terms.
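The three steps can be illustrated on a much simpler model than an LSTM. The sketch below assumes a single scalar recurrent unit, h_t = tanh(w·h_{t-1} + u·x_t), with a squared-error loss on the last state; it is not the patent's network and shows only the through-time direction of error propagation, since there is only one layer. All names are illustrative.

```python
import math


def forward(w, u, xs):
    """Forward pass: hidden states h_0..h_T of h_t = tanh(w*h_{t-1} + u*x_t)."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))
    return hs


def loss(w, u, xs, y):
    """Squared error between the final hidden state and the target y."""
    return 0.5 * (forward(w, u, xs)[-1] - y) ** 2


def bptt_gradients(w, u, xs, y):
    """Backward pass through time: error terms first, then weight gradients."""
    hs = forward(w, u, xs)
    grad_w = grad_u = 0.0
    # Error term at the last step: derivative of the loss w.r.t. the
    # pre-activation of the final hidden state.
    delta = (hs[-1] - y) * (1.0 - hs[-1] ** 2)
    for t in range(len(xs), 0, -1):
        # Each weight's gradient accumulates error term times its input.
        grad_w += delta * hs[t - 1]
        grad_u += delta * xs[t - 1]
        # Propagate the error term one step back in time.
        delta = delta * w * (1.0 - hs[t - 1] ** 2)
    return grad_w, grad_u


xs, w, u, y = [0.5, -1.0, 0.25], 0.8, 0.3, 0.2
grad_w, grad_u = bptt_gradients(w, u, xs, y)
```

A finite-difference comparison against `loss` is a convenient way to confirm that the analytic gradients are correct before scaling the same idea up to gated units.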

The back-propagation algorithm allows the constructed LSTM network to be trained effectively, yielding accurate model parameters.

In one embodiment of the invention, an application recommendation system is also provided, including:

A behavior data acquisition module, for obtaining the user behavior logs reported by the client and storing them in a server-side base database;

A feature extraction module, for building a feature collector that performs data collection, cleaning, standardization, and feature combination and extraction on the user profile data, the original application list data, and the user behavior log data, yielding feature vectors that follow a unified specification and meet the requirements of mathematical modeling;

A base model computation module, for calling multiple preset base recommendation models to each operate on the feature vectors, obtaining under each base recommendation model the preliminary application recommendation list of the corresponding user and that user's download probability for each application in the list;

A fusion recommendation model module, for using the download probabilities obtained by each base recommendation model as new feature-vector input, with whether the application was actually downloaded as the label of the fusion recommendation model, to train the preset fusion recommendation model;

A fusion recommendation model computation module, for calling the fusion recommendation model to process a user's newly added feature vector, obtaining the final application recommendation list for that user.

In the application recommendation system based on user-profile behavior analysis of the present invention, the behavior data acquisition module obtains the user behavior log data. The feature extraction module builds a feature collector that performs data collection, cleaning, standardization, and feature combination and extraction on the user profile data, the original application list data, and the user behavior log data, yielding feature vectors that follow a unified specification. The base model computation module first processes the feature vectors with multiple base recommendation models to obtain the preliminary application recommendation lists and the user's download probability for each application in them. The fusion recommendation model module trains the fusion recommendation model, and the fusion recommendation model computation module calls that model to obtain the user's application recommendation list. A user profile data warehouse is built through multi-dimensional analysis of users' historical behavior logs and feature extraction from those logs. The added fusion recommendation model integrates the learning results of the individual base recommendation models, which improves the stability and generalization ability of the recommendation algorithm, so that the recommended applications closely match the user's needs.

In one embodiment of the invention, a computer-readable storage medium is also provided, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the application recommendation method based on user-profile behavior analysis according to any of the above.

In one embodiment of the invention, a computer device is also provided, including a memory, a processor, and a computer program stored in the memory and executable by the processor, where the processor, when executing the computer program, implements the steps of the application recommendation method based on user-profile behavior analysis according to any of the above.

Building on traditional application recommendation methods based on user-profile behavior analysis, and taking into account the characteristics of the gaming-platform application environment, the present invention proposes an application recommendation method based on user-profile behavior analysis. The method performs structured feature extraction using user profile data warehouse technology and incorporates an ensemble learning framework and a recurrent neural network algorithm, which improves the effectiveness of personalized recommendation services and the experience of gaming-platform users.

The present invention has two main parts. The first is a modeling framework for processing raw feature data, namely the feature collector, which converts raw data of different types into standardized feature vectors that can be used for mathematical model training; the raw data may be numeric data, text data, time-series data, enumerated data, and so on. The second is the fusion recommendation model, which combines multiple individual classifiers so that the model achieves better generalization performance than any single classifier, and generates recommendations accordingly. Because recommender systems face diverse application scenarios, the predictions of a single recommendation algorithm may generalize poorly in some scenarios, so the strategy provided by the invention for combining multiple recommendation algorithms is particularly important. The fusion recommendation model builds on and combines the advantages of multiple recommendation algorithms so that they offset one another's weaknesses, forming a powerful recommendation framework. In addition, users' behavioral habits may change under the influence of temporal factors. For temporal behavior features, the invention therefore adds a long short-term memory (LSTM) network to the base prediction algorithms in the first layer of the fusion recommendation model to process time-series data. The prediction results of the multiple recommendation algorithms are then used as new features to train the second-layer prediction algorithm, which yields the final game application list.

This allows users' temporal behavior to be handled effectively. Users' temporal behavior, that is, the time-ordered information about the products a user consumes, conceals regularities in how the data change, and these regularities can be used to uncover connections between users and products. Temporal behavior also plays an important role in predicting whether a user will click to download an application: after clicking to view an application in some category, a user is likely to go on to view other applications in the same category. In recent years, recurrent neural networks (RNNs) have been widely adopted in fields such as natural language processing, image recognition, and speech recognition because of their ability to model sequences. For example, Google used RNNs to achieve a substantial improvement in machine translation quality. RNNs have attracted growing attention, and attempts have begun to apply them in the recommendation field.

The present invention proposes an automated feature collector, that is, a device that cleans numeric data, text data, time-series data, enumerated data, and so on, processes them into a standardized form, and extracts feature vectors from them.

The present invention proposes a fusion recommendation model and, in its application to an application recommendation system, uses a long short-term memory (LSTM) network to mine temporal behavior features, so that the latent relationships between users and products are analyzed through temporal information.

Brief description of the drawings

Fig. 1 is a flow chart of the application recommendation method based on user-profile behavior analysis in an embodiment of the present invention;

Fig. 2 is a schematic diagram of data collection using the sliding-window method in an embodiment of the present invention;

Fig. 3 is a schematic diagram of the feature collector built in an embodiment of the present invention;

Fig. 4 is a schematic diagram of the fusion recommendation model built in an embodiment of the present invention;

Fig. 5 is a structural diagram of the application recommendation system based on user-profile behavior analysis in an embodiment of the present invention.

Detailed description of the embodiments

Referring to Fig. 1, which is a flow chart of the application recommendation method based on user-profile behavior analysis in an embodiment of the present invention, the method comprises the following steps:

S101, obtaining the user behavior logs reported by the client and storing them in a server-side base database;

S102, building a feature collector that performs data collection, cleaning, standardization, and feature combination and extraction on the user profile data, the original application list data, and the user behavior log data, yielding feature vectors that follow a unified specification and meet the requirements of mathematical modeling;

S103, calling multiple preset base recommendation models to each operate on the feature vectors, obtaining under each base recommendation model the preliminary application recommendation list of the corresponding user and that user's download probability for each application in the list;

S104, using the download probabilities obtained by each base recommendation model as new feature-vector input, with whether the application was actually downloaded as the label of the fusion recommendation model, to train the preset fusion recommendation model;

S105, calling the fusion recommendation model to process a user's newly added feature vector, obtaining the final application recommendation list for that user.
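Steps S104 and S105 describe a two-layer (stacking) arrangement. The sketch below is a toy illustration under stated assumptions: the patent does not name the learner used for the fusion model, so a small logistic regression fitted by gradient descent is assumed here, and the download probabilities, labels, and application names are invented for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_fusion_model(base_probs, labels, lr=0.5, epochs=2000):
    """Fit logistic-regression weights on base-model download probabilities."""
    n_models = len(base_probs[0])
    weights, bias = [0.0] * n_models, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_models, 0.0
        for probs, label in zip(base_probs, labels):
            pred = sigmoid(sum(w * p for w, p in zip(weights, probs)) + bias)
            err = pred - label
            for j, p in enumerate(probs):
                grad_w[j] += err * p
            grad_b += err
        weights = [w - lr * g / len(labels) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(labels)
    return weights, bias


def recommend(weights, bias, candidates, top_n=2):
    """Rank candidate apps by the fused download probability."""
    scored = [
        (sigmoid(sum(w * p for w, p in zip(weights, probs)) + bias), app)
        for app, probs in candidates.items()
    ]
    return [app for _, app in sorted(scored, reverse=True)[:top_n]]


# Each row: download probabilities from three hypothetical base models.
base_probs = [[0.9, 0.8, 0.7], [0.2, 0.3, 0.1], [0.8, 0.6, 0.9], [0.1, 0.2, 0.3]]
labels = [1, 0, 1, 0]  # 1 = the application was actually downloaded
weights, bias = train_fusion_model(base_probs, labels)

candidates = {
    "app_a": [0.85, 0.9, 0.8],
    "app_b": [0.15, 0.1, 0.2],
    "app_c": [0.5, 0.6, 0.4],
}
final_list = recommend(weights, bias, candidates)
```

The key point is that the second layer never sees the raw feature vectors: its only inputs are the first-layer probabilities, and its labels are the observed download outcomes.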

In the application recommendation method based on user-profile behavior analysis of the present invention, a feature collector performs data collection, cleaning, standardization, and feature combination and extraction on the user profile data, the original application list data, and the user behavior log data, yielding feature vectors that follow a unified specification. Multiple base recommendation models first process the feature vectors to obtain the preliminary application recommendation lists and the user's download probability for each application in them. These are used to train the fusion recommendation model, which is then called to obtain the user's application recommendation list. A user profile data warehouse is built through multi-dimensional analysis of users' historical behavior logs and feature extraction from those logs. The added fusion recommendation model integrates the learning results of the individual base recommendation models, which improves the stability and generalization ability of the recommendation algorithm, so that the recommended applications closely match the user's needs.

In one embodiment of the present invention, in step S101, the user behavior log data include the user's installed-application list, device information, game login times, and game top-up and spending information.

The client collects and reports user behavior logs. The reported logs include the user's installed-application list, device information, game login times, game top-up and spending information, and similar information, and are stored in the server-side base database.

Building a user profile data warehouse through multi-dimensional analysis of users' historical behavior logs improves the generalization ability of the recommendation algorithm.

In one embodiment of the present invention, in step S102, the step of performing data collection, cleaning, standardization, and feature combination and extraction on the user profile data, the original application list data, and the user behavior log data to obtain feature vectors that follow a unified specification and meet the requirements of mathematical modeling includes:

Sampling the user profile data, the original application list data, and the user behavior log data to obtain sampled data of several types, where the sampled data include numeric data, text data, time-series data, and enumerated (categorical) data;

Splitting the time-series data by different time units into the dimensions required by each base model; applying Z-score standardization to the numeric data; performing semantic analysis on the text data; and processing the enumerated categorical data with one-hot encoding;

Performing feature extraction on the processed data and applying dimensionality reduction to the extracted feature vectors;

Generating feature vectors that follow a unified specification and meet the requirements of mathematical modeling, including user feature vectors, application feature vectors, behavior feature vectors, and interaction feature vectors.

The feature collector samples the user profile data, the original application list data, and the user behavior log data to obtain numeric, text, time-series, and enumerated data, and expands each type of data along different dimensions, which can improve the stability and generalization ability of the recommendation algorithm.

Based on the reported data obtained in step S101, step S102 builds the feature collector. The feature collector is a device that cleans the original user profile data, the original application list data, and the client-reported data, processes them into a standardized form, and extracts data feature vectors from them. The feature collector built in the present invention can be customized and generalizes to a variety of data application scenarios: given an input data source, it converts the source into feature vectors suitable for mathematical model training, where the data source may be numeric data, text data, time-series data, enumerated data, and so on.

The feature collector is built in the following steps:

(1) write an interface for the input data source, through which the base data are passed in;

(2) process numeric data, text data, time-series data, enumerated data, and so on into standardized feature vectors;

(3) write an interface for outputting feature vectors, through which the standardized feature vectors are obtained.

The data processing methods above include data preprocessing, nondimensionalization, feature binarization, missing-value replacement, and so on.

For the raw time-series log data recorded at the data-reporting level, the present invention establishes a general feature extraction method.

Defined in the present invention and user characteristics (F is useduser), using feature (Fapp), behavioural characteristic (Fact) and interaction Feature (Finter), wherein:

User characteristics (Fuser) include user's registration duration, hour of log-on, age, sex, device systems, liveness and disappear Take Capability index etc.;

Using feature (Fapp) include game application classification, shelf life, restocking duration, the number of starts, user's viscosity and fill It is worth order numbers etc.;

Behavioural characteristic (Fact) include clicking on game application details page, click on the time, if subscribe to game application, if under Carry game application, if add game group and game evaluation etc.;

Interaction feature (Finter) mode that is combined two-by-two between feature or between multiple features generates interaction feature, wrap Include user and game application, user and game classification, game application and game classification etc..

Preferably, during feature processing, the four classes of sampled data are expanded along different dimensions such as time, count, and ratio, deriving features such as the number of orders in the last n days, active users in the last n days, new users per day over the last n days, and daily spending over the last n days (n = 1, 2, 3, 5, 7, 15, etc.). The data processing methods include data preprocessing, nondimensionalization, feature binarization, and missing-value replacement.

Specifically, during feature construction in step S102: timestamp features, such as registration time and release time, are split into the dimensions required by the various models, such as month, week, day, and hour; numeric features, such as age, activity level, and number of launches, are Z-score standardized to reduce the interference of raw magnitudes with the algorithms; text features, such as game reviews, are mainly processed by sentiment analysis, and the assessed popularity of the game application is used as its feature; categorical features, such as game application category and user gender, are processed with one-hot encoding (One-Hot Encoding).
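The per-type processing described above might look like the following sketch, which covers the timestamp, numeric, and categorical cases (sentiment analysis of review text is omitted). The function names and the choice of UTC are assumptions made for illustration.

```python
from datetime import datetime, timezone
from statistics import mean, pstdev


def split_timestamp(ts: int) -> list:
    """Split a Unix timestamp into [month, ISO week, day of month, hour] (UTC assumed)."""
    dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return [dt.month, dt.isocalendar()[1], dt.day, dt.hour]


def z_score(values: list) -> list:
    """Z-score standardization: subtract the mean, divide by the population std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)  # constant feature: no spread to normalize
    return [(v - mu) / sigma for v in values]


def one_hot(value: str, categories: list) -> list:
    """One-hot encode a categorical value against a fixed category list."""
    return [1 if value == c else 0 for c in categories]


print(split_timestamp(0))                          # 1970-01-01 00:00 UTC
print(z_score([1.0, 2.0, 3.0]))
print(one_hot("rpg", ["puzzle", "rpg", "racing"]))
```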

On the basis of the feature construction and feature processing above, feature extraction and dimensionality reduction use a penalty-term-based feature selection method: a logistic regression (Logistic Regression, LR) model is chosen to compute the weight coefficient of each feature, and features whose coefficients fall below a threshold are filtered out.
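A minimal sketch of penalty-based selection follows. The text does not say which penalty is used; an L1 penalty fitted by proximal gradient descent is assumed here, and the function names, hyperparameters, and toy data are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(w: float, t: float) -> float:
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.05, lr=0.5, epochs=200):
    """Logistic regression with an L1 penalty, fitted by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Gradient step on the log-loss, then shrinkage for the penalty term.
        w = [soft_threshold(wj - lr * gj / n, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def select_features(weights, threshold):
    """Keep the indices of features whose |weight| reaches the threshold."""
    return [j for j, wj in enumerate(weights) if abs(wj) >= threshold]


# Toy data: feature 0 determines the label, feature 1 is uninformative.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [1, 1, 0, 0]
weights, bias = fit_l1_logistic(X, y)
kept = select_features(weights, threshold=0.1)
print(weights, kept)
```

In this toy set the uninformative feature keeps a weight of zero and is filtered out, while the informative one survives the threshold.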

In a preferred embodiment, data are sampled with a sliding-window method: the samples in one time interval are taken as features and the sample data in the following period are taken as labels; the window then slides forward by the duration of the label period to take the next sample, as shown in Figure 2.
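The sliding-window sampling can be illustrated as below. The function name `sliding_windows` and the 7-day feature and 1-day label lengths are assumptions chosen for the example, not values given in the text.

```python
def sliding_windows(days, feature_len, label_len):
    """Yield (feature_days, label_days) pairs; the window advances by label_len each step."""
    start = 0
    while start + feature_len + label_len <= len(days):
        feature_days = days[start:start + feature_len]
        label_days = days[start + feature_len:start + feature_len + label_len]
        yield feature_days, label_days
        start += label_len  # slide forward by the duration of the label period


# Ten days of data, 7-day feature window, 1-day label window.
windows = list(sliding_windows(list(range(1, 11)), feature_len=7, label_len=1))
for features, labels in windows:
    print(features, "->", labels)
```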

In the present invention, after the data sources are processed by the data feature collector, four classes of feature vectors are generated, namely user features, application features, behavior features, and interaction features. The detailed flow block diagram is shown in Figure 3.

In a preferred embodiment, the present invention stores the processed feature vectors in a feature database, which combines relational and non-relational databases.

After the data feature collector is built, the processed feature vectors need to be stored in the database. Specifically, the present invention uses relational and non-relational databases, including MongoDB, MySQL, and Redis. MongoDB and MySQL store the daily offline data features, and the fusion recommendation model is updated and maintained from the daily offline feature vectors; Redis stores, in real time, the feature vectors newly added on the current date, so that the recommender system can make real-time personalized recommendations to users.

In one embodiment of the invention, step S103 includes:

Each base recommendation model is trained using k-fold cross-validation. In the training stage, each base recommendation model is tuned by grid search to obtain its optimal parameters, and generates, for each user, a preliminary application recommendation list under that base recommendation model together with the user's download probability for each application in the list.

The recommendation algorithm module generates a game application list for each sample from the user's basic features, historical behavior, application features, and so on. Specifically, for a given user, feature vectors such as the application list and the user's historical behavior are fetched from the database and concatenated, i.e. [F_user, F_app, F_act, F_inter], to judge whether the user will download. The task is thereby converted into a classification problem: the probability that the user downloads a game, p(y = 1 | F_user, F_app, F_act, F_inter), is computed and taken as the probability of each game application the user may download.

In a preferred embodiment, the base recommendation models mainly use machine learning methods, including a combination of two or more of the following: logistic regression (LR), adaptive boosting (Adaptive Boosting, AdaBoost), support vector machine (Support Vector Machine, SVM), random forest (Random Forest, RF), and long short-term memory network (LSTM). In the flow of the method proposed by the present invention, the basic data pass through the feature collector to yield output feature vectors, and the feature vectors are fed to the classifiers for training, so as to obtain the game content and services to be recommended.

Running each base recommendation model on the user's feature vectors yields the preliminary application recommendation lists and the download probability of each application in them, with which the second-layer fusion recommendation model can be trained effectively. Combining the results of all the base recommendation models amounts to having multiple base recommendation models take part in the recommendation, so the recommendation results are more accurate.

The application scenarios a recommender system faces often differ considerably. Under different scenarios, the predictions of a single recommendation algorithm may generalize poorly and cannot handle recommendation across several scenarios well, whereas a well-chosen model fusion algorithm can improve markedly on a single-model algorithm. A combination strategy that fuses multiple recommendation algorithms is therefore particularly important.

In one embodiment of the invention, training each base recommendation model using k-fold cross-validation includes:

dividing the training sample set of each base recommendation model into k mutually exclusive subsets of equal size;

performing k iterations, each using the union of k-1 subsets as the training set and the remaining subset as the test set, and training the base recommendation model on the resulting k pairs of training and test sets.

Using k-fold cross-validation makes better use of the data features for parameter tuning during training and improves the generalization performance of each single base recommendation model classifier.
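The k-fold split described above can be sketched in a few lines. The helper name `k_fold_splits` is hypothetical, and the strided partition is one possible way of forming mutually exclusive subsets; sizes are equal only when the sample count is divisible by k.

```python
def k_fold_splits(samples, k):
    """Yield k (train, test) pairs from k mutually exclusive subsets of the samples."""
    folds = [samples[i::k] for i in range(k)]  # strided partition into k disjoint folds
    for i in range(k):
        test = folds[i]
        # Training set: the union of the other k-1 folds.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield train, test


splits = list(k_fold_splits(list(range(10)), k=5))
for train, test in splits:
    print("train:", train, "test:", test)
```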

In the present invention, the base learning algorithms (base learning algorithm) used are mainly logistic regression (LR), adaptive boosting (AdaBoost), support vector machine (SVM), random forest (RF), and long short-term memory network (LSTM) classifiers. In the LR model, the probability model of a user downloading a game application takes the standard logistic form:

$$p(y = 1 \mid x) = \frac{1}{1 + e^{-(w^{T} x + b)}}$$

In the formula, y ∈ {0, 1} is the class label, p is the probability of the corresponding class of y, w is the weight matrix, x is the feature vector, and b is the bias term.

In the training stage, the classifier of each base recommendation model is tuned by grid search to obtain its optimal parameters, and generates for each user the list of game applications to be recommended and the corresponding download probabilities.
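Grid search over a parameter grid can be sketched as follows. The parameter names `C` and `max_iter` and the scoring function `toy_score` are hypothetical stand-ins; in practice the score would be the mean validation metric from k-fold cross-validation.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one with its score."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    # Stand-in for a cross-validated score; it peaks at C=1.0, max_iter=200.
    return -((params["C"] - 1.0) ** 2) - (params["max_iter"] - 200) ** 2 / 1e4


best, score = grid_search({"C": [0.1, 1.0, 10.0], "max_iter": [100, 200]}, toy_score)
print(best, score)
```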

In one embodiment of the invention, the user portrait data, the original application list data, and the user behavior log data are sampled with the sliding-window method to obtain sampled data. In the step of calling a plurality of preset base recommendation models to each operate on the feature vectors, the machine learning methods used by the base recommendation models include logistic regression, adaptive boosting, support vector machine, and random forest, and also include the long short-term memory network learning method. When training with the long short-term memory network, each data set obtained from one slide of the sampling window is used as the input of the corresponding base recommendation model, and that base recommendation model is trained with a training algorithm based on back-propagation.

A recommendation algorithm often has to handle time-series features; for example, a user's long-term and short-term interests change over time. The present invention innovatively adds a long short-term memory network (LSTM) to the fusion recommendation model to fit sequential behavior information and learn temporal dependencies.

When the base recommendation models generate the preliminary application recommendation lists, the time factor is taken into account. Because the sample features collected by the sliding-window method span a certain time period, temporal dependencies can be learned from them. A long short-term memory network (LSTM) is introduced to learn the temporal relationships in user behavior and obtain a personalized recommendation model. The model better characterizes how much a user likes an item, so the recommended game applications match the user's needs closely.

When the base recommendation models generate the preliminary application recommendation lists to be recommended, the time factor is taken into account. Because the sample features collected by the sliding-window method span a certain time period, in order to learn temporal dependencies, a long short-term memory network (LSTM) is added at the algorithm strategy layer and trained on the data sets sampled by the sliding window; that is, each data set obtained from one slide of the window is used as the input of the model, and the training algorithm of the model is mainly back-propagation.

In one embodiment of the invention, the back-propagation algorithm includes:

computing the output value of each neuron in the forward direction;

computing the error term of each neuron in the backward direction, in two directions: one propagates the error term backward through time, and the other propagates the error term to the layer above;

computing the gradient of each weight from the corresponding error term.

Back-propagation can effectively train the long short-term memory network that has been built and yield accurate model data.

The main calculation steps of the back-propagation algorithm are:

(1) compute the output value of each neuron in the forward direction, i.e. the values of the five vectors f_t, i_t, C_t, o_t, and h_t, where f_t is the output of the Forget Gate at time t, i_t is the output of the Input Gate at time t, C_t is the cell state value at time t, o_t is the output of the Output Gate at time t, and h_t is the output of the long short-term memory network (LSTM) model at time t;

(2) compute the error term δ of each neuron in the backward direction, in two directions: one is back-propagation through time, i.e. computing the error term at each time step starting from the current time t; the other propagates the error term to the layer above;

(3) compute the gradient of each weight from the corresponding error term.

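The model can be defined as:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$$

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$$

$$\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$$

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$$

$$h_t = o_t * \tanh(C_t)$$

Here σ denotes the Sigmoid layer and tanh denotes the tanh layer, and * between vectors is element-wise multiplication. Their expressions are respectively:

$$\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$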

The advantage of the long short-term memory network (LSTM) over a recurrent neural network (RNN) is that it can persist information. The model operates through a gate mechanism (Gate Mechanism), i.e. the Input Gate, Forget Gate, and Output Gate. The Forget Gate determines how much of the previous state C_{t-1} is retained in the current state C_t; the Input Gate determines how much of the current input x_t is saved into C_t; the Output Gate controls how much of C_t is output to the current output value h_t. Each gate compresses the components of its vector through a Sigmoid layer to values between 0 and 1, where 0 means the input value is rejected and 1 means the input value is allowed through to take part in subsequent computation. The specific steps are:

(1) The Sigmoid layer of the Forget Gate decides which information is to be discarded from C_{t-1}. The Forget Gate takes [h_{t-1}, x_t], i.e. the vector formed by concatenating the output h_{t-1} of the previous time step with the sample features of the window slid to the current time t (including user features, application features, behavior features, and interaction features), as the feature input at time t;

(2) The Sigmoid layer and the tanh layer of the Input Gate decide which information to update. Specifically, the Sigmoid layer of the Input Gate decides which information to update, and the tanh layer computes a new candidate state C̃_t; C̃_t is then combined with the value of C_{t-1} to obtain the updated state C_t at time t;

(3) The Sigmoid layer of the Output Gate decides the output h_t: C_t is passed through the tanh layer (whose output lies between -1 and 1) and multiplied by the output o_t of the Output Gate's Sigmoid layer to obtain the final result.
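The three gate steps above correspond to one forward step of a standard LSTM cell, sketched here in plain Python. The parameter dictionary keys (`W_f`, `b_f`, and so on) and the all-zero weights are illustrative assumptions; a real model would learn them by back-propagation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W.v + b for a list-of-rows matrix W."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a standard LSTM cell; returns (h_t, C_t)."""
    v = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(params["W_f"], v, params["b_f"])]      # forget gate
    i_t = [sigmoid(z) for z in affine(params["W_i"], v, params["b_i"])]      # input gate
    c_hat = [math.tanh(z) for z in affine(params["W_c"], v, params["b_c"])]  # candidate state
    o_t = [sigmoid(z) for z in affine(params["W_o"], v, params["b_o"])]      # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hidden size 1, input size 2, all-zero parameters: every gate outputs 0.5.
zero_w, zero_b = [[0.0, 0.0, 0.0]], [0.0]
params = {"W_f": zero_w, "b_f": zero_b, "W_i": zero_w, "b_i": zero_b,
          "W_c": zero_w, "b_c": zero_b, "W_o": zero_w, "b_o": zero_b}
h_t, c_t = lstm_step(x_t=[0.3, -0.7], h_prev=[0.0], c_prev=[1.0], params=params)
print(h_t, c_t)
```

With zero parameters the forget gate halves the previous cell state and the candidate state contributes nothing, which makes the step easy to check by hand.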

In step S104, the preliminary application recommendation lists obtained by the base recommendation models, together with the corresponding users' download probabilities for each application in those lists, are taken as new feature vector inputs, and whether the application was actually downloaded is taken as the label of the fusion recommendation model, to train the preset fusion recommendation model.

The fusion stage performs a second, fused recommendation on the preliminary application recommendation lists generated in S103 and produces the final application recommendation list. The fusion recommendation model of the present invention can select and fuse the best models for a variety of business scenarios, making full use of the different features that different models have learned. The specific processing steps of the fusion recommendation model in the present invention are:

(1) each base model makes predictions and generates recommendations for each user; the recommendation results include the user's list of applications to be recommended and the corresponding download probabilities;

(2) the download probabilities output by the base models in the previous step are used as feature inputs, and whether the game was actually downloaded is used as the label of the fusion recommendation model; the fusion recommendation model is trained and its prediction performance is evaluated;

(3) the fusion recommendation model produced by the two steps above makes recommendations for each user and generates the final application recommendation list.
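The three fusion steps can be sketched as a simple stacking arrangement. The text does not specify the fusion model; a logistic regression meta-model is assumed here, and the two fixed functions in `base_models` are hypothetical stand-ins for already-trained base recommendation models.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Full-batch gradient descent for a logistic regression meta-model."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical stand-ins for trained base models: feature vector -> download probability.
base_models = [
    lambda x: sigmoid(2.0 * x[0] - 1.0),
    lambda x: sigmoid(x[0] + x[1] - 1.0),
]

# Held-out samples not used to train the base models, with actual-download labels.
holdout_X = [[0.9, 0.8], [0.1, 0.2], [0.8, 0.1], [0.2, 0.1]]
holdout_y = [1, 0, 1, 0]

# Steps (1)-(2): base-model probabilities become the fusion model's input features.
meta_X = [[model(x) for model in base_models] for x in holdout_X]
w, b = train_logistic(meta_X, holdout_y)


def fused_probability(x):
    """Step (3): score a new sample with the fusion model."""
    meta = [model(x) for model in base_models]
    return sigmoid(sum(wj * mj for wj, mj in zip(w, meta)) + b)


print(fused_probability([0.9, 0.8]), fused_probability([0.1, 0.2]))
```

Training the meta-model on held-out samples rather than on the base models' own training data is the usual guard against over-fitting in stacking.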

In the fusion stage, to avoid the risk of over-fitting, step S104 produces its training samples from sample data not used in step S103. The framework of the ensemble-learning fusion recommendation model is shown in Figure 4. In addition, under the current business scenario, product and operation rules can be applied to the final application recommendation list. Rejection rules can be set, for example filtering out applications that have been delisted owing to time-decay factors, so as to remove recommendation results that do not meet the conditions. Adjustment rules can also be added, for example promoting the most-downloaded applications within a given period, or game applications covered by operation rules.
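The rejection and adjustment rules might be applied to a ranked list as in the sketch below. The function `apply_rules`, its parameters, and the policy of placing promoted applications first are hypothetical choices for illustration.

```python
def apply_rules(ranked, delisted, pinned, top_n):
    """Drop delisted apps, move operator-pinned apps to the front, and cut to top_n."""
    kept = [app for app in ranked if app not in delisted]   # rejection rule
    head = [app for app in pinned if app not in delisted]   # adjustment rule
    tail = [app for app in kept if app not in head]
    return (head + tail)[:top_n]


final_list = apply_rules(
    ranked=["a", "b", "c", "d"],  # fusion-model ranking, best first
    delisted={"b"},               # applications removed from the store
    pinned=["d"],                 # applications promoted by operation rules
    top_n=3,
)
print(final_list)
```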

In one embodiment of the invention, an application recommendation system based on user-portrait behavior analysis is also provided. As shown in Figure 5, the system includes:

a behavior data acquisition module 10, for obtaining the user behavior log reported by the client and storing it in the server's basic database;

a feature extraction module 20, for building a feature collector and performing data collection, cleaning, standardization, and feature combination and extraction on the user portrait data, the original application list data, and the user behavior log data, to obtain feature vectors that have a unified specification and meet the requirements of mathematical modeling;

a base model computation module 30, for calling a plurality of preset base recommendation models to each operate on the feature vectors, to obtain the corresponding user's preliminary application recommendation list under each base recommendation model and the corresponding user's download probability for each application in that list;

a fusion recommendation model module 40, for taking the download probabilities obtained by the base recommendation models as new feature vector inputs, taking whether the application was actually downloaded as the label of the fusion recommendation model, and training the preset fusion recommendation model;

a fusion recommendation model computation module 50, for calling the fusion recommendation model to process the user's newly added feature vectors and obtain the application recommendation list for the corresponding user.

In the application recommendation system based on user-portrait behavior analysis of the present invention, the behavior data acquisition module obtains the user behavior log data. The feature extraction module builds a feature collector and performs data collection, cleaning, standardization, and feature combination and extraction on the user portrait data, the original application list data, and the user behavior log data, to obtain feature vectors with a unified specification. The base model computation module first processes the feature vectors with a plurality of base recommendation models to obtain the preliminary application recommendation lists and the user's download probability for each application in them. The fusion recommendation model module trains the fusion recommendation model, and the fusion recommendation model computation module calls the fusion recommendation model to obtain the user's application recommendation list. Through multi-dimensional analysis of the user's historical behavior logs, features are extracted from the logs to build a user portrait data warehouse. Adding the fusion recommendation model integrates the learning results of the base recommendation models and improves the stability and generalization ability of the recommendation algorithm, so the recommended applications match the user's needs closely.

The application recommendation method and system based on user-portrait behavior analysis proposed by the present invention can be divided into the following five modules: client, server, database, feature collector, and recommendation algorithm module. The client obtains and reports the user behavior log; the reported log content includes the user's installed application list, device information, game login times, and game recharge and spending information, and is stored in the server's basic database. The feature collector performs functions such as data cleaning, standardization, and feature combination and extraction; it processes the reported data and the sample data of the original user portrait database into standardized feature vectors and stores them in the feature database. The recommendation algorithm module calls the features in the feature database for modeling and provides an interface for the server to call. When a user sends a request from the client to the server, the server calls the recommendation algorithm module and returns the recommended game application content and services to the client.

In one embodiment of the invention, a computer-readable storage medium is also provided, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the application recommendation method based on user-portrait behavior analysis described in any one of the above.

In one embodiment of the invention, a computer device is also provided, including a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein the processor, when executing the computer program, implements the steps of the application recommendation method based on user-portrait behavior analysis described in any one of the above.

The embodiments described above express only several implementations of the present invention. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the protection scope of the present invention.

Claims (10)

1. An application recommendation method based on user-portrait behavior analysis, characterized by comprising the following steps:
obtaining the user behavior log reported by the client and storing it in the server's basic database;
by building a feature collector, performing data collection, cleaning, standardization, and feature combination and extraction on the user portrait data, the original application list data, and the user behavior log data, to obtain feature vectors that have a unified specification and meet the requirements of mathematical modeling;
calling a plurality of preset base recommendation models to each operate on the feature vectors, to obtain the corresponding user's preliminary application recommendation list under each base recommendation model and the corresponding user's download probability for each application in the preliminary application recommendation list;
taking the download probabilities obtained by the base recommendation models as new feature vector inputs, taking whether the application was actually downloaded as the label of a fusion recommendation model, and training the preset fusion recommendation model;
calling the fusion recommendation model to process the user's newly added feature vectors, to obtain the final application recommendation list for the user.
2. The application recommendation method based on user-portrait behavior analysis according to claim 1, characterized in that:
the user behavior log data include the user's installed application list, device information, game login times, and game recharge and spending information.
3. The application recommendation method based on user-portrait behavior analysis according to claim 1, characterized in that the step of performing data collection, cleaning, standardization, and feature combination and extraction on the user portrait data, the original application list data, and the user behavior log data, to obtain feature vectors that have a unified specification and meet the requirements of mathematical modeling, includes:
sampling the user portrait data, the original application list data, and the user behavior log data to obtain various kinds of sampled data, wherein the sampled data include numeric data, text data, time-series data, and enumeration data;
splitting the time-series data by different time units into the dimensions required by each base model; performing Z-score standardization on the numeric data; performing semantic analysis on the text data; and processing the enumerated categorical data with one-hot encoding;
performing feature extraction on the processed data and dimensionality reduction using a penalty-term-based feature selection method, to generate feature vectors that have a unified specification and meet the requirements of mathematical modeling, including user feature vectors, application feature vectors, behavior feature vectors, and interaction feature vectors.
4. The application recommendation method based on user-portrait behavior analysis according to claim 3, characterized in that:
the step of calling a plurality of preset base recommendation models to each operate on the feature vectors, to obtain the corresponding user's preliminary application recommendation list under each base recommendation model and the corresponding user's download probability for each application in the preliminary application recommendation list, includes:
training each base recommendation model using k-fold cross-validation, wherein in the training stage each base recommendation model is tuned by grid search to obtain its optimal parameters, and generating each user's preliminary application recommendation list under each base recommendation model and the user's download probability for each application in the preliminary application recommendation list.
5. The application recommendation method based on user-portrait behavior analysis according to claim 4, characterized in that training each base recommendation model using k-fold cross-validation comprises the following steps:
dividing the training sample set of each base recommendation model into k mutually exclusive subsets of equal size;
performing k iterations, each using the union of k-1 subsets as the training set and the remaining subset as the test set, and training the base recommendation model on the resulting k pairs of training and test sets.
6. The application recommendation method based on user-portrait behavior analysis according to claim 1, characterized in that:
the user portrait data, the original application list data, and the user behavior log data are sampled with a sliding-window method to obtain sampled data;
in the step of calling a plurality of preset base recommendation models to each operate on the feature vectors, the machine learning methods used by the base recommendation models include logistic regression, adaptive boosting, support vector machine, and random forest, and also include the long short-term memory network learning method;
and when training with the long short-term memory network, each data set obtained from one slide of the sampling window is used as the input of the corresponding base recommendation model, and that base recommendation model is trained with a training algorithm based on back-propagation.
7. The application recommendation method based on user-portrait behavior analysis according to claim 6, characterized in that the back-propagation algorithm includes:
computing the output value of each neuron in the forward direction;
computing the error term of each neuron in the backward direction, in two directions: one propagates the error term backward through time, and the other propagates the error term to the layer above;
computing the gradient of each weight from the corresponding error term.
8. A system for application recommendation, characterized by comprising:
a behavior data acquisition module, for obtaining the user behavior log reported by the client and storing it in the server's basic database;
a feature extraction module, for building a feature collector and performing data collection, cleaning, standardization, and feature combination and extraction on the user portrait data, the original application list data, and the user behavior log data, to obtain feature vectors that have a unified specification and meet the requirements of mathematical modeling;
a base model computation module, for calling a plurality of preset base recommendation models to each operate on the feature vectors, to obtain the corresponding user's preliminary application recommendation list under each base recommendation model and the corresponding user's download probability for each application in the preliminary application recommendation list;
a fusion recommendation model module, for taking the download probabilities obtained by the base recommendation models as new feature vector inputs, taking whether the application was actually downloaded as the label of a fusion recommendation model, and training the preset fusion recommendation model;
a fusion recommendation model computation module, for calling the fusion recommendation model to process the user's newly added feature vectors and obtain the final application recommendation list for the corresponding user.
9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the application recommendation method based on user-portrait behavior analysis according to any one of claims 1 to 8.
10. A computer device, including a memory, a processor, and a computer program stored in the memory and executable by the processor, wherein the processor, when executing the computer program, implements the steps of the application recommendation method based on user-portrait behavior analysis according to any one of claims 1 to 8.
CN201710666989.3A 2017-08-07 2017-08-07 Application recommendation method and system, storage medium, and computer device based on user-portrait behavior analysis CN107423442A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710666989.3A CN107423442A (en) 2017-08-07 2017-08-07 Application recommendation method and system, storage medium, and computer device based on user-portrait behavior analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710666989.3A CN107423442A (en) 2017-08-07 2017-08-07 Application recommendation method and system, storage medium, and computer device based on user-portrait behavior analysis

Publications (1)

Publication Number Publication Date
CN107423442A (en) 2017-12-01

Family

ID=60436664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710666989.3A CN107423442A (en) 2017-08-07 2017-08-07 Application recommendation method and system, storage medium, and computer device based on user-portrait behavior analysis

Country Status (1)

Country Link
CN (1) CN107423442A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108021715A (en) * 2017-12-29 2018-05-11 西安交通大学 Isomery tag fusion system based on semantic structure signature analysis
CN108038237A (en) * 2017-12-27 2018-05-15 广州市云润大数据服务有限公司 A kind of information recommendation method and system
CN108256548A (en) * 2017-12-04 2018-07-06 北京大学 A kind of user's portrait depicting method and system based on Emoji service conditions
WO2019114422A1 (en) * 2017-12-15 2019-06-20 阿里巴巴集团控股有限公司 Model integration method and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101755283A (en) * 2007-07-24 2010-06-23 三星电子株式会社 Method and apparatus for recommending information using hybrid algorithm
CN105893609A (en) * 2016-04-26 2016-08-24 南通大学 Mobile APP recommendation method based on weighted mixing
CN106056427A (en) * 2016-05-25 2016-10-26 中南大学 Spark-based big data hybrid model mobile recommending method
US20160372119A1 (en) * 2015-06-19 2016-12-22 Google Inc. Speech recognition with acoustic models

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
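黄斌: "Research on Recommendation Methods Based on Cascaded Filtering and Enhanced Model Ensembles", China Master's Theses Full-text Database, Information Science and Technology series *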

Similar Documents

Publication Publication Date Title
Ngai et al. Application of data mining techniques in customer relationship management: A literature review and classification
Nisbet et al. Handbook of statistical analysis and data mining applications
Kuo A sales forecasting system based on fuzzy neural network with initial weights generated by genetic algorithm
Zheng Methodologies for cross-domain data fusion: An overview
Li Deep reinforcement learning: An overview
Takács et al. Investigation of various matrix factorization methods for large recommender systems
Chang et al. Integrating a piecewise linear representation method and a neural network model for stock trading points prediction
US20080027692A1 (en) Data visualization methods for simulation modeling of agent behavioral expression
Cui et al. Machine learning for direct marketing response models: Bayesian networks with evolutionary programming
Papageorgiou Review study on fuzzy cognitive maps and their applications during the last decade
US7328218B2 (en) Constrained tree structure method and system
Lu et al. Brain intelligence: go beyond artificial intelligence
US20140344013A1 (en) Method and apparatus for interactive evolutionary optimization of concepts
Bi et al. A big data clustering algorithm for mitigating the risk of customer churn
Hendrycks et al. Bridging nonlinearities and stochastic regularizers with gaussian error linear units
Zliobaite et al. Next challenges for adaptive learning systems
Arinze Selecting appropriate forecasting models using rule induction
Yang et al. Finding progression stages in time-evolving event sequences
JP2017520824A (en) Updating classifiers across common features
Hartford et al. Deep IV: A flexible approach for counterfactual prediction
US20120095943A1 (en) System for training classifiers in multiple categories through active learning
Agarwal et al. An interdisciplinary review of research in conjoint analysis: recent developments and directions for future research
CN104298682B Method and mobile phone for evaluating information recommendation effect based on facial expression images
Zarandi et al. A hybrid fuzzy intelligent agent‐based system for stock price prediction
Gandhi et al. Classification rule construction using particle swarm optimization algorithm for breast cancer data sets

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination