CN110110372A - Automatic segmentation and prediction method for user sequential behavior - Google Patents

Automatic segmentation and prediction method for user sequential behavior

Info

Publication number
CN110110372A
CN110110372A
Authority
CN
China
Prior art keywords
user
network
neural network
behavior
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910279004.0A
Other languages
Chinese (zh)
Other versions
CN110110372B (en)
Inventor
张伟
梁文伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University
Priority to CN201910279004.0A
Publication of CN110110372A
Application granted
Publication of CN110110372B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 30/00 Computer-aided design [CAD]
    • G06F 30/20 Design optimisation, verification or simulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q 50/01 Social networking
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Strategic Management (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Molecular Biology (AREA)
  • Marketing (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Development Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Computer Hardware Design (AREA)
  • Geometry (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Recommendation based on short sessions has long been a hot topic in recommender systems. Short-session recommendation predicts a user's future behavior from the user's consecutive actions within a short time window. Traditional methods split the user's behavior sequence into multiple short sessions according to a fixed-size time window. This kind of division suffers from problems including: 1) if the time window is too large, a short session contains too many user behaviors, while if it is too small, a short session cannot cover a complete stage of user behavior; 2) it is difficult to set one time window suitable for the behavior of all users. The present invention therefore provides an automatic segmentation and prediction method for user sequential behavior based on deep sequential reinforcement learning, which requires no manual segmentation of the user sequence and effectively resolves the above drawbacks.

Description

Automatic segmentation and prediction method for user sequential behavior
Technical field
The present invention relates to the field of computer science and technology, and in particular to an automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning.
Background art
Recommendation based on short sessions has long been a hot topic in machine learning and recommender systems. Short-session recommendation predicts a user's future behavior from the user's consecutive actions within a short time window. For example, a user checks in at 5 places in a social application in one day; a user clicks 8 products during one login session on an e-commerce website. Traditional methods model such short sessions with recurrent neural networks.
However, traditional methods generally split a user's complete behavior sequence into multiple short sessions according to a fixed-size time window. This kind of division suffers from two problems: 1) for a given user, a time window that is too large yields short sessions that may contain several mutually independent stages of behavior, whereas a window that is too small yields short sessions that cannot cover one complete stage of user behavior; 2) it is difficult to set one time window suitable for the behavior of all users. The present invention therefore provides an automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning, which requires no manual segmentation of user sequences and effectively resolves the above drawbacks.
Summary of the invention
The present invention, for the first time, provides an automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning. Its core is to learn the segmentation of a user's time-series data with a policy network, to model the resulting sessions with a hierarchical recurrent neural network, and to predict the user's future behavior. After searching, no prior art or report related to the present invention has been found. The present invention models user time series with a hierarchical recurrent neural network, takes user behavior representations at different levels into account, and can effectively extract important sequential information.
The automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning proposed by the present invention comprises the following steps:
Step 1: choose a data set, pre-process the data, and split the data into a training set, a validation set and a test set;
Step 2: convert the high-dimensional one-hot encodings of users and sequential behaviors into low-dimensional dense vectors by embedding, to be used as the input of the model;
Step 3: model the time-series data with a hierarchical recurrent neural network, use a policy network to generate an action at each time step that decides whether to segment the sequence there, and then use a classifier network to predict the behavior at the next time step of the sequence;
Step 4: train the model parameters, using the training samples to optimize the network model parameters in stages according to different objective functions, and tune the model parameters on the validation set;
Step 5: use the network model based on the hierarchical recurrent neural network and reinforcement learning to predict each test-set user's next probable behavior.
In the present invention, the user sequential behavior includes user check-in behavior, user product-purchase behavior, user web-page click behavior, user music-listening behavior, and all behavior data commonly used in this field.
In the present invention, the data set includes one or more of the Gowalla data set, the Foursquare data set and the Amazon data set, and all public behavior data sets commonly used in this field.
In step 1, pre-processing the data comprises the following steps:
A1. sort each user's behavior sequence by timestamp from earliest to latest;
A2. filter out infrequent records: delete users with fewer than 10 behaviors, and delete items with fewer than 5 user behaviors;
A3. select a time window and record the resulting sequence segmentation as the initial policy π_0 of the policy network.
In step 1, splitting the data into a training set, a validation set and a test set means that, for each user, the last location of the time series is used as the test set, the second-to-last location is used as the validation set, and the rest is used as the training set.
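The pre-processing and splitting of step 1 can be sketched in plain Python as follows; the record layout (user, item, timestamp) and the helper name `preprocess` are illustrative assumptions, not part of the patented method:

```python
from collections import Counter

MIN_USER_EVENTS = 10   # A2: delete users with fewer than 10 behaviors
MIN_ITEM_EVENTS = 5    # A2: delete items with fewer than 5 user behaviors

def preprocess(records):
    """records: list of (user, item, timestamp) tuples (assumed layout)."""
    # A2. filter infrequent users and items
    user_counts = Counter(u for u, _, _ in records)
    item_counts = Counter(i for _, i, _ in records)
    records = [(u, i, t) for u, i, t in records
               if user_counts[u] >= MIN_USER_EVENTS
               and item_counts[i] >= MIN_ITEM_EVENTS]
    # A1. per-user sequences sorted by timestamp, earliest first
    seqs = {}
    for u, i, t in sorted(records, key=lambda r: r[2]):
        seqs.setdefault(u, []).append(i)
    # split: last item -> test, second-to-last -> validation, rest -> training
    train, valid, test = {}, {}, {}
    for u, seq in seqs.items():
        train[u], valid[u], test[u] = seq[:-2], seq[-2], seq[-1]
    return train, valid, test
```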
In step 2, obtaining the input of the model comprises the following steps:
B1. data encoding: suppose there are N users and M locations in total; with one-hot encoding, each user is represented by an N-dimensional sparse vector in which the user's own dimension is 1 and all other dimensions are 0, and locations are encoded in the same way;
B2. data embedding: map each N-dimensional user vector to a low-dimensional numerical vector space by embedding, to be used as the input of the model; denote the transformed user vector set as U = {u_1, u_2, …, u_N} and the location vector set as P = {p_1, p_2, …, p_M}.
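A minimal NumPy sketch of B1 and B2, showing that multiplying a one-hot vector by an embedding matrix selects the corresponding dense row; all sizes and names below are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, d = 4, 5, 3          # users, locations, embedding dimension (toy sizes)

def one_hot(index, size):
    """B1: sparse vector with a 1 in the given dimension and 0 elsewhere."""
    v = np.zeros(size)
    v[index] = 1.0
    return v

# B2: embedding matrices mapping users and locations to d-dimensional vectors.
U = rng.normal(size=(N, d))
P = rng.normal(size=(M, d))

# Multiplying a one-hot vector by the embedding matrix selects one row,
# i.e. the dense low-dimensional representation used as model input.
u_k = one_hot(2, N) @ U
```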
In step 3, the recurrent neural network refers to, but is not limited to, a gated recurrent unit (GRU) network; a long short-term memory (LSTM) network may be used instead. Taking time step t as an example, and denoting the input at time step t as x_t, the computation comprises the following steps:
C1. compute the update gate z_t:
z_t = σ(W_z·[h_{t-1}, x_t] + b_z),
C2. compute the reset gate r_t:
r_t = σ(W_r·[h_{t-1}, x_t] + b_r),
C3. compute the candidate hidden state h̃_t:
h̃_t = tanh(W_h·[r_t ⊙ h_{t-1}, x_t] + b_h),
C4. compute the hidden state h_t:
h_t = (1 − z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t,
where σ is the sigmoid function, ⊙ denotes the Hadamard (element-wise) product, [·] denotes vector concatenation, · denotes matrix multiplication, and W_z, W_r, W_h, b_z, b_r, b_h are all learnable parameters of the model.
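The computations C1 to C4 can be written out directly; this NumPy sketch follows the equations above, with assumed weight shapes (hidden size h, input size d):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x, params):
    """One GRU step: returns h_t given h_{t-1} and x_t."""
    Wz, bz, Wr, br, Wh, bh = params
    hx = np.concatenate([h_prev, x])          # [h_{t-1}, x_t]
    z = sigmoid(Wz @ hx + bz)                 # C1. update gate
    r = sigmoid(Wr @ hx + br)                 # C2. reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([r * h_prev, x]) + bh)  # C3. candidate
    return (1.0 - z) * h_prev + z * h_tilde   # C4. new hidden state

rng = np.random.default_rng(1)
d, h = 4, 3
params = (rng.normal(size=(h, h + d)), np.zeros(h),
          rng.normal(size=(h, h + d)), np.zeros(h),
          rng.normal(size=(h, h + d)), np.zeros(h))
h_t = gru_step(np.zeros(h), rng.normal(size=d), params)
```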
In step 3, modeling the time-series data with the hierarchical recurrent neural network, taking the user representation u_k as an example, comprises the following steps:
D1. sequence-level recurrent neural network:
D11. the input is a location sequence of length L, denoted X_L = {x_1, x_2, …, x_L};
D12. compute h_t^s = GRU(h_{t-1}^s, x_t) with the recurrent neural network to obtain the output at every time step of the sequence level, denoted H^s = {h_1^s, h_2^s, …, h_L^s};
D2. session-level recurrent neural network:
D21. according to the segmentation policy π, select from the sequence-level outputs the results at the segmentation time steps as the input of the session-level recurrent neural network; its length is |π|, denoted X^v = {x_1^v, x_2^v, …, x_{|π|}^v};
D22. compute h_j^v = GRU(h_{j-1}^v, x_j^v) with the recurrent neural network to obtain the output of the session-level recurrent neural network, denoted H^v = {h_1^v, h_2^v, …, h_{|π|}^v};
D3. expand the output by time step: expand the output H^v of length |π| into an output of length L according to the segmentation policy π.
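Steps D2 and D3 reduce to selecting sequence-level outputs at the segmentation boundaries and broadcasting each session state over the following session. In this sketch the session-level GRU is replaced by an identity placeholder, and the 0/1 boundary mask is an assumed encoding of the policy π:

```python
import numpy as np

def session_inputs_and_expand(H_seq, cut):
    """H_seq: (L, h) sequence-level outputs; cut: length-L list of 0/1 flags,
    where 1 marks the last time step of a session (D21)."""
    L, h = H_seq.shape
    boundaries = [t for t in range(L) if cut[t] == 1]
    X_sess = H_seq[boundaries]        # session-level inputs, length |pi|
    # placeholder: pretend the session-level GRU returns its inputs unchanged
    H_sess = X_sess
    # D3: expand back to length L; each step sees the final state of the
    # *previous* session, and the first session sees an all-zero vector.
    expanded = np.zeros((L, h))
    prev_state, b = np.zeros(h), 0
    for j, t in enumerate(boundaries):
        expanded[b:t + 1] = prev_state
        prev_state, b = H_sess[j], t + 1
    return X_sess, expanded
```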
In step 3, the policy network generates the action of each time step. Taking the action a_t generated at time step t as an example, this comprises the following steps:
E1. define the state s_t:
s_t = u_k ⊕ h_t^s ⊕ h_t^v,
where ⊕ denotes vector concatenation, and h_t^s and h_t^v are the outputs of the sequence-level and session-level recurrent neural networks at time step t, respectively;
E2. define the action space a_t:
a_t ∈ {1, 0},
where 1 indicates that the current behavior belongs to the current session and 0 indicates that it does not;
E3. define the policy function π:
π(a_t | s_t; Θ) = σ(W_π·s_t + b_π),
where W_π and b_π are parameters of the policy network; during training, the value of the action a_t is sampled according to the probability given by the policy π, while at test time the action with the larger probability under π is taken.
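Steps E1 to E3 describe a logistic policy over the concatenated state, sampled during training and taken greedily at test time. A sketch with assumed vector sizes, in which the meaning of the returned 0/1 action follows E2:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def policy_action(u_k, h_seq_t, h_sess_t, W_pi, b_pi, rng=None):
    """Returns a_t in {0, 1}: per E2, 1 means the current behavior belongs to
    the current session, 0 means a new session starts here."""
    s_t = np.concatenate([u_k, h_seq_t, h_sess_t])   # E1. state: concatenation
    p = sigmoid(W_pi @ s_t + b_pi)                   # E3. probability of action 1
    if rng is not None:                              # training: sample from pi
        return int(rng.random() < p)
    return int(p >= 0.5)                             # test: greedy action
```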
In step 3, the classifier network predicts the behavior at the next time step of the sequence. Taking the prediction of time step t+1 at time step t as an example, this comprises the following steps:
F1. concatenate the user representation with the sum of the outputs of the hierarchical recurrent neural network:
o_t = u_k ⊕ (h_t^s + h_t^v),
F2. apply a fully connected layer on top of it:
ŷ_{t+1} = softmax(W_o·o_t + b_o),
where W_o and b_o are parameters of the classifier network, its output dimension equals the number of locations M, and ŷ_{t+1} indicates the predicted location the user will visit at time step t+1, represented by one-hot encoding.
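F1 and F2 form a softmax classifier over the M locations. In this sketch the two recurrent outputs are summed before concatenation with the user vector, which is one reading of F1 and should be treated as an assumption, as should the toy sizes:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def predict_next(u_k, h_seq_t, h_sess_t, W_o, b_o):
    """Distribution over the M locations for time step t+1."""
    o_t = np.concatenate([u_k, h_seq_t + h_sess_t])  # F1. user repr + summed outputs
    return softmax(W_o @ o_t + b_o)                  # F2. fully connected + softmax

rng = np.random.default_rng(2)
M, d, h = 5, 3, 4
y_hat = predict_next(rng.normal(size=d), rng.normal(size=h), rng.normal(size=h),
                     rng.normal(size=(M, d + h)), np.zeros(M))
```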
In step 4, training according to the different objective functions comprises the following steps:
G1. when the policy network has generated the actions for a whole sequence, the segmentation of that sequence is also complete; first define the delayed reward function of the policy network over the whole sequence as:
R = (1/L) Σ_{t=1}^{L} log p(y_t | X_t) − γ f(L/L'),
where y_t is the true location label of the input X_L at time step t, represented by one-hot encoding, L' is the number of sessions in the sequence, γ is a hyperparameter balancing the two parts of the reward, and Q is a constant. Assuming that a session of moderate length is preferable, the unimodal function f(x) = x + Q/x is proposed; it attains its minimum value 2√Q at x = √Q, so the desired length of one session is about √Q. Different constraints on session length can be imposed by replacing the second term of the reward function;
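The session-length term of the reward uses the unimodal function f(x) = x + Q/x, minimized at x = √Q; a small numeric check:

```python
import math

def session_length_penalty(L, L_prime, Q=100.0):
    """f(L/L') = L/L' + Q/(L/L'); smallest when the average session length is sqrt(Q)."""
    x = L / L_prime
    return x + Q / x

# With Q = 100, the minimum of f is at average session length sqrt(100) = 10,
# where f equals 2 * sqrt(100) = 20; longer or shorter averages are penalised.
assert math.isclose(session_length_penalty(100, 10), 20.0)   # x = 10, the optimum
assert session_length_penalty(100, 5) > 20.0                 # x = 20, too long
assert session_length_penalty(100, 50) > 20.0                # x = 2, too short
```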
G2. define the policy-gradient update over one sequence in the policy network as:
∇_{Θ_π} J = Σ_{t=1}^{L} R ∇_{Θ_π} log π(a_t | s_t; Θ_π),
where Θ_π are the parameters of the policy network;
G3. define the cross-entropy function as the objective for training the classifier network:
L(Θ) = −Σ_{t=1}^{L} y_t · log ŷ_t + β‖Θ‖₂²,
where Θ represents all parameters of the classifier network and β is a hyperparameter weighing the two parts of the loss.
In step 4, training the parameters of the network model in stages comprises the following steps:
H1. pre-train the classifier network: apply the initial policy π_0 and the training samples, and use backpropagation to update the parameters of the classifier network, minimizing the cross-entropy loss function defined in G3;
H2. pre-train the policy network: keep the parameters of the classifier network fixed, and train the parameters of the policy network with the gradient update formula defined in G2;
H3. joint training: train the parameters of the whole network jointly until the loss converges.
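The three-stage schedule H1 to H3 can be orchestrated as follows; the three `*_step` callables are placeholders standing in for the actual TensorFlow updates, so this is only a control-flow sketch:

```python
def train(classifier_step, policy_step, joint_step, tol=1e-4, max_epochs=100):
    """Three-stage training; each *_step callable runs one epoch of updates."""
    for _ in range(max_epochs):      # H1. pre-train classifier under pi_0
        classifier_step()
    for _ in range(max_epochs):      # H2. pre-train policy, classifier frozen
        policy_step()
    prev, loss = float("inf"), None  # H3. joint training until the loss converges
    for _ in range(max_epochs):
        loss = joint_step()
        if abs(prev - loss) < tol:
            break
        prev = loss
    return loss
```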
In step 5, predicting the next probable behavior of a user in the test set with the network model based on the hierarchical recurrent neural network and reinforcement learning comprises the following steps:
I1. concatenate the user representation with the sum of the outputs of the hierarchical recurrent neural network at the last time step:
o_L = u_k ⊕ (h_L^s + h_L^v),
I2. apply a fully connected layer to predict the target:
ŷ = softmax(W_o·o_L + b_o),
where ŷ is the predicted location distribution, the true label is represented by one-hot encoding, and W_o and b_o are the parameters of the classifier network.
Compared with the prior art, the beneficial effects of the present invention include:
(1) user time series are modeled with a hierarchical recurrent neural network that takes user behavior representations at different levels into account, so that important sequential information can be effectively extracted;
(2) the segmentation of the user's time-series data is learned with a policy network, which considers the connections between the earlier and later parts of a sequence to divide it reasonably, while various constraints are incorporated into the reward function;
(3) the defects of manually divided sequences in short-session settings, such as the impossibility of providing one window size suitable for all users, are effectively resolved.
Description of the drawings
Fig. 1 is a flow diagram of the user sequential behavior segmentation and prediction method of the present invention.
Fig. 2 is a frame diagram of the whole network model in one embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below in conjunction with specific embodiments and the attached drawings. Except where specifically mentioned below, the processes, conditions, experimental methods and so on for implementing the present invention fall within the general principles and common general knowledge in the art, and the present invention places no special restrictions on them. It should be pointed out that those skilled in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the present invention.
The present invention provides an automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning, with the flow chart shown in Fig. 1. The method comprises the following steps:
Step 1: choose a data set, pre-process the data, and split the data into a training set, a validation set and a test set;
Step 2: convert the high-dimensional one-hot encodings of users and sequential behaviors into low-dimensional dense vectors by embedding, to be used as the input of the model;
Step 3: model the time-series data with a hierarchical recurrent neural network, use a policy network to generate an action at each time step that decides whether to segment the sequence there, and then use a classifier network to predict the behavior at the next time step of the sequence;
Step 4: train the model parameters, using the training samples to optimize the network model parameters in stages according to different objective functions, and tune the model parameters on the validation set;
Step 5: use the network model based on the hierarchical recurrent neural network and reinforcement learning to predict each test-set user's next probable behavior.
In more detail, first choose a data set; taking the Gowalla data set as an example, process it with Python as follows:
A1. sort each user's location sequence by timestamp from earliest to latest;
A2. filter out infrequent records: delete users with fewer than 10 behaviors, and delete items with fewer than 5 user behaviors;
A3. select a time window and record the resulting sequence segmentation as the initial policy π_0 of the policy network;
A4. for each user, use the last location of the time series as the test set, the second-to-last location as the validation set, and the rest as the training set.
By calling packages in TensorFlow and Python, the processing of the model input is completed, comprising the following steps:
B1. data encoding: suppose there are N users and M locations in total; with one-hot encoding, each user is represented by an N-dimensional sparse vector in which the user's own dimension is 1 and all other dimensions are 0, and locations are encoded in the same way;
B2. data embedding: map each N-dimensional user vector to a low-dimensional numerical vector space by embedding, to be used as the input of the model; denote the transformed user vector set as U = {u_1, u_2, …, u_N} and the location vector set as P = {p_1, p_2, …, p_M}.
Next, the hierarchical recurrent neural network is built with the GRU module and tensor operations in TensorFlow, comprising the following steps:
C1. sequence-level recurrent neural network:
C11. the input is a location sequence of length L, denoted X_L = {x_1, x_2, …, x_L};
C12. compute h_t^s = GRU(h_{t-1}^s, x_t) with the recurrent neural network to obtain the output at every time step of the sequence level, denoted H^s = {h_1^s, h_2^s, …, h_L^s};
C2. session-level recurrent neural network:
C21. according to the segmentation policy π, select from the sequence-level outputs the results at the segmentation time steps as the input of the session-level recurrent neural network; its length is |π|;
C22. compute the session-level recurrent neural network in the same way to obtain its output, denoted H^v;
C3. expand the output by time step: expand the output of length |π| into an output of length L according to the segmentation policy π. Specifically, the expanded output within a session is the hidden state of the final time step of the previous session, and the output for the first session is an all-zero vector.
The policy network is constructed with built-in functions of TensorFlow and used to generate the action at each time step. Taking the action a_t generated at time step t as an example, this comprises the following steps:
D1. compute the state s_t:
s_t = u_k ⊕ h_t^s ⊕ h_t^v,
where ⊕ denotes vector concatenation, and h_t^s and h_t^v are the outputs of the sequence-level and session-level recurrent neural networks at time step t, respectively;
D2. compute the policy function π:
π(a_t | s_t; Θ) = σ(W_π·s_t + b_π),
where W_π and b_π are parameters of the policy network; during training, the value of the action a_t is sampled according to the probability given by the policy π, while at test time the action with the larger probability under π is taken.
Using fully connected layers in TensorFlow, a classifier network is built to predict the behavior at the next time step of the sequence. Taking the prediction of time step t+1 at time step t as an example, this comprises the following steps:
E1. concatenate the user representation with the sum of the outputs of the hierarchical recurrent neural network:
o_t = u_k ⊕ (h_t^s + h_t^v),
E2. apply a fully connected layer on top of it:
ŷ_{t+1} = softmax(W_o·o_t + b_o),
where W_o and b_o are parameters of the classifier network, its output dimension equals the number of locations M, and ŷ_{t+1} indicates the predicted location the user will visit at time step t+1, represented by one-hot encoding.
By calling optimization functions such as backpropagation in TensorFlow, the parameters of the network model are trained according to the different objective functions, comprising the following steps:
F1. pre-train the classifier network: apply the initial policy π_0 and the training samples, using backpropagation with the cross-entropy function as the objective for training the classifier network:
L(Θ) = −Σ_{t=1}^{L} y_t · log ŷ_t + β‖Θ‖₂²,
where Θ represents all parameters of the classifier network and β is a hyperparameter weighing the two parts of the loss; this objective is minimized to update the parameters of the network;
F2. pre-train the policy network: keep the parameters of the classifier network fixed; the delayed reward function of the policy network over the whole sequence is:
R = (1/L) Σ_{t=1}^{L} log p(y_t | X_t) − γ f(L/L'),
where y_t is the true location label of the input X_L at time step t, represented by one-hot encoding, L' is the number of sessions in the sequence, γ is a hyperparameter balancing the two parts of the reward, and Q is a constant; in practice Q = 100 may be set, and different constraints on session length can be imposed by replacing the second term of the reward function. The gradient update over one sequence in the policy network is:
∇_{Θ_π} J = Σ_{t=1}^{L} R ∇_{Θ_π} log π(a_t | s_t; Θ_π),
where Θ_π are the parameters of the policy network, with which the policy network is trained;
F3. joint training: train the parameters of the whole network jointly until the loss converges.
Using the trained classifier-network parameters W_o, b_o, etc., the next probable behavior of each user in the test set is predicted, comprising the following steps:
G1. concatenate the user representation with the sum of the outputs of the hierarchical recurrent neural network at the last time step:
o_L = u_k ⊕ (h_L^s + h_L^v),
G2. apply a fully connected layer to predict the target:
ŷ = softmax(W_o·o_L + b_o),
where ŷ is the predicted location distribution, the true label is represented by one-hot encoding, and W_o and b_o are the parameters of the classifier network.
In practice, the following optional step may also be applied between the layers of the model: during training, use dropout and a two-norm regularization of the parameters to constrain the parameters and prevent over-fitting.
The frame diagram of the whole network model in one embodiment of the present invention is shown in Fig. 2:
H1. sequence-level and session-level recurrent neural networks: from the user's input sequence, the sequence-level and session-level recurrent neural networks extract sequence information representations at different levels;
H2. policy network: receives the user representation and the outputs of the hierarchical recurrent neural network, computes the delayed reward of the whole sequence with a fully connected layer, and updates the parameters of the network by gradient;
H3. classifier network: receives the user representation and the outputs of the hierarchical recurrent neural network, completes the prediction of the user's behavior at the next time step with a fully connected layer, and updates the parameters of the network with the backpropagation algorithm.
The method of the present invention is also applicable to other user sequential behaviors, such as user product-purchase sequences and user music-listening sequences; their implementation is essentially the same as this embodiment, and the detailed process is not described again.
The parameters in the above embodiment of the present invention are determined from experimental results: different parameter combinations are tried, the group of parameters with the best evaluation metrics on the validation set is chosen, and the evaluation result is obtained on the test set. In actual use, the above parameters can be adjusted appropriately according to demand, and the purpose of the present invention can still be achieved.
The protected content of the present invention is not limited to the above embodiments. Without departing from the spirit and scope of the inventive concept, various changes and advantages that are apparent to those skilled in the art are all included in the present invention, with the appended claims defining the scope of protection.

Claims (13)

1. An automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning, characterized in that the method comprises the following steps:
Step 1: choose a data set, pre-process the data, and split the data into a training set, a validation set and a test set;
Step 2: convert the high-dimensional one-hot encodings of users and sequential behaviors into low-dimensional dense vectors by embedding, to be used as the input of the model;
Step 3: model the time-series data with a hierarchical recurrent neural network, use a policy network to generate an action at each time step that decides whether to segment the sequence there, and then use a classifier network to predict the behavior at the next time step of the sequence;
Step 4: train the model parameters, using the training samples to optimize the network model parameters in stages according to different objective functions, and tune the model parameters on the validation set;
Step 5: use the network model based on the hierarchical recurrent neural network and reinforcement learning to predict each test-set user's next probable behavior.
2. The automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, characterized in that the user sequential behavior includes user check-in behavior, user product-purchase behavior, user web-page click behavior, user music-listening behavior, and all behavior data commonly used in this field.
3. The automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, characterized in that in step 1 the data set includes one or more of the Gowalla data set, the Foursquare data set and the Amazon data set, and all public behavior data sets commonly used in this field.
4. The automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, characterized in that in step 1 pre-processing the data comprises the following steps:
A1. sort each user's behavior sequence by timestamp from earliest to latest;
A2. filter out infrequent records in the data;
A3. select a time window and record the resulting sequence segmentation as the initial policy π_0 of the policy network.
5. The automatic segmentation and prediction method for user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, characterized in that in step 1 the data are split into a training set, a validation set and a test set as follows: for each user, the last location of the time series is used as the test set, the second-to-last location of the time series is used as the validation set, and the rest is used as the training set.
6. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, wherein in step 2, obtaining the input of the model comprises the following steps:
B1. data encoding: suppose there are N users and M places in total; one-hot encoding is used, i.e., a user is represented by an N-dimensional sparse vector whose unique feature dimension is set to 1 and whose remaining dimensions are all 0; places are encoded in the same way;
B2. data embedding: the N-dimensional user vectors are mapped into a low-dimensional dense numerical vector space by an embedding layer and then used as the input of the model; the transformed user vectors are denoted U = {u1, u2, ..., uN} and the place vectors P = {p1, p2, ..., pM}.
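Steps B1-B2 amount to a one-hot lookup into a learned embedding table; a minimal sketch, with toy sizes for N, M, and the embedding dimension (the random tables stand in for learned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, d = 4, 6, 3          # users, places, embedding dimension (toy sizes)

def one_hot(idx, size):
    """B1: sparse vector with a single 1 at the entity's unique dimension."""
    v = np.zeros(size)
    v[idx] = 1.0
    return v

# B2: embedding tables; in the real model these are learned parameters
U_emb = rng.normal(size=(N, d))   # user embedding table
P_emb = rng.normal(size=(M, d))   # place embedding table

# multiplying a one-hot vector by the table is just a row lookup
u2 = one_hot(2, N) @ U_emb        # dense vector u_2 fed to the model
```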
7. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, wherein in step 3, the recurrent neural network refers to, but is not limited to, a gated recurrent unit (GRU) network; a long short-term memory (LSTM) network may be used instead. Taking time step t as an example, and denoting by x_t the input at time step t, the computation comprises the following steps:
C1. computing the update gate z_t:
z_t = σ(W_z · [h_{t-1}, x_t] + b_z);
C2. computing the reset gate r_t:
r_t = σ(W_r · [h_{t-1}, x_t] + b_r);
C3. computing the candidate hidden state h̃_t:
h̃_t = tanh(W_h · [r_t ⊙ h_{t-1}, x_t] + b_h);
C4. computing the hidden state h_t:
h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t;
where σ is the sigmoid function, ⊙ denotes the Hadamard product, [·] denotes vector concatenation, · denotes matrix multiplication, and W_z, W_r, W_h, b_z, b_r, b_h are all learnable parameters of the model.
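A NumPy sketch of one GRU step following C1-C4; the parameter shapes and the random toy initialisation are assumptions of the sketch, not part of the claim:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU time step following steps C1-C4."""
    Wz, bz, Wr, br, Wh, bh = params
    hx = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]
    z = sigmoid(Wz @ hx + bz)                   # C1. update gate z_t
    r = sigmoid(Wr @ hx + br)                   # C2. reset gate r_t
    rhx = np.concatenate([r * h_prev, x_t])     # [r_t (.) h_{t-1}, x_t]
    h_tilde = np.tanh(Wh @ rhx + bh)            # C3. candidate state
    return (1 - z) * h_prev + z * h_tilde       # C4. hidden state h_t

d_in, d_h = 3, 2                                # toy dimensions
rng = np.random.default_rng(0)
params = tuple(rng.normal(scale=0.1, size=s) for s in
               [(d_h, d_h + d_in), (d_h,), (d_h, d_h + d_in), (d_h,),
                (d_h, d_h + d_in), (d_h,)])
h = gru_step(rng.normal(size=d_in), np.zeros(d_h), params)
```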
8. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, wherein in step 3, the time-series data are modeled with a hierarchical recurrent neural network. Taking the user representation u_k as an example, the modeling comprises the following steps:
D1. sequence-level recurrent neural network:
D11. the input is a place sequence of length L, denoted X_L = {x_1, x_2, ..., x_L};
D12. the recurrent neural network of claim 7 is applied to obtain the output at each time step, denoted {h_1^seq, h_2^seq, ..., h_L^seq};
D2. session-level recurrent neural network:
D21. according to the segmentation policy π, the sequence-level outputs at the selected cut time steps are taken as the input of the session-level recurrent neural network; the input length is |π|;
D22. the recurrent neural network of claim 7 is applied to obtain the session-level outputs, denoted {h_1^sess, ..., h_{|π|}^sess};
D3. expanding the outputs by time step: the length-|π| session-level outputs are expanded back to an output of length L according to the segmentation policy π.
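The session-level selection (D21) and the expansion back to length L (D3) can be sketched as follows; treating the session-level RNN as the identity and marking a new session with action 0 are simplifying assumptions of the sketch:

```python
import numpy as np

def hierarchical_io(seq_out, actions):
    """Sketch of steps D21 and D3.  actions[t] == 0 marks the start of a
    new session; the session-level RNN consumes the sequence-level
    outputs at those boundary steps, and its |pi| outputs are expanded
    back to length L by repeating each one over its session."""
    boundaries = [t for t, a in enumerate(actions) if a == 0]
    selected = seq_out[boundaries]        # D21: input to session-level RNN
    sess_out = selected                   # identity stands in for the RNN
    # D3: expand the |pi| outputs back to length L
    expanded = np.empty_like(seq_out)
    for i, start in enumerate(boundaries):
        end = boundaries[i + 1] if i + 1 < len(boundaries) else len(actions)
        expanded[start:end] = sess_out[i]
    return selected, expanded
```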
9. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, wherein in step 3, the policy network generates an action at each time step. Taking the action a_t generated at time step t as an example, the generation comprises the following steps:
E1. defining the state s_t:
s_t = h_t^seq ⊕ h_t^sess,
where ⊕ denotes vector concatenation, and h_t^seq and h_t^sess are the outputs of the sequence-level and session-level recurrent neural networks at time step t, respectively;
E2. defining the action space a_t:
a_t ∈ {1, 0},
where 1 indicates that the current behavior belongs to the current session and 0 indicates that it does not;
E3. defining the policy function π:
π(a_t | s_t; Θ) = σ(W_π · s_t + b_π),
where W_π and b_π are parameters of the policy network; during training, the value of a_t is sampled according to the probability given by the policy π, while at test time a_t is taken as the action with the higher probability.
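Steps E1-E3 can be sketched as follows; the parameter values are placeholders, and sampling versus thresholding follows the train/test distinction in the claim:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def policy_action(h_seq, h_sess, W_pi, b_pi, rng=None):
    """Sketch of steps E1-E3: the state is the concatenation of the two
    RNN outputs and the policy gives P(a_t = 1 | s_t).  Pass an rng to
    sample (training); pass None to threshold at 0.5 (testing)."""
    s_t = np.concatenate([h_seq, h_sess])       # E1. state s_t
    p = sigmoid(W_pi @ s_t + b_pi)              # E3. pi(a_t = 1 | s_t)
    if rng is not None:
        return int(rng.random() < p), p         # sample during training
    return int(p > 0.5), p                      # greedy at test time

W = np.zeros(4)                                 # placeholder parameters
a, p = policy_action(np.ones(2), np.ones(2), W, 0.0)
```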
10. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, wherein in step 3, the classifier network predicts the behavior at the next time step of the sequence. Taking time step t predicting time step t+1 as an example, the prediction comprises the following steps:
F1. concatenating the user representation with the sum of the outputs of the hierarchical recurrent neural network:
o_t = u_k ⊕ (h_t^seq + h_t^sess);
F2. adding a fully connected layer on top of it:
ŷ_{t+1} = softmax(W_o · o_t + b_o),
where W_o and b_o are the parameters of the classifier network, whose output dimension equals the number of places M, and ŷ_{t+1} indicates the predicted place the user goes to at time step t+1, represented as an M-dimensional sparse one-hot vector.
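Steps F1-F2 can be sketched as follows; the dimensions are toy values, and the softmax is omitted because it does not change the argmax:

```python
import numpy as np

def predict_next(u_k, seq_out, sess_out, W_o, b_o):
    """Sketch of steps F1-F2: concatenate the user vector with the sum of
    the two hierarchical RNN outputs, apply one fully connected layer,
    and take the argmax over the M places."""
    feat = np.concatenate([u_k, seq_out + sess_out])   # F1
    scores = W_o @ feat + b_o                          # F2, M-dimensional
    return int(np.argmax(scores)), scores

d, d_h, M = 3, 6, 4                                    # toy dimensions
rng = np.random.default_rng(1)
place, scores = predict_next(rng.normal(size=d),
                             rng.normal(size=d_h),
                             rng.normal(size=d_h),
                             rng.normal(size=(M, d + d_h)),
                             np.zeros(M))
```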
11. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, wherein in step 4, the objective functions are defined as follows:
G1. once the policy network has finished generating the action sequence, the delayed reward of the whole sequence is defined as a weighted combination of a prediction-accuracy term and a session-number term,
where y_t is the true place label of the input X_L at time step t, represented in one-hot encoding, L' is the number of sessions in the sequence, γ is the hyperparameter balancing the two reward terms, and Q is a constant;
G2. the gradient-update formula of the policy network over one sequence is defined following the policy-gradient (REINFORCE) formulation,
where Θ_π denotes the parameters of the policy network;
G3. the cross-entropy function is defined as the objective for training the classifier network,
where Θ denotes all parameters of the classifier network and β is the hyperparameter weighing the two parts of the loss.
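The per-sequence gradient update in step G2 follows the standard REINFORCE form; a minimal sketch (assuming a single delayed reward R for the whole sequence, consistent with step G1; the log-probability gradients are taken as given):

```python
import numpy as np

def reinforce_grad(log_pi_grads, reward):
    """REINFORCE policy gradient for one sequence:
    sum_t R * d(log pi(a_t | s_t)) / d(theta_pi),
    with a single delayed reward R for the whole action sequence."""
    return reward * np.sum(log_pi_grads, axis=0)

# 3 time steps, 2 policy parameters, all log-prob gradients equal to 1
grads = reinforce_grad(np.ones((3, 2)), 0.5)
```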
12. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 11, wherein in step 4, the parameters of the network model are trained in stages, comprising the following steps:
H1. pre-training the classifier network: with the initial policy π0 and the training samples, the parameters of the classifier network are updated by backpropagation so as to minimize the objective function described in step G3;
H2. pre-training the policy network: with the parameters of the classifier network kept fixed, the parameters of the policy network are trained by the gradient-update formula described in step G2;
H3. joint training: all parameters of the whole network are trained jointly until the loss converges.
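The three training stages H1-H3 can be sketched schematically; the callback arguments are hypothetical stand-ins for the real optimisation routines:

```python
def three_stage_training(train_classifier, train_policy, joint_step,
                         converged, max_epochs=100):
    """Schematic of steps H1-H3 with hypothetical callbacks."""
    train_classifier()                 # H1. pre-train classifier under pi_0
    train_policy()                     # H2. pre-train policy, classifier frozen
    for epoch in range(max_epochs):    # H3. joint training until convergence
        loss = joint_step()
        if converged(loss):
            return epoch, loss
    return max_epochs, loss
```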
13. The method for automatically segmenting and predicting user sequential behavior based on a hierarchical recurrent neural network and reinforcement learning according to claim 1, wherein in step 5, the next probable behavior of each user in the test set is predicted with the network model based on the hierarchical recurrent neural network and reinforcement learning, comprising the following steps:
I1. concatenating the user representation with the sum of the outputs of the hierarchical recurrent neural network at the last time step;
I2. adding a fully connected layer to predict the target:
ŷ = softmax(W_o · o_L + b_o),
where ŷ is the predicted place distribution over the M places, o_L denotes the concatenated feature at the last time step, and W_o, b_o are the parameters of the classifier network.
CN201910279004.0A 2019-04-09 2019-04-09 Automatic segmentation prediction method for user time sequence behavior Active CN110110372B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910279004.0A CN110110372B (en) 2019-04-09 2019-04-09 Automatic segmentation prediction method for user time sequence behavior


Publications (2)

Publication Number Publication Date
CN110110372A true CN110110372A (en) 2019-08-09
CN110110372B CN110110372B (en) 2023-04-18

Family

ID=67483968

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910279004.0A Active CN110110372B (en) 2019-04-09 2019-04-09 Automatic segmentation prediction method for user time sequence behavior

Country Status (1)

Country Link
CN (1) CN110110372B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160484A (en) * 2019-12-31 2020-05-15 腾讯科技(深圳)有限公司 Data processing method and device, computer readable storage medium and electronic equipment
CN112001536A (en) * 2020-08-12 2020-11-27 武汉青忆辰科技有限公司 High-precision finding method for minimal sample of mathematical capability point defect of primary and secondary schools based on machine learning
CN112525213A (en) * 2021-02-10 2021-03-19 腾讯科技(深圳)有限公司 ETA prediction method, model training method, device and storage medium
CN114417817A (en) * 2021-12-30 2022-04-29 中国电信股份有限公司 Session information cutting method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105787100A (en) * 2016-03-18 2016-07-20 浙江大学 User session recommendation method based on deep neural network
CN108595602A (en) * 2018-04-20 2018-09-28 昆明理工大学 The question sentence file classification method combined with depth model based on shallow Model
CN108647251A (en) * 2018-04-20 2018-10-12 昆明理工大学 The recommendation sort method of conjunctive model is recycled based on wide depth door
US20180342004A1 (en) * 2017-05-25 2018-11-29 Microsoft Technology Licensing, Llc Cumulative success-based recommendations for repeat users
CN109241431A (en) * 2018-09-07 2019-01-18 腾讯科技(深圳)有限公司 A kind of resource recommendation method and device


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BALÁZS HIDASI: "Session-based Recommendations with Recurrent Neural Networks", ICLR 2016 *
DONGYANG ZHAO: "Deep Hierarchical Reinforcement Learning Based Recommendations via Multi-goals Abstraction", arXiv:1903.09374v1 [cs.LG] *
MINMIN CHEN: "Top-K Off-Policy Correction for a REINFORCE Recommender System", Proceedings of the Twelfth ACM International *
WANG Boli: "A Recurrent-Neural-Network-Based Method for Sentence Segmentation of Classical Chinese", Journal of Peking University (Natural Science Edition) *


Also Published As

Publication number Publication date
CN110110372B (en) 2023-04-18


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant