Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the protection scope of the present invention.
An embodiment of the present invention provides a sequential recommendation method based on differentiated modeling of user behavior. As shown in Figure 1, the method mainly includes the following steps:
Step 1: Obtain the user's historical behavior information.
Each user leaves a series of log records in the backend while browsing an online platform. These records have a specific temporal order and cover item-related operations such as browsing, clicking, adding to the shopping cart, favoriting, and purchasing. Such data can be collected directly from an online shopping platform or an online service provider.
In this embodiment, the acquired historical behavior information takes the form of an interaction item sequence. The interaction item sequence of user u is denoted $S_u^b$, where x denotes an item, its subscript is the item's index in the sequence, and b denotes the user's operation behavior. b is a one-hot vector whose length equals the number of interaction types.
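As an illustration of this data format, the sketch below turns raw log rows into a time-ordered list of (item, one-hot behavior) pairs. The behavior names, the `(timestamp, item_id, behavior)` row layout, and the helper names are assumptions made for the example and are not specified by the method.

```python
from typing import List, Tuple

# Hypothetical interaction types; the text lists browsing, clicking,
# adding to the cart, favoriting and purchasing as examples.
BEHAVIORS = ["browse", "click", "cart", "favorite", "purchase"]


def one_hot(behavior: str) -> List[int]:
    """One-hot vector whose length is the number of interaction types."""
    vec = [0] * len(BEHAVIORS)
    vec[BEHAVIORS.index(behavior)] = 1
    return vec


def build_sequence(log: List[Tuple[int, str, str]]) -> List[Tuple[str, List[int]]]:
    """Sort (timestamp, item_id, behavior) rows by time and encode each behavior."""
    return [(item, one_hot(behavior)) for _, item, behavior in sorted(log)]


# Example log with distinct timestamps, deliberately out of order.
log = [(3, "x7", "purchase"), (1, "x2", "browse"), (2, "x7", "cart")]
sequence = build_sequence(log)
```

Here `sequence` plays the role of the interaction item sequence of one user: its order follows the timestamps, and every behavior vector has the same length as `BEHAVIORS`.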
Step 2: Compute item feature vectors from the user's historical behavior information.
In this embodiment, the item feature vectors are generated by modeling the sequential relationships among items in the user's behavior with a Skip-gram model based on negative sampling. The main process is as follows:
Item feature vectors are built from the interaction item sequence $S_u^b$ of user u by maximizing a Skip-gram objective over that sequence. In this objective, N is the length of the interaction item sequence $S_u^b$, and $p(x_j \mid x_i)$ is the probability relating items $x_j$ and $x_i$, which takes the form of a softmax function. In the softmax, $w_i$ and $v_i$ are the latent vector and the target vector corresponding to the context representation of item $x_i$; $w_j$ is the latent vector corresponding to the context representation of item $x_j$; $w_{k'}$ is the latent vector corresponding to the context representation of item $x_{k'}$; and $k'$ ranges from 1 to N.
To reduce the computational cost of the gradient, the softmax above is replaced by a negative-sampling form, in which $\sigma(r) = 1/(1+\exp(-r))$ is the sigmoid function and E is the number of negative samples drawn for each positive sample. Here a positive sample is an item related to the context of $x_i$, and a negative sample is an unrelated item. The value of E can be set by the user according to actual conditions or experience.
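The formulas for this step are not reproduced in the text. Assuming the standard Skip-gram formulation and the symbols defined above, the objective, the softmax, and its negative-sampling replacement would plausibly take the following form. The inner-product arguments and the negative-sample index $k$ are assumptions, not the original notation.

```latex
\begin{align}
  &\max \; \frac{1}{N} \sum_{i=1}^{N} \sum_{\substack{j=1 \\ j \neq i}}^{N} \log p(x_j \mid x_i) \\
  &p(x_j \mid x_i) = \frac{\exp\!\left(w_j^{\top} v_i\right)}{\sum_{k'=1}^{N} \exp\!\left(w_{k'}^{\top} v_i\right)} \\
  &p(x_j \mid x_i) \approx \sigma\!\left(w_j^{\top} v_i\right) \prod_{k=1}^{E} \sigma\!\left(-\,w_k^{\top} v_i\right)
\end{align}
```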
Because different items occur with different frequencies, the negative sampling process above is subject to some noise. The formula is therefore redefined by weighting each item according to its number of occurrences, where $\Theta(x_i)$ is the frequency of item $x_i$ in the interaction item sequence. The goal of the item embedding is then to maximize the resulting objective function.
The item feature vectors $P_u = \{v_1, v_2, \ldots, v_N\}$ are then obtained by gradient descent, where $v_j$ is the d-dimensional feature vector of item $x_j$.
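A minimal sketch of Skip-gram training with negative sampling is given below, under several simplifying assumptions that are not taken from the text: every other item in the sequence is treated as context (no sliding window), negatives are drawn uniformly, and the frequency weighting by $\Theta(x_i)$ is omitted. The function name and hyperparameter values are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence


def sigmoid(r: float) -> float:
    return 1.0 / (1.0 + math.exp(-r))


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def train_item_vectors(sequence: Sequence[str], dim: int = 8, num_neg: int = 3,
                       epochs: int = 50, lr: float = 0.05,
                       seed: int = 0) -> Dict[str, List[float]]:
    """Learn one dim-dimensional vector per item; num_neg plays the role of E."""
    rng = random.Random(seed)
    items = sorted(set(sequence))
    v = {x: [rng.uniform(-0.5, 0.5) for _ in range(dim)] for x in items}  # target vectors
    w = {x: [0.0] * dim for x in items}                                   # context vectors
    for _ in range(epochs):
        for i, xi in enumerate(sequence):
            for j, xj in enumerate(sequence):
                if i == j:
                    continue
                # One positive pair plus num_neg uniformly sampled negatives.
                # A sampled negative may coincide with the positive item;
                # this sketch does not filter such collisions.
                samples = [(xj, 1.0)] + [(rng.choice(items), 0.0) for _ in range(num_neg)]
                grad_v = [0.0] * dim
                for xk, label in samples:
                    # Gradient of the log-sigmoid objective w.r.t. the score.
                    g = lr * (label - sigmoid(dot(w[xk], v[xi])))
                    for d in range(dim):
                        grad_v[d] += g * w[xk][d]
                        w[xk][d] += g * v[xi][d]
                # Ascent on the objective, i.e. descent on its negative.
                for d in range(dim):
                    v[xi][d] += grad_v[d]
    return v


item_vectors = train_item_vectors(["a", "b", "c", "a", "b"], dim=4, epochs=5)
```

The returned dictionary maps each distinct item to its feature vector, which corresponds to the set $P_u$ when the vectors are read off in sequence order.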
Step 3: Perform sequence modeling on the item feature vectors using the differentiated behavior modeling method, and obtain the user's current demand and history preference through two different neural network architectures.
Once the item feature vectors are obtained, differentiated behavior modeling takes the continuous behavior as a prior and aims to recommend the items that the target user is most likely to access on the next visit. The user's decision process is mainly influenced by two factors: current motivation and history preference. More specifically, the user's current consumption motivation is dynamic in the short term, and recent fluctuations are important for reflecting short-term characteristics. Since all recent behaviors (such as clicking, favoriting, adding to the shopping cart, and purchasing) may signal the user's current short-term motivation, the present invention uses all types of recent behavior to represent the current consumption motivation. On the other hand, not every type of behavior describes the user's history preference. To model the user's long-term preference, the present invention retains from the interaction history only those behaviors that explicitly express the user's underlying preference, namely purchase behavior. In fact, the user's interaction process is a series of implicit feedback over time. Therefore, unlike conventional recommendation systems that explore user-item interactions in a static manner, next-item recommendation is handled here through sequence modeling. Specifically, two distinct behavior modeling processes are designed, session behavior modeling and preference behavior modeling, to learn the user's current consumption motivation and long-term stable preference separately. On this basis, two deep recurrent neural networks based on LSTM are devised to jointly learn the arrangement of motivation and preference behaviors.
First, session behavior modeling is performed. The item feature vectors $P_u = \{v_1, v_2, \ldots, v_N\}$ correspond to the interaction item sequence $S_u^b$. The following indicator function determines whether item $x_i$ falls within the range of the current session behavior:
$D_{SBL}(x_i, x_N) = \Phi\big((N - i) \le T_s\big)$;
where $\Phi(a)$ is a Boolean function that equals 1 when a is true and 0 otherwise; $T_s$ is the number of time steps that controls the length of the session behavior; and $x_N$ is the last item in the current interaction item sequence $S_u^b$.
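The indicator can be read as a window over the last few positions of the sequence. The sketch below applies it with 1-based positions to match the indices in the text; the function names are hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def in_session(i: int, n: int, ts: int) -> int:
    """D_SBL(x_i, x_N) = Phi((N - i) <= Ts): 1 if position i is within Ts steps of the last item."""
    return 1 if (n - i) <= ts else 0


def session_window(sequence: Sequence[T], ts: int) -> List[T]:
    """Keep only the entries whose indicator value is 1."""
    n = len(sequence)
    # Positions are 1-based, as in the text (i = 1..N).
    return [entry for i, entry in enumerate(sequence, start=1) if in_session(i, n, ts)]


recent = session_window(["a", "b", "c", "d", "e"], ts=2)
```

With N = 5 and Ts = 2, positions 3 to 5 satisfy (N - i) <= Ts, so the window holds the last three entries.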
After the LSTM matrices are defined and initialized, at the t-th iteration step the hidden state $h_t$ is updated from the hidden state $h_{t-1}$ of the previous time step, the currently input item feature vector $v_t$, and the behavior vector $b_t$. The final step of the update is:
$h_t = o_t \tanh(c_t)$
where $i_t$, $f_t$, and $o_t$ are the input gate, forget gate, and output gate at the t-th iteration step; $c_t$ is the memory module of the network unit; $b_t$ is the user operation behavior corresponding to the t-th item, input at the t-th iteration step; $W_{vi}$, $W_{hi}$, $W_{ci}$, $W_{bi}$ are the weights of $v_t$, $h_{t-1}$, $c_{t-1}$, $b_t$ in the input gate $i_t$; $W_{vf}$, $W_{hf}$, $W_{cf}$, $W_{bf}$ are the weights of $v_t$, $h_{t-1}$, $c_{t-1}$, $b_t$ in the forget gate $f_t$; $W_{vc}$, $W_{hc}$, $W_{bc}$ are the weights of $v_t$, $h_{t-1}$, $b_t$ in the memory module; $W_{vo}$, $W_{ho}$, $W_{co}$, $W_{bo}$ are the weights of $v_t$, $h_{t-1}$, $c_{t-1}$, $b_t$ in the output gate $o_t$; the bias terms correspond to the input gate $i_t$, the forget gate $f_t$, the output gate $o_t$, and the memory module $c_t$ respectively; $h_t$ is the output of the current state; and tanh is the hyperbolic tangent function.
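Only the last update equation survives in the text. Based on the weight definitions above, the full update would plausibly read as follows. The bias symbols $\beta_i$, $\beta_f$, $\beta_c$, $\beta_o$ are placeholders chosen here because the original symbols are not available, and $\odot$ denotes the element-wise product that the surviving equation writes as plain juxtaposition.

```latex
\begin{align}
  i_t &= \sigma\!\left(W_{vi} v_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + W_{bi} b_t + \beta_i\right) \\
  f_t &= \sigma\!\left(W_{vf} v_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + W_{bf} b_t + \beta_f\right) \\
  c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\!\left(W_{vc} v_t + W_{hc} h_{t-1} + W_{bc} b_t + \beta_c\right) \\
  o_t &= \sigma\!\left(W_{vo} v_t + W_{ho} h_{t-1} + W_{co} c_{t-1} + W_{bo} b_t + \beta_o\right) \\
  h_t &= o_t \odot \tanh(c_t)
\end{align}
```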
The user's current purchasing demand is then expressed as:
$\Psi_{SBL} = h_N$;
In the above process, the number of iterations equals the number of items in the interaction item sequence, i.e. t = 1, 2, ..., N, and $h_N$ is the output after the last item $x_N$ of the sequence is input, i.e. the output of the N-th iteration step.
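The session model can be sketched as an LSTM cell whose gates also receive the behavior vector and the previous cell state, as the weight definitions above describe. The sketch below is untrained and makes several assumptions: full weight matrices, zero biases, small random initialization, and the class and method names are all hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence, Tuple

Vector = List[float]


def sigmoid(r: float) -> float:
    return 1.0 / (1.0 + math.exp(-r))


def matvec(m: List[Vector], x: Sequence[float]) -> Vector:
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def add(*vectors: Sequence[float]) -> Vector:
    return [sum(vals) for vals in zip(*vectors)]


class BehaviorLSTMCell:
    """LSTM cell fed with item vector v_t, behavior vector b_t, h_{t-1} and c_{t-1}."""

    def __init__(self, item_dim: int, behavior_dim: int, hidden_dim: int, seed: int = 0):
        rng = random.Random(seed)
        sizes = {"v": item_dim, "h": hidden_dim, "c": hidden_dim, "b": behavior_dim}

        def mat(cols: int) -> List[Vector]:
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(hidden_dim)]

        self.hidden_dim = hidden_dim
        # Gates i, f, o see (v, h, c, b); the memory candidate "c" sees (v, h, b) only.
        self.W: Dict[str, Dict[str, List[Vector]]] = {
            g: {k: mat(sizes[k]) for k in ("vhb" if g == "c" else "vhcb")} for g in "ifco"
        }
        self.bias = {g: [0.0] * hidden_dim for g in "ifco"}

    def _pre(self, gate: str, inputs: Dict[str, Sequence[float]]) -> Vector:
        # Weighted sum of every input this gate has a weight matrix for, plus bias.
        terms = [matvec(m, inputs[k]) for k, m in self.W[gate].items()]
        return add(self.bias[gate], *terms)

    def step(self, v, b, h_prev, c_prev) -> Tuple[Vector, Vector]:
        x = {"v": v, "h": h_prev, "c": c_prev, "b": b}
        i = [sigmoid(z) for z in self._pre("i", x)]
        f = [sigmoid(z) for z in self._pre("f", x)]
        cand = [math.tanh(z) for z in self._pre("c", x)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, cand)]
        o = [sigmoid(z) for z in self._pre("o", x)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]  # h_t = o_t tanh(c_t)
        return h, c

    def last_hidden(self, pairs: Sequence[Tuple[Vector, Vector]]) -> Vector:
        """Run over all (v_t, b_t) pairs and return h_N."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for v, b in pairs:
            h, c = self.step(v, b, h, c)
        return h


cell = BehaviorLSTMCell(item_dim=3, behavior_dim=2, hidden_dim=4)
pairs = [([0.1, 0.2, 0.3], [1.0, 0.0]), ([0.0, -0.1, 0.5], [0.0, 1.0])]
psi_sbl = cell.last_hidden(pairs)
```

In this sketch `psi_sbl` stands for $\Psi_{SBL} = h_N$. In practice the weights would be trained, and the input pairs would be restricted to the session window selected by the indicator function.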
Next, the user's history preference is modeled. For each item-behavior pair $(v_i, b_i) \in S_u^b$ operated on by the user, the indicator function is expressed as:
$D_{PBL}(v_i, b_i) = \Phi(b_i \in P)$;
where P is the set of preference behaviors, which mainly includes purchasing, favoriting, and adding to the shopping cart.
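The preference indicator amounts to a filter on the behavior type. The sketch below uses behavior labels instead of one-hot vectors for readability; the label strings and function names are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical labels for the preference behavior set P named in the text.
PREFERENCE_BEHAVIORS = {"purchase", "favorite", "cart"}


def is_preference(behavior: str) -> int:
    """D_PBL(v_i, b_i) = Phi(b_i in P)."""
    return 1 if behavior in PREFERENCE_BEHAVIORS else 0


def preference_pairs(sequence: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep the (item, behavior) pairs whose behavior belongs to P, in original order."""
    return [(item, b) for item, b in sequence if is_preference(b)]


history = [("x1", "click"), ("x2", "purchase"), ("x3", "browse"), ("x4", "cart")]
kept = preference_pairs(history)
```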
The user's preference representation is learned with a bidirectional LSTM network. At each time step of history preference modeling there are two hidden-layer outputs. For the s-th time step, the forward output is determined by the forward output of the previous time step and the current item-behavior pair $(v_s, b_s)$; the backward output is determined by the backward output of the following time step and the current item-behavior pair $(v_s, b_s)$. In both directions, the final step of the update is:
$h_s = o_s \tanh(c_s)$
where $i_s$, $f_s$, and $o_s$ are the input gate, forget gate, and output gate at the s-th time step; $c_s$ is the memory module of the network unit; $b_s$ is the user operation behavior corresponding to the s-th item, input at the s-th iteration step; $W_{vi}'$, $W_{hi}'$, $W_{ci}'$, $W_{bi}'$ are the weights of $v_s$, $h_{s-1}$, $c_{s-1}$, $b_s$ in the input gate $i_s$; $W_{vf}'$, $W_{hf}'$, $W_{cf}'$, $W_{bf}'$ are the weights of $v_s$, $h_{s-1}$, $c_{s-1}$, $b_s$ in the forget gate $f_s$; $W_{vc}'$, $W_{hc}'$, $W_{bc}'$ are the weights of $v_s$, $h_{s-1}$, $b_s$ in the memory module; $W_{vo}'$, $W_{ho}'$, $W_{co}'$, $W_{bo}'$ are the weights of $v_s$, $h_{s-1}$, $c_{s-1}$, $b_s$ in the output gate $o_s$; the bias terms correspond to the input gate $i_s$, the forget gate $f_s$, the output gate $o_s$, and the memory module $c_s$ respectively; and $h_s$ is the output of the current state. In the forward pass, the output $h_s$ of the current state is the forward output; in the backward pass, it is the backward output.
Through the bidirectional LSTM network, a preference characterization vector of the current user is obtained at each time step as the concatenation of the forward and backward outputs of that step. The user's history preference $\Psi_{PBL}$ is then obtained by average pooling over these vectors.
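The bidirectional pass, the per-step concatenation, and the average pooling can be sketched independently of the cell internals. In the sketch below, `toy_step` is a hypothetical stand-in for an LSTM cell (it keeps no cell state) and exists only to make the example runnable; the sequence is assumed to be non-empty.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Pair = Tuple[Vector, Vector]                        # (item vector v_s, behavior vector b_s)
Step = Callable[[Vector, Vector, Vector], Vector]   # (v_s, b_s, h_prev) -> h_s


def run(step: Step, pairs: Sequence[Pair], hidden_dim: int) -> List[Vector]:
    """Apply a recurrent step along the sequence and collect every hidden output."""
    h = [0.0] * hidden_dim
    outputs = []
    for v, b in pairs:
        h = step(v, b, h)
        outputs.append(h)
    return outputs


def preference_vector(forward: Step, backward: Step,
                      pairs: Sequence[Pair], hidden_dim: int) -> Vector:
    """Concatenate forward and backward outputs per step, then average over steps."""
    fwd = run(forward, pairs, hidden_dim)
    # The backward pass reads the sequence in reverse; flip its outputs back to time order.
    bwd = run(backward, list(reversed(pairs)), hidden_dim)[::-1]
    spliced = [f + b for f, b in zip(fwd, bwd)]      # list concatenation = vector splicing
    n = len(spliced)                                  # assumes at least one time step
    return [sum(column) / n for column in zip(*spliced)]


def toy_step(scale: float) -> Step:
    """Hypothetical stand-in for an LSTM cell."""
    return lambda v, b, h: [math.tanh(scale * (sum(v) + sum(b)) + 0.5 * hk) for hk in h]


pairs = [([0.2, 0.1], [1.0, 0.0]), ([0.4, -0.3], [0.0, 1.0]), ([0.0, 0.5], [1.0, 0.0])]
psi_pbl = preference_vector(toy_step(0.3), toy_step(-0.2), pairs, hidden_dim=2)
```

The resulting `psi_pbl` has twice the hidden dimension, because each step contributes both a forward and a backward output before pooling.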
Step 4: Based on the user's current purchasing demand and history preference, predict the next item of interest to the user through joint learning, match the prediction in the item vector space, and generate a recommended item sequence from the items in that space most closely related to the prediction.
In this embodiment, a fully connected layer combines the user's current purchasing demand $\Psi_{SBL}$ and history preference $\Psi_{PBL}$ to compute the prediction vector of the next item of interest to the user. In this layer, the two weight terms correspond to the current purchasing demand and the history preference respectively, and Bias denotes the model bias.
During model training, suppose the true vector of the next item of interest to the user is $v_{T+1} = (y_1, y_2, \ldots, y_d)$. The loss function of the model can then be defined between the prediction vector and this true vector, where d is the dimension of the vectors.
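The prediction and matching step can be sketched as follows. Three assumptions are made here that the text does not confirm: the fully connected layer is linear, the loss is the squared error summed over the d components, and "most closely related" is measured by squared Euclidean distance. All names and numeric values are illustrative.

```python
from typing import Dict, List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, x: Sequence[float]) -> Vector:
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def predict_next(psi_sbl: Vector, psi_pbl: Vector,
                 w_sbl: Matrix, w_pbl: Matrix, bias: Vector) -> Vector:
    """Linear fully connected layer: W_SBL * Psi_SBL + W_PBL * Psi_PBL + Bias."""
    a = matvec(w_sbl, psi_sbl)
    b = matvec(w_pbl, psi_pbl)
    return [x + y + z for x, y, z in zip(a, b, bias)]


def squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Assumed loss: squared differences summed over the d components."""
    return sum((p - y) ** 2 for p, y in zip(pred, target))


def recommend(pred: Vector, item_vectors: Dict[str, Vector], k: int) -> List[str]:
    """Return the k items whose vectors lie closest to the prediction."""
    return sorted(item_vectors, key=lambda x: squared_error(pred, item_vectors[x]))[:k]


psi_sbl = [1.0, 0.0]
psi_pbl = [0.0, 1.0, 0.0, 0.0]
w_sbl = [[1.0, 0.0], [0.0, 1.0]]
w_pbl = [[0.0, 0.5, 0.0, 0.0], [0.0, 0.5, 0.0, 0.0]]
bias = [0.0, 0.0]

pred = predict_next(psi_sbl, psi_pbl, w_sbl, w_pbl, bias)
items = {"x1": [1.0, 0.0], "x2": [2.0, 2.0], "x3": [-1.0, -1.0]}
top2 = recommend(pred, items, k=2)
```

During training, `squared_error(pred, true_next_vector)` would be minimized with respect to the layer weights and the two recurrent networks jointly; at inference time only `recommend` is needed.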
In the above scheme of the embodiment of the present invention, the user's historical behavior records are divided according to their time order, and the sequence information of different users is reflected in the construction of the item feature vectors and in the differentiated user behavior modeling method. Item feature vectors are generated with an item embedding method, and differentiated sequence modeling is applied to the user's different behaviors toward items, so that the user's current demand and history preference are learned separately and the next item of interest to the user is predicted. The method combines the user's history preference with the current demand, models the differing preferences for items expressed by the user's different behaviors, and dynamically learns the user's decision process with recurrent neural networks. It thereby generates a personalized recommendation sequence for the user and remedies the lack of dynamics and personalization in existing methods.
From the above description of the embodiments, those skilled in the art can clearly understand that the above embodiments can be implemented in software, or in software together with a necessary general-purpose hardware platform. Based on this understanding, the technical solution of the above embodiments can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the embodiments of the present invention.
The foregoing is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.