CN110287412A - Content recommendation method, recommendation model generation method, device, and storage medium - Google Patents


Info

Publication number
CN110287412A
CN110287412A (application no. CN201910498647.4A)
Authority
CN
China
Prior art keywords
content
sequence
contents
user
recommended models
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910498647.4A
Other languages
Chinese (zh)
Other versions
CN110287412B (en)
Inventor
原发杰
何向南
黄帆
涂建超
熊健
何秀强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910498647.4A priority Critical patent/CN110287412B/en
Publication of CN110287412A publication Critical patent/CN110287412A/en
Application granted granted Critical
Publication of CN110287412B publication Critical patent/CN110287412B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Evolutionary Computation (AREA)
  • Accounting & Taxation (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a content recommendation method, a recommendation model generation method, a device, and a storage medium. The recommendation model generation method includes: obtaining a historical content sequence of a user, where the historical content sequence is a time series of multiple contents used by the user; replacing one or more contents among the multiple contents included in the historical content sequence with a predetermined item, respectively, to obtain an input sequence serving as training data; generating, according to the one or more replaced contents, an output sequence serving as the label of the training data, where the output sequence is the time series of the one or more replaced contents; and training a neural network using multiple pieces of training data and the corresponding labels, to obtain a recommendation model for content recommendation. Through the embodiments of the present application, an accurate recommendation model can be generated.

Description

Content recommendation method, recommendation model generation method, device, and storage medium
Technical field
The embodiments of the present application relate to the field of computer and communication technology, and in particular to a content recommendation method, a recommendation model generation method, a device, and a computer-readable storage medium.
Background technique
In recent years, sequential recommendation algorithms have attracted wide attention in both academia and industry, especially for recommendation scenarios in which user interest changes substantially over a short period, such as short-video, music, and news recommendation, where a user may browse hundreds of contents (i.e., items such as short videos, songs, and news articles) within a few hours. Training recommendation models with an RNN (recurrent neural network) combined with data augmentation has become the mainstream approach in industry. Current conventional data augmentation usually generates many short sub-click-sequences from an existing user's historical click sequence (i.e., the sequence of contents the user clicked). For example, if the sequence of originally watched video IDs is {1, 5, 7, 2, 22, 7, 34, 11, 13, 78}, a recommendation algorithm, in order to fully exploit the temporal characteristics of the data, will usually first perform data augmentation to generate subsequences such as the following:
Subsequence 1:{ 0,1,5,7,2,22,7,34,11,13 }
Subsequence 2:{ 0,0,1,5,7,2,22,7,34,11 }
Subsequence 3:{ 0,0,0,1,5,7,2,22,7,34 }
Subsequence n:{ 0,0,0,0,0,0,0,1,5,7 }
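The subsequence generation listed above can be sketched as follows. This is an illustrative reconstruction of the conventional augmentation scheme, not code from the patent; the helper name `make_subsequences` is our own.

```python
def make_subsequences(seq, min_keep=3):
    """Conventional RNN-era augmentation: for each k, drop the last k
    items and left-pad with k zeros, keeping the sequence length fixed."""
    n = len(seq)
    subs = []
    for k in range(1, n - min_keep + 1):
        subs.append([0] * k + seq[:n - k])
    return subs

history = [1, 5, 7, 2, 22, 7, 34, 11, 13, 78]
subs = make_subsequences(history)
# subs[0] matches "Subsequence 1" above: [0, 1, 5, 7, 2, 22, 7, 34, 11, 13]
# subs[-1] matches "Subsequence n": [0, 0, 0, 0, 0, 0, 0, 1, 5, 7]
```

For a sequence of length 200 this scheme generates nearly 200 extra training sequences per user, which is the blow-up the text goes on to criticize.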
This data augmentation approach has several obvious defects. First, generating the additional subsequences consumes a large amount of computing resources; for industrial datasets at the scale of hundreds of millions of records, if the original sequence length is 200, the above augmentation increases the training data nearly 200-fold. Second, the very short subsequences destroy the integrity of the data and often prevent the model from reaching its best performance. Third, when training on each content item in a sequence, only the click information to its left (i.e., the past) can be used, while the information to its right (i.e., the future) is discarded, even though the future click sequence likewise contains important contextual information.
Summary of the invention
The embodiments of the present application provide a content recommendation method, a recommendation model generation method, a device, and a computer-readable storage medium, so that an accurate recommendation model can be generated without generating subsequences for data augmentation.
According to a first aspect of the embodiments of the present application, a recommendation model generation method for content recommendation is disclosed, comprising:
obtaining a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user;
replacing one or more contents among the multiple contents included in the historical content sequence with a predetermined item, respectively, to obtain an input sequence serving as training data;
generating, according to the one or more replaced contents, an output sequence serving as the label of the training data, wherein the output sequence is the time series of the one or more replaced contents;
training a neural network using multiple pieces of training data and the corresponding labels, to obtain a recommendation model for content recommendation.
According to a second aspect of the embodiments of the present application, a content recommendation method is disclosed, comprising:
obtaining a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user;
appending a predetermined item, as a content to be predicted, to the tail of the historical content sequence to obtain an input sequence;
inputting the input sequence into a trained recommendation model to obtain a content to be recommended as the output of the recommendation model, wherein the content to be recommended corresponds to the content to be predicted and indicates the content to recommend to the user.
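The inference procedure just described can be sketched as follows, with a stand-in scoring model. The names `MASK`, `recommend`, and the toy model are our own illustrative assumptions, not from the patent.

```python
MASK = 0  # hypothetical ID reserved for the predetermined ("blank") item

def recommend(history, model, top_k=1):
    """Append the predetermined item at the tail and let the trained
    model fill in the blank: the fill-in is the content to recommend."""
    x = history + [MASK]
    scores = model(x)  # maps the masked input to one score per candidate ID
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k]

# Toy stand-in for a trained model: always scores content 9 highest.
toy_model = lambda x: {8: 0.1, 9: 0.7, 10: 0.2}
print(recommend([1, 2, 3, 4, 5, 6, 7, 8], toy_model))  # [9]
```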
According to a third aspect of the embodiments of the present application, a recommendation model generating apparatus for content recommendation is disclosed, comprising:
an obtaining module, configured to: obtain a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user;
an input generation module, configured to: replace one or more contents among the multiple contents included in the historical content sequence with a predetermined item, respectively, to obtain an input sequence serving as training data;
an output generation module, configured to: generate, according to the one or more replaced contents, an output sequence serving as the label of the training data, wherein the output sequence is the time series of the one or more replaced contents;
a training module, configured to: train a neural network using multiple pieces of training data and the corresponding labels, to obtain a recommendation model for content recommendation.
According to a fourth aspect of the embodiments of the present application, a content recommendation apparatus is disclosed, comprising:
an obtaining module, configured to: obtain a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user;
an input generation module, configured to: append a predetermined item, as a content to be predicted, to the tail of the historical content sequence to obtain an input sequence;
a recommendation generation module, configured to: input the input sequence into a trained recommendation model to obtain a content to be recommended as the output of the recommendation model, wherein the content to be recommended corresponds to the content to be predicted and indicates the content to recommend to the user.
According to a fifth aspect of the embodiments of the present application, a computing device is disclosed, comprising a processor and a memory, the memory storing a computer program, wherein the processor, when executing the computer program on the memory, is configured to implement any of the method embodiments of the first or second aspect.
According to a sixth aspect of the embodiments of the present application, a computer-readable storage medium is disclosed, on which a computer program is stored, wherein the computer program, when executed by a processor, implements any of the method embodiments of the first or second aspect.
The technical solutions provided by the embodiments of the present application can have the following beneficial effects:
In one or more of the embodiments of the present application, one or more contents included in a user's historical content sequence are replaced with a specific predetermined item, so that the replaced content items are masked; the historical content sequence processed in this way is used as training data, with the replaced contents as the corresponding label, to train a neural network. An accurate recommendation model can thus be generated without performing data augmentation through subsequence generation. In addition, because some of the content items in the training data are masked (replaced with the predetermined item) during training, the neural network must take into account the context information of each masked item, considering not only the past information of the masked item but also its future information, which makes the recommendation model more accurate.
Other features and advantages of the embodiments of the present application will become apparent from the following detailed description, or may in part be learned through practice of the embodiments.
It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present application.
Description of the drawings
The above and other objectives, features, and advantages of the embodiments of the present application will become more apparent from the detailed description of example embodiments with reference to the accompanying drawings. The drawings are incorporated into and form part of this specification, illustrate embodiments consistent with the present application, and together with the specification serve to explain the principles of the application.
Fig. 1 shows a schematic flowchart of a recommendation model generation method for content recommendation according to an embodiment of the present application.
Fig. 2 shows a schematic architecture diagram of the neural network used in the recommendation method and modeling method according to an embodiment of the present application.
Figs. 3A and 3B respectively show schematic diagrams of the residual block design in the NextItNet algorithm and in the neural network according to an embodiment of the present application.
Fig. 4 shows a schematic flowchart of a content recommendation method according to an embodiment of the present application.
Fig. 5 shows a schematic block diagram of a recommendation model generating apparatus 500 according to an exemplary embodiment of the present application.
Fig. 6 shows a schematic block diagram of a content recommendation apparatus 600 according to an exemplary embodiment of the present application.
Fig. 7 shows a schematic block diagram of a computing device according to an exemplary embodiment of the present application.
Detailed description of embodiments
Example embodiments are now described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in various forms and should not be construed as limited to the examples set forth herein; rather, these example embodiments are provided so that the description of the embodiments of the present application will be thorough and complete, and will fully convey the concepts of the example embodiments to those skilled in the art. The drawings are merely schematic illustrations of the embodiments and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and repeated descriptions thereof are omitted.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more example embodiments. In the following description, many specific details are provided to give a full understanding of the example implementations of the embodiments. Those skilled in the art will appreciate, however, that the technical solutions of the embodiments can be practiced while omitting one or more of the specific details, or using other methods, components, steps, etc. In other cases, well-known structures, methods, implementations, or operations are not shown or described in detail, to avoid obscuring aspects of the embodiments of the present application.
Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities. These functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Most session-based recommendation algorithms in the prior art adopt RNN models; the earliest such work is GRURec (gated-recurrent-unit recommendation) from Telefónica (see Session-based Recommendations with Recurrent Neural Networks, ICLR 2016, Balázs Hidasi et al.). RNN-based recommendation models have achieved considerable success in recent years.
RNN-based session recommendation algorithms (such as GRURec) have two obvious shortcomings: (1) if a session is long, for example in short-video recommendation scenarios where a user may watch thousands of videos in a single day (the average length of a Douyin short video is about 15s), an RNN is very prone to gradient explosion or vanishing during modeling; even LSTM (long short-term memory) and GRU (gated recurrent unit) models remain somewhat suboptimal for very long sequences. (2) In an RNN, the output for the next content item depends on the hidden factors of the previous output, so it is difficult to exploit modern parallel computing strategies such as GPUs (graphics processing units); RNNs usually accelerate poorly on GPUs.
Recently, recommendation algorithms using CNNs (convolutional neural networks), such as Caser and NextItNet, have been proposed. At the same time, attention-based recommendation algorithms have also been proposed, such as the Transformer-based algorithm SASRec (Self-Attentive Sequential Recommendation, ICDM 2018, Wang-Cheng Kang et al.).
The Caser algorithm developed at Simon Fraser University (see Personalized Top-N Sequential Recommendation via Convolutional Sequence Embedding, WSDM 2018, Jiaxi Tang et al.) and the NextItNet algorithm developed by Tencent (see A Simple Convolutional Generative Network for Next Item Recommendation, WSDM 2019, Fajie Yuan et al.) are currently the two state-of-the-art CNN-based algorithms for session-based (i.e., sequential) recommendation. The data augmentation performed by Caser and NextItNet during training is as follows, where "=>" denotes prediction.
Caser:{ 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15 }=> 16
{ 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14 }=> 15
{ 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13 }=> 14
{ 0,0,0,0,0,0,0,0,0,0,0,0,1,2,3 }=> 4
NextitNet:{1,2,3,4,5,6,7,8,9,10,11,12,13,14,15}
=> { 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16 }
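The two training-pair constructions listed above can be reconstructed as follows. This is our own sketch of the listed input/target construction, not the papers' actual code, and the function names are ours.

```python
def caser_pairs(seq, min_keep=3):
    """Caser-style: each pair predicts the single next item after a
    left-zero-padded prefix of fixed length len(seq) - 1."""
    n = len(seq)
    pairs = []
    for k in range(0, n - min_keep):
        inp = [0] * k + seq[:n - 1 - k]
        target = seq[n - 1 - k]
        pairs.append((inp, target))
    return pairs

def nextitnet_pair(seq):
    """NextItNet-style: one pair per sequence; the target is the input
    shifted left by one, so every next item is predicted at once."""
    return seq[:-1], seq[1:]

seq = list(range(1, 17))  # {1, ..., 16}
assert caser_pairs(seq)[0] == (list(range(1, 16)), 16)
assert nextitnet_pair(seq) == (list(range(1, 16)), list(range(2, 17)))
```

Note that Caser multiplies the training data, while NextItNet keeps one sequence per session; both, however, only ever condition on items to the left of the target.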
Caser was proposed at WSDM (the international conference on web search and data mining) in 2018. Caser proposed replacing RNNs with CNNs for session or sequential (i.e., time-series) recommendation and uses strategies such as max pooling to increase the receptive field. However, Caser has the following problems, which seriously affect its performance: (1) Caser's max-pooling mechanism loses a large amount of temporal information, so it must rely on generating many subsequences for training, which occupies a great deal of storage space during the training stage, and the very short sessions among the subsequences keep the final recommendation quality from reaching its optimum; (2) although Caser uses convolutional neural networks, it is a shallow structure and has difficulty capturing more complex item relationships; (3) the Caser training stage uses only past user preference information and does not make full use of future preference information.
Inspired by Caser, NextItNet was proposed at WSDM 2019 by Tencent, Telefónica, the University of Glasgow, and the National University of Singapore. Like Caser, NextItNet also uses convolutional neural networks, but NextItNet has a multi-layer network structure. Unlike Caser, NextItNet does not need to generate additional subsequence data during training, and it can fully capture the user's points of interest from the preceding interaction records. However, like Caser, NextItNet uses only past user interest for each current item during training; the training process is unidirectional. In addition, the residual network of NextItNet is relatively simple.
In the embodiments of the present application, the inventors propose a data augmentation approach based on a cloze (gap-filling) test. In addition, in some embodiments, in order to utilize longer user click sequences, the inventors propose replacing the RNN model with a bidirectional stacked dilated convolutional neural network (dilated CNN), so that the recommendation model according to the embodiments of the present application has stronger parallelism than an RNN on GPUs, as well as higher recommendation accuracy.
The modeling approaches of Caser, NextItNet, and the embodiments of the present application are each described below by way of example.
Caser's modeling approach:
For a given content collection, the probability of occurrence of the last content is maximized. Mathematically: max p(x_n | x_1, x_2, ..., x_{n-1}).
As discussed above, this training method cannot make full use of the dependencies among {x_1, x_2, ..., x_{n-1}}, especially when n is large. Therefore, when using the Caser algorithm, the common practice is to regenerate several subsequences from {x_1, x_2, ..., x_{n-1}}, such as {0, ..., 0, x_1, x_2, x_3}, {0, ..., 0, x_2, x_3, x_4}, etc.
NextItNet's modeling approach:
For a given content collection, the joint distribution of the content sequence is maximized (that is, the probability of occurrence of the i-th content is maximized under the condition that the preceding i-1 contents have occurred), and it is factorized into a product of conditional probabilities. Mathematically: p(x) = prod_{i=1..n} p(x_i | x_1, ..., x_{i-1}; θ), where θ denotes the model parameters.
The modeling approach according to the embodiments of the present application:
For a given content (e.g., video) collection, a cloze strategy is used to mask some content or contents in the collection (replace them with a specific predetermined item), and the probability of occurrence of the masked contents is maximized (that is, the probability of occurrence of the i-th masked content is maximized under the condition that all other contents occur). Mathematically: max prod_i p(x_{Δi} | x_{¬Δ}; θ).
Here x_{Δi} denotes a masked content randomly selected from {x_1, x_2, ..., x_n}, i.e., a blank in the cloze test, similar to a language fill-in-the-blank question, and x_{¬Δ} denotes the contents that are not masked. For example, for the word sequence "I especially ___ this cute little ___, because it is ___ adorable", x_{Δ1}, x_{Δ2}, x_{Δ3} in the above formula respectively represent the words "like", "dog", and "very".
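Side by side, the three objectives above can be restated compactly (a restatement under the same notation, with θ denoting model parameters and x_{¬Δ} the unmasked items):

```latex
\text{Caser:}\qquad \max_{\theta}\; p\!\left(x_n \mid x_1,\dots,x_{n-1};\theta\right)
\\[4pt]
\text{NextItNet:}\qquad \max_{\theta}\; \prod_{i=1}^{n} p\!\left(x_i \mid x_1,\dots,x_{i-1};\theta\right)
\\[4pt]
\text{Cloze (this application):}\qquad \max_{\theta}\; \prod_{i} p\!\left(x_{\Delta_i} \mid x_{\neg\Delta};\theta\right)
```

The first two condition only on the past; the cloze objective conditions each blank on everything else, past and future alike.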
The recommendation model generation method, the content recommendation method, and the related devices according to the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 shows an exemplary flowchart of a recommendation model generation method for content recommendation according to an embodiment of the present application. The exemplary method can be executed by any computing device with data-processing capability, which may be a server or a terminal device. For example, the server of a video website/video application implements the exemplary method based on users' video-watching records; as another example, a terminal device implements the exemplary method based on the video-watching records of its user. As shown in Fig. 1, the exemplary method includes:
S110: obtain a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user.
Session-based recommendation typically refers to predicting the content a user may be interested in (including clicking, purchasing, watching, etc.) at the next moment, based on the user's click or watch history over a previous period. In the embodiments of the present application, "a content used by the user" refers to any content the user has interacted with; "use" may include, but is not limited to, clicking, watching, listening, downloading, printing, or purchasing; and "content"/"content item" may refer to any item, e.g., a video, audio, news article, other text content, commodity, advertisement, etc.
The computing device executing the exemplary method can obtain the user's historical behavior data (e.g., video-watching records) and generate the historical content sequence. For example, if a user watched 8 videos in succession during a period of time (e.g., from 9:00 to 12:00 in the morning): video 1 (video ID 1), video 2 (video ID 2), video 3 (video ID 3), ..., video 8 (video ID 8), then each video can be represented by its video ID, yielding the user's historical content sequence over this period: {1, 2, 3, 4, 5, 6, 7, 8}.
In this example, video IDs are used to represent the corresponding videos when generating the historical content sequence; it should be understood that other unique identifiers can also be used to represent the corresponding videos.
S120: replace one or more contents among the multiple contents of the historical content sequence with a predetermined item, respectively, to obtain an input sequence serving as training data.
In S120, the historical content sequence is processed with a cloze (gap-filling) strategy to obtain training data. That is, certain elements of the historical content sequence are removed, forming the "blanks" of a cloze test. In one example, a unified predetermined item can be used to represent a "blank". For example, an item without any practical meaning can be used as the predetermined item, i.e., an item that does not correspond to any content in the candidate content set (e.g., an ID different from the ID of any video can be generated as the predetermined item).
In one example, a predetermined number (e.g., 5, 10, etc.) or a predetermined proportion (e.g., 30% of all the contents included in the historical content sequence) of the contents of the historical content sequence can be replaced with the predetermined item, respectively. In one example, the contents to be replaced with the predetermined item can be randomly selected from the multiple contents; that is, the positions of the replaced contents need not be fixed but can be chosen randomly, at the head, the tail, or any middle position of the sequence. In one example, at least one of the replaced contents is not located at the tail of the historical content sequence; that is, at least one of the contents replaced with the predetermined item is located at a middle position of the historical content sequence. For example, for the historical content sequence {1, 2, 3, 4, 5, 6, 7, 8}, the replaced contents include at least one of 1, 2, 3, 4, 5, 6, 7.
In one example, the historical content sequence {1, 2, 3, 4, 5, 6, 7, 8} can be processed in S120 to obtain the input sequence {1, ■, 3, 4, ■, 6, ■, 8}, where ■ denotes the predetermined item; the original content items 2, 5, 7 at these positions have been removed/masked.
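The masking step of S120, together with the label generation of S130 below, can be sketched as follows. `MASK` and `mask_sequence` are illustrative names of ours, and the positions are passed explicitly here, whereas the patent selects them at random.

```python
MASK = 0  # hypothetical reserved ID standing in for the predetermined item

def mask_sequence(seq, positions):
    """Replace the items at `positions` with the predetermined item,
    returning (input sequence, label) as in S120/S130."""
    masked = list(seq)
    targets = []
    for p in sorted(positions):
        targets.append(masked[p])
        masked[p] = MASK
    return masked, targets

inp, label = mask_sequence([1, 2, 3, 4, 5, 6, 7, 8], positions=[1, 4, 6])
# inp   == [1, 0, 3, 4, 0, 6, 0, 8]   (0 standing in for the blank ■)
# label == [2, 5, 7]
```

In practice the positions would be drawn randomly (e.g., with `random.sample`) at a fixed count or ratio, as the text describes.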
S130: generate, according to the one or more replaced contents, an output sequence serving as the label of the training data, wherein the output sequence is the time series of the one or more replaced contents.
In S130, the label corresponding to the training data obtained in S120 is generated from the replaced contents, as the output sequence.
As described in the modeling approach above, the recommendation method of the embodiments of the present application maximizes the probability of occurrence of the masked (replaced-with-the-predetermined-item) contents, i.e., makes the prediction of the recommendation model point to the masked contents. In other words, what the recommendation model according to the embodiments of the present application predicts are the contents that have been replaced with the predetermined item. Therefore, for the training data obtained in S120, the label (i.e., the expected output of the recommendation model) should be the time series of the replaced contents.
For example, for the aforementioned training data {1, ■, 3, 4, ■, 6, ■, 8}, its label {2, 5, 7} is generated in S130.
Through S120 and S130, one piece of training data and its corresponding label are obtained from one historical content sequence. Multiple historical content sequences of one or more users can be collected, and S120 and S130 can be executed on each historical content sequence, to obtain multiple pieces of training data and their corresponding labels as the training set of the recommendation model.
S140: train a neural network using the multiple pieces of training data and the corresponding labels, to obtain a recommendation model for content recommendation.
In S140, the multiple pieces of training data and corresponding labels obtained through S110-S130 can be used to train an initial model, to obtain the recommendation model for content recommendation. In one example, a neural network can be used to build the recommendation model. The parameters of the neural network can be adjusted so that the probability of occurrence, as recognized by the neural network, of the contents replaced with the predetermined item is maximized. That is, each time the neural network is trained with a piece of training data and its label, the parameters of the neural network are adjusted so that, when the input of the neural network is the training data, the output of the neural network is the label. For example, for the training data {1, ■, 3, 4, ■, 6, ■, 8}, the parameters of the neural network are adjusted so that its output is {2, 5, 7}.
It can be seen that, when a replaced content is not located at the tail of the sequence, the neural network must take into account the content's future information (i.e., the information after the content) when predicting it; when the replaced content is located in the middle of the sequence (neither head nor tail), the neural network considers both its past information (i.e., before the content) and its future information (i.e., after the content) during prediction, so that the prediction is more accurate.
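The parameter adjustment in S140 amounts to minimizing cross-entropy at the masked positions only. The sketch below computes that loss for given per-position logits, using plain Python in place of a deep-learning framework; all names and the toy dimensions are our own assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one position's logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def masked_ce_loss(logits_per_pos, mask_positions, targets):
    """Average negative log-likelihood of the true (masked-out) items,
    summed over masked positions only; unmasked positions contribute no
    loss, which is what lets the model read context in both directions."""
    total = 0.0
    for pos, tgt in zip(mask_positions, targets):
        probs = softmax(logits_per_pos[pos])
        total += -math.log(probs[tgt])
    return total / len(targets)

# Uniform logits over a 10-item vocabulary -> loss is log(10) per blank.
logits = [[0.0] * 10 for _ in range(8)]
loss = masked_ce_loss(logits, mask_positions=[1, 4, 6], targets=[2, 5, 7])
# loss == log(10) ≈ 2.3026
```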
In one example, a convolutional neural network can be used to implement the recommendation model generation method and the content recommendation method according to the embodiments of the present application. In one example, the convolutional neural network used can adopt dilated convolution. Fig. 2 shows a schematic architecture diagram of the neural network used in the recommendation method and modeling method according to an embodiment of the present application.
As shown in Fig. 2, the exemplary neural network 200 includes an embedding layer 210, an input layer 220, a convolutional layer 230, and an output layer 240. They are introduced one by one below.
The embedding layer 210 is mainly used to map high-dimensional one-hot encodings, such as content IDs or context information, to a low-dimensional embedding matrix. In the embodiments of the present application, the embedding layer 210 is flexible and has generalization ability, and can model various kinds of context information. In one example, the embedding layer 210 mainly includes a content embedding (item embedding) matrix, in which each row represents the embedding vector of one content; for the contents replaced by the predetermined item, an additional embedding vector (i.e., one different from the embedding vector of any content) can be used to represent the predetermined item. Meanwhile, the embedding layer 210 can make full use of context information, such as the user's profile information, temporal and spatial information, social friend relationships, etc. Taking user profile information as an example, an embedding vector for a user ID can first be initialized; for the contextual information of each session (such as the user profile information), the corresponding context embedding vector is determined according to the user ID, and the context embedding vector is combined with the original content embedding vector by vector concatenation (concat) to form the final content embedding matrix. As shown in Fig. 2, Ei (i = 0, 1, ..., 16) represents the original content embedding vector of the (i+1)-th content in the input sequence, Es represents the embedding vector corresponding to the predetermined item, and Ec represents the corresponding context embedding vector, which can represent, for example, the user's ID information, time information, friend information, etc.
In the example shown in Fig. 2, the input sequence is {0, 1, ■, 3, 4, 5, 6, ■, 8, ■, 10, 11, ■, 13, ■, 15, ■}, where "■" denotes the predetermined item. That is, the original historical content sequence is {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}, in which the contents 2, 7, 9, 12, 14 and 16 have been replaced by the predetermined item ■, forming the training data used as the input sequence. As shown in Fig. 2, for this input sequence the output of the output layer 240 is the time series of the replaced contents, i.e., {2, 7, 9, 12, 14, 16}.
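The construction of the (input sequence, label) pair shown in Fig. 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not code from the embodiment; in particular, the sentinel value MASK standing in for the predetermined item "■" is an assumption of the sketch.

```python
MASK = -1  # assumed sentinel standing in for the predetermined item "■"

def make_training_pair(history, replaced_positions):
    """Replace the contents at the given positions with the predetermined
    item; return the masked input sequence and the label sequence (the
    time series of the replaced contents)."""
    inp = list(history)
    labels = []
    for pos in sorted(replaced_positions):
        labels.append(inp[pos])
        inp[pos] = MASK
    return inp, labels

# the Fig. 2 example: original sequence {0, 1, ..., 16}, with the
# contents at positions 2, 7, 9, 12, 14, 16 replaced
inp, labels = make_training_pair(list(range(17)), [2, 7, 9, 12, 14, 16])
# labels == [2, 7, 9, 12, 14, 16], matching the output of output layer 240
```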
As shown in Fig. 2, each content included in the input sequence is represented by the embedding layer 210 as an embedding, shown below the corresponding content ID. The embedding layer 210 produces a two-dimensional matrix, whereas the convolutional layer 230 uses one-dimensional convolution; therefore, during the convolution process (e.g., at the input layer 220), a dimension change is performed, converting the embedding matrix from {length, width} to {1, length, width}, so that the width dimension of the embedding matrix is treated by the convolutional network as the channel dimension.
After each content included in the input sequence is converted at the embedding layer 210 into its corresponding content embedding vector (i.e., the concatenation of the original content embedding vector Ei (i = 0, 1, ..., 16) with the corresponding context embedding vector Ec, or the concatenation of the predetermined item embedding vector Es with the corresponding context embedding vector Ec), it enters the input layer 220. The neural network 200 receives the embedding vector of each content of the input sequence at the input layer 220, and performs the convolution computation on the input sequence at the convolutional layer 230.
In one example, the neural network 200 according to the embodiments of the present application can be a dilated convolutional neural network. As shown in Fig. 2, the convolutional layer 230 includes a first convolutional layer 231, a second convolutional layer 232, a third convolutional layer 233 and a fourth convolutional layer 234. In one example, the convolutional layer 230 uses one-dimensional dilated convolution for modeling. Using dilated convolution makes the visible range (receptive field) grow exponentially with the number of layers. In Fig. 2, l denotes the dilation and r denotes the visible range. If the dilations from the input layer 220 up to the fourth convolutional layer 234 are designed as {1, 2, 4, 8, 16}, then from the first convolutional layer 231 to the fourth convolutional layer 234 the corresponding visible ranges are respectively {3, 7, 15, 31}, i.e., 2^(m+1) - 1 for the m-th convolutional layer. In one example, the convolution padding mode is "same" and the convolution kernel size is 3. The example of Fig. 2 shows four convolutional layers by way of example, but it should be understood that the convolutional layer 230 may include more or fewer layers.
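The growth of the visible range with stacked dilated convolutions can be checked with a short sketch (an illustration under the stated kernel size of 3, not code from the embodiment): each layer with dilation d widens the receptive field by (kernel_size - 1) * d.

```python
def receptive_fields(dilations, kernel_size=3):
    """Visible range after each stacked 1-D dilated convolution layer."""
    rf, out = 1, []
    for d in dilations:
        rf += (kernel_size - 1) * d  # each layer adds (k - 1) * d positions
        out.append(rf)
    return out

print(receptive_fields([1, 2, 4, 8]))  # → [3, 7, 15, 31], i.e. 2^(m+1) - 1
```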
In another example, to obtain a larger visible range, the number of convolutional layers can be increased with a corresponding dilation design. For example, in a real long-session recommendation scene (e.g., where a user watches 300 short videos in one day), the dilated convolution architecture of Fig. 2 can be repeated: on the basis of the original fourth convolutional layer 234, two more convolutional layers (a fifth convolutional layer and a sixth convolutional layer) are added upwards, the structure from the input layer 220 to the sixth convolutional layer is then repeated twice upwards, and the following dilation design is used: {1, 2, 4, 8, 16, 32, 64, 1, 2, 4, 8, 16, 32, 64, 1, 2, 4, 8, 16, 32, 64}, so that the visible range of the topmost convolutional layer can reach 379.
Above the convolutional layer 230 is the softmax output layer 240. Unlike NextItNet (which computes the softmax corresponding to every content of the input sequence) and Caser (which computes the softmax only for the last content), the neural network 200 according to the embodiments of the present application computes the softmax only for the occluded contents (those replaced by the predetermined item, i.e., the blanks of the cloze test) — e.g., the softmax corresponding to {2, 7, 9, 12, 14, 16} in Fig. 2 — and computes the cross-entropy loss function from the generated softmax and the true training data labels.
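The loss computation described here — softmax and cross-entropy evaluated only at the occluded positions — can be sketched as follows. The logits are hypothetical toy values; a real implementation would use the framework's fused softmax cross-entropy operation instead of this pure-Python version.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def masked_cross_entropy(logits_per_pos, masked_positions, labels):
    """Average cross-entropy over the masked positions only; positions
    that were not replaced by the predetermined item contribute nothing."""
    losses = []
    for pos, label in zip(masked_positions, labels):
        probs = softmax(logits_per_pos[pos])
        losses.append(-math.log(probs[label]))
    return sum(losses) / len(losses)

# toy example: 4 positions, a vocabulary of 3 contents, positions 1 and 3 masked
logits = [[0.0, 0.0, 0.0]] * 4
loss = masked_cross_entropy(logits, [1, 3], [2, 0])
# uniform softmax over 3 contents → loss == ln(3) ≈ 1.0986
```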
In Fig. 2, the crossing arcs on the right show the residual network design of the neural network 200. The residual block structure of the neural network 200 according to the embodiments of the present application draws on the DenseNet (densely connected convolutional networks) technique from image processing. Figs. 3A and 3B respectively illustrate schematic diagrams of the residual block design in the NextItNet algorithm and in the neural network according to the embodiments of the present application, where 1×1 and 1×3 denote the convolution kernel sizes. As shown in Fig. 3A, the residual block design of NextItNet is relatively simple: each dilated convolutional layer receives only the output of the immediately preceding layer as input. The residual block connections of the neural network according to the embodiments of the present application are denser: as shown by the arcs in Fig. 3B, all dilated convolutional layers are interconnected, and each layer receives the outputs of all preceding layers as input.
In addition, NextItNet uses element-wise addition for the residual operation, whereas the residual operation in the embodiments of the present application is a concatenation operation: each convolutional layer of the convolutional neural network is configured to concatenate, as vectors, the residual inputs received from other convolutional layers with the input of the convolutional layer itself, and to use the resulting vector as the total input of the convolutional layer.
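The difference between the two residual operations can be illustrated with a minimal sketch over plain Python lists standing in for feature vectors (an illustration only, not the embodiment's implementation):

```python
def residual_add(x, fx):
    """NextItNet-style residual: element-wise addition; shapes must match."""
    return [a + b for a, b in zip(x, fx)]

def residual_concat(inputs):
    """Dense-style residual as described here: the outputs of all earlier
    layers are concatenated to form the total input of the next layer."""
    total = []
    for vec in inputs:
        total.extend(vec)
    return total

x, fx = [1.0, 2.0], [0.5, -0.5]
print(residual_add(x, fx))       # → [1.5, 1.5]
print(residual_concat([x, fx]))  # → [1.0, 2.0, 0.5, -0.5]
```

Note that addition keeps the feature dimension fixed, while concatenation grows it with each connected layer, so a concatenation-based design typically needs a projection (e.g., a 1×1 convolution) to control the width.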
In the neural network according to the embodiments of the present application, similarly to NextItNet, dilated convolution is used to increase the visible range of the network; but unlike NextItNet, the convolutional neural network of the embodiments of the present application is a bidirectional network (i.e., it can see both the past and the future), whereas NextItNet is a unilateral network (it can only see the past). In the embodiments of the present application, a gap-filling (cloze test) strategy from natural language is adopted: using a random sampling strategy, for each historical content sequence a certain proportion (usually around 30%) of the content items included in the sequence are randomly removed and replaced with a unified symbol or with an ID having no practical meaning, and the removed positions are then predicted at the last layer (the output layer) of the neural network. For example, if the user's session-based historical content sequence is {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16}, then after random removal and replacement using the gap-filling (cloze test) strategy, the generated new sequence is {1, ■, 3, 4, 5, 6, ■, 8, ■, 10, 11, ■, 13, ■, 15, ■}, where {2, 7, 9, 12, 14, 16} are the content IDs to be predicted and ■ is the blank of the cloze test (i.e., the predetermined item).
By using the gap-filling strategy, the information leakage (neural network cheating) problem can be avoided. In the NextItNet algorithm, the convolution at each layer can only see past information (for example, when predicting 4, the network can see {1, 2, 3}). In the neural network of the embodiments of the present application (e.g., the neural network 200 shown in Fig. 2), without gap-filling the network could see the data of the entire sequence (continuing the previous example, when predicting 4 the network could see {1, 2, 3, 5, 6, ..., 16}); therefore, during training, the neural network could trivially see the information to be predicted, without attending to the context information at all. This results in invalid training, or information leakage (cheating): during training the loss function easily drops to values close to 0, yet at the final prediction stage the model has no generalization ability at all. After the gap-filling design is adopted in the neural network according to the embodiments of the present application, the network training process must rely on context information, including past information and future information; but since the content itself is occluded, no information leakage occurs.
In another embodiment according to the present application, not all of the removed contents are replaced with the meaningless predetermined item; sometimes a randomly selected content is used instead. The inventors of the present application found that, by adding random noise in this way, the prediction effect of the obtained recommendation model is somewhat improved relative to replacing entirely with the predetermined item. For example, during training, in 80% of cases the removed content is replaced with the predetermined item, in 10% of cases it is replaced with a randomly selected content ID, and in 10% of cases the content itself is kept (not removed).
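The 80%/10%/10% replacement rule can be sketched as follows; MASK is an assumed sentinel for the predetermined item, and the vocabulary size and random seed are arbitrary:

```python
import random

MASK = -1  # assumed sentinel for the predetermined item

def corrupt(content, vocab, rng):
    """For a content chosen to be predicted: 80% replace with the
    predetermined item, 10% replace with a random content, 10% keep."""
    r = rng.random()
    if r < 0.8:
        return MASK
    if r < 0.9:
        return rng.choice(vocab)
    return content

rng = random.Random(0)
vocab = list(range(100))
out = [corrupt(42, vocab, rng) for _ in range(10000)]
frac_mask = out.count(MASK) / len(out)  # ≈ 0.8
```

This replacement mix is analogous to the masking recipe used in BERT-style masked language-model pre-training.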
In each embodiment of the method for generating a recommendation model described above, one or more contents included in a user's historical content sequence are replaced with a specific predetermined item so that these content items are occluded, and the neural network is trained with the historical content sequence processed in this way as training data and with the replaced contents as the corresponding training data labels, so that an accurate recommendation model can be generated without performing subsequence-based data augmentation. In addition, because part of the content items in the training data are covered (replaced with the predetermined item) during training, the neural network must consider the context information of an occluded item during training — not only its past information but also its future information — so that the recommendation model is more accurate, while neural network cheating caused by information leakage is avoided. In some embodiments of the present application, a dilated convolutional neural network architecture adapted to this bidirectional data augmentation strategy is designed, so that an accurate recommendation model is obtained by training the neural network. Furthermore, in some embodiments of the present application, the residual blocks of the neural network are designed such that each layer receives the outputs of all preceding layers as input, so that the obtained recommendation model and its prediction results are more accurate.
The recommendation model obtained according to the above embodiments can be used in a content recommendation method, such as a session-based content recommendation method. Fig. 4 shows a schematic flow diagram of a content recommendation method according to an embodiment of the present application. The method can be executed by any computing device with data processing capability, which can be a server or a terminal device. For example, the server of a video website/video application implements the exemplary method according to users' video-watching records. As another example, a terminal device implements the exemplary method according to the video-watching records of the user of the terminal device. The computing device is provided with a recommendation model generated according to the recommendation model generation method described above. As shown in Fig. 4, the exemplary method includes:
S410: obtain the historical content sequence of a user, wherein the historical content sequence is the time series of multiple contents used by the user.
The computing device executing the exemplary method can, as described for S110, learn the multiple contents used by the user from the user's historical behavior data, to obtain the user's historical content sequence. For details of this processing, refer to step S110; they are not repeated here.
For example, for a user A, the historical content sequence is known to be {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}.
S420: add a predetermined item as the content to be solved at the tail of the historical content sequence, to obtain an input sequence.
For the historical content sequence obtained in S410, in order to use the recommendation model obtained through the recommendation model generation method described above to predict the next content that user A will use, a predetermined item is added as the content to be solved at the tail of the historical content sequence in S420, forming the input sequence to be fed into the recommendation model. This is because, as described in the model generation method embodiments above, the recommendation model predicts the occluded contents (i.e., those replaced by the predetermined item) in the input sequence; therefore, in order to solve for the content the user will use at the next moment, an occluded content, i.e., the predetermined item, needs to be added after the last content of the historical content sequence.
For a description of the predetermined item, refer to the embodiments of the recommendation model generation method above. Whichever predetermined item is used when generating the recommendation model, the same predetermined item is used when making predictions with that recommendation model.
For example, for the historical content sequence {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15} of user A, the corresponding input sequence {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, ■} can be obtained through the processing of S420, where ■ is the predetermined item.
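Building the inference input of S420 is then a one-line operation; as in the earlier sketches, MASK is an assumed sentinel standing in for the predetermined item ■:

```python
MASK = -1  # assumed sentinel for the predetermined item "■"

def build_inference_input(history):
    """Append the predetermined item as the content to be solved, so the
    model predicts the user's next content at the final position."""
    return list(history) + [MASK]

seq = build_inference_input(list(range(16)))
# seq == [0, 1, ..., 15, MASK]; the model's prediction at the last
# (occluded) position is the content to recommend
```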
S430: input the input sequence into the trained recommendation model, to obtain a content to be recommended as the output of the recommendation model, wherein the content to be recommended corresponds to the content to be solved and indicates the content to recommend to the user.
In S430, the input sequence obtained in S420 is input into the trained recommendation model on the computing device, to obtain the content predicted by the recommendation model. From the generation process of the recommendation model, it can be seen that the recommendation model predicts the occluded content. In the input sequence, the occluded content is the content at the moment following the historical contents, i.e., the content at the next moment; therefore, what the recommendation model predicts is the content the user will use at the next moment. Through the recommendation model, the content that the user will most probably use at the next moment can be predicted and recommended to the user as the recommended content.
Through the content recommendation method embodiments described above, the content to be recommended to the user is predicted by using a recommendation model that predicts the occluded contents in an input sequence, so that prediction results with higher accuracy can be obtained.
The recommendation model generation method and the content recommendation method according to the embodiments of the present application can be used in session-based recommendation scenes. For example, according to the user's watching/clicking/purchasing behaviors within the time of one session (which can be 10 minutes, 1 hour, 1 day, or longer), i.e., according to the user's historical behavior data, the content recommendation method according to the embodiments of the present application can be used to predict the contents the user may be interested in (e.g., watch/click/purchase) in the future. Taking short videos in a video application as an example, suppose user A effectively watched (e.g., a play-through rate of 80% or more counts as effective watching) 300 videos within one morning. The recommendation method of the embodiments of the present application can then predict, according to the 300 videos the user has watched, the videos the user may be interested in in the future, achieving a personalized recommendation effect.
Inspired by the cloze-test (gap-filling) idea in language testing, the inventors of the present application propose a completely new data augmentation approach, which can exploit not only the user's past click behavior in session data but also the future click behavior. In some embodiments, a bidirectional convolutional neural network architecture is designed for this data augmentation strategy, in which dilated convolution is used to increase the visible range of the network, and the DenseNet residual structure is applied to the bidirectional convolutional neural network. Experiments show that the content recommendation method of the embodiments of the present application achieves a highly stable and significant improvement over the latest convolutional networks.
Top-N ranking metrics (e.g., NDCG, MRR, etc.) are generally used to evaluate the recommendation effect of a recommendation method. Taking the session data of Application 1 and Application 2 as examples, the experimental results are shown in Tables 1 and 2 below:
Table 1: data of Application 1 (music library: 137,000 items; number of sessions: 970,000)
Table 2: data of Application 2 (video library: 69,000 items; number of sessions: 500,000)
Metric                                 MRR@5   Recall@5  NDCG@5  MRR@20  Recall@20  NDCG@20
Caser                                  0.0122  0.0243    0.0152  0.0162  0.0670     0.0271
NextItNet                              0.0119  0.0238    0.0148  0.0158  0.0670     0.0268
Embodiment of the present application  0.0138  0.0281    0.0173  0.0184  0.0784     0.0313
Improvement                            13.1%   15.6%     13.8%   13.6%   17.0%      15.5%
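The metrics reported in the tables can be sketched for a single test instance with one ground-truth next content. The scores below are hypothetical toy values; with a single relevant item, the ideal DCG used to normalize NDCG equals 1.

```python
import math

def rank_of(target, scores):
    """1-based rank of the target content when items are sorted by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order.index(target) + 1

def mrr_at_k(rank, k):
    return 1.0 / rank if rank <= k else 0.0

def recall_at_k(rank, k):
    return 1.0 if rank <= k else 0.0

def ndcg_at_k(rank, k):
    # with a single relevant item, IDCG == 1
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

scores = [0.1, 0.7, 0.05, 0.9, 0.2]  # hypothetical model scores over 5 contents
r = rank_of(1, scores)               # content 1 is the true next content
# r == 2, so MRR@5 == 0.5, Recall@5 == 1.0, NDCG@5 == 1 / log2(3)
```

Per-instance values like these are averaged over all test sessions to obtain table entries such as MRR@5 and NDCG@20.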
The experimental setup is as follows:
The session length of each user of Application 1 is 20. For session data shorter than 20, padding is used: 0 is filled at the beginning of the sequence until the sequence length reaches 20.
The session length of each user of Application 2 is 30, and the padding mode is the same as for Application 1. It should be pointed out that, for the data of Applications 1 and 2, different session lengths from 10 to 100 were tried, and the results showed improvement amplitudes similar to those of the two tables above.
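The padding described above can be sketched as follows (an illustration of the setup only, assuming sequences already at or above the target length are left untouched):

```python
def left_pad(session, length, pad=0):
    """Fill 0 at the beginning of the sequence until it reaches the
    target length; longer sessions are kept as-is in this sketch."""
    if len(session) >= length:
        return list(session)
    return [pad] * (length - len(session)) + list(session)

padded = left_pad([5, 9, 3], 20)
# padded has 17 leading zeros followed by [5, 9, 3]
```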
All models in the experiments use Adam as the optimizer, and NextItNet uses a batch size of 64. Experiments found that NextItNet is insensitive to the batch size, with optimal performance usually in the range of 32-64. For the recommendation method of the embodiments of the present application, a larger batch size, e.g., 1024, can be used; convergence is then fast and the effect is often better. The learning rate is 0.001. Of the collected training data, 80% is used as the training data for training the recommendation model, 3% as the test set, and 17% as the validation set. As described above, NextItNet uses the residual block structure of Fig. 3A for comparison in the experiments. The experimental hardware environment uses a GPU Tesla P40 and TensorFlow version 1.7.0.
According to another aspect of the embodiments of the present application, a recommendation model generating apparatus for content recommendation is also provided, for executing each embodiment of the recommendation model generation method described above. Fig. 5 shows a schematic block diagram of such an apparatus 500 according to an exemplary embodiment of the present application. As shown in Fig. 5, the apparatus 500 includes:
an obtaining module 510, configured to: obtain the historical content sequence of a user, wherein the historical content sequence is the time series of multiple contents used by the user;
an input generation module 520, configured to: replace one or more contents of the multiple contents included in the historical content sequence with a predetermined item, respectively, to obtain an input sequence as training data;
an output generation module 530, configured to: generate, according to the one or more replaced contents, an output sequence as the training data label of the training data, wherein the output sequence is the time series of the one or more replaced contents;
a training module 540, configured to: train a neural network using multiple training data items and the corresponding training data labels, to obtain a recommendation model for content recommendation.
According to another aspect of the embodiments of the present application, a content recommendation apparatus is also provided, for executing each embodiment of the content recommendation method described above. Fig. 6 shows a schematic block diagram of such an apparatus 600 according to an exemplary embodiment of the present application. As shown in Fig. 6, the apparatus 600 includes:
an obtaining module 610, configured to: obtain the historical content sequence of a user, wherein the historical content sequence is the time series of multiple contents used by the user;
an input generation module 620, configured to: add a predetermined item as the content to be solved at the tail of the historical content sequence, to obtain an input sequence;
a recommendation generation module 630, configured to: input the input sequence into the trained recommendation model, to obtain a content to be recommended as the output of the recommendation model, wherein the content to be recommended corresponds to the content to be solved and indicates the content to recommend to the user.
The functions of each unit/module and the implementation processes and relevant details of their effects are described in the corresponding steps of the above method embodiments, and are not repeated here.
Each apparatus embodiment in the above embodiments can be implemented by hardware, software, firmware or a combination thereof, and can be implemented as an individual apparatus, or as a logically integrated system in which the constituent units/modules are dispersed in one or more computing devices that each execute corresponding functions. The units/modules constituting each apparatus in the above embodiments are divided according to logical functions; they can be re-divided according to logical functions, e.g., the apparatus can be realized with more or fewer units/modules. These constituent units/modules can each be realized by hardware, software, firmware or a combination thereof; they can be separate independent components, or integrated units/modules in which multiple components are combined to execute corresponding logical functions. The hardware, software, firmware or combination thereof may include: separate hardware components, functional modules realized by programming, functional modules realized by programmable logic devices, etc., or a combination of the above.
According to an exemplary embodiment, each of the above apparatus embodiments can be implemented as a computing device that includes a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the computing device to execute any of the embodiments of the recommendation model generation method or the content recommendation method described above; alternatively, the computer program, when executed by the processor, realizes the functions realized by the constituent units/modules of the above apparatus embodiments of the computing device.
The processor described in the above embodiments can refer to a single processing unit, such as a central processing unit (CPU), or can be a distributed processor system including multiple dispersed processing units/processors.
The memory described in the above embodiments may include one or more memories, which can be internal memories of the computing device, e.g., various transient or non-transient memories, or can be external storage connected to the computing device through a memory interface.
Fig. 7 shows a schematic block diagram of an exemplary embodiment of such a computing device 701. As shown in Fig. 7, the computing device can include, but is not limited to: at least one processing unit 710, at least one storage unit 720, and a bus 730 connecting different system components (including the storage unit 720 and the processing unit 710).
The storage unit stores program code that can be executed by the processing unit 710, so that the processing unit 710 executes the steps of the various exemplary embodiments of the present application described in the exemplary methods section of this specification. For example, the processing unit 710 can execute the steps shown in the drawings.
The storage unit 720 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 721 and/or a cache storage unit 722, and may further include a read-only storage unit (ROM) 723.
The storage unit 720 can also include a program/utility 724 having a set of (at least one) program modules 725; such program modules 725 include but are not limited to: an operating system, one or more application programs, other program modules and program data, and each or some combination of these examples may include an implementation of a network environment.
The bus 730 can represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, the processing unit, or a local bus using any of a variety of bus structures.
The computing device can also communicate with one or more external devices 770 (such as a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the computing device, and/or with any device (such as a router, a modem, etc.) that enables the computing device to communicate with one or more other computing devices. This communication can be carried out through an input/output (I/O) interface 750. In one embodiment, the computing device can also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network, such as the Internet) through a network adapter 760. As shown, the network adapter 760 communicates with the other modules of the computing device through the bus 730. It should be understood that, although not shown in the drawings, the computing device can be implemented using other hardware and/or software modules, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID (Redundant Arrays of Independent Drives) systems, tape drives and data backup storage systems, etc.
Through the above description of the embodiments, those skilled in the art is it can be readily appreciated that example described herein is implemented Mode can also be realized by software realization in such a way that software is in conjunction with necessary hardware.Therefore, according to the application The technical solution of embodiment can be embodied in the form of software products, which can store non-volatile at one Property storage medium (can be CD-ROM, USB flash disk, mobile hard disk etc.) in or network on, including some instructions are so that a calculating Equipment (can be personal computer, server, terminal installation or network equipment etc.) is executed according to the application embodiment Method.
In the exemplary embodiment of the application, a kind of computer readable storage medium is additionally provided, is stored thereon with meter Calculation machine program makes computer execute above method embodiment portion when the computer program is executed by the processor of computer Divide each method embodiment of description.
According to one embodiment of the application, a kind of journey for realizing the method in above method embodiment is additionally provided Sequence product, can be using portable compact disc read only memory (CD-ROM) and including program code, and can set in terminal It is standby, such as run on PC.However, the program product of the embodiment of the present application is without being limited thereto, and in this document, readable storage Medium can be any tangible medium for including or store program, which can be commanded execution system, device or device Using or it is in connection.
Described program product can be using any combination of one or more readable mediums.Readable medium can be readable letter Number medium or readable storage medium storing program for executing.Readable storage medium storing program for executing for example can be but be not limited to electricity, magnetic, optical, electromagnetic, infrared ray or System, device or the device of semiconductor, or any above combination.The more specific example of readable storage medium storing program for executing is (non exhaustive List) include: electrical connection with one or more conducting wires, portable disc, hard disk, random access memory (RAM), read-only Memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disc read only memory (CD-ROM), light storage device, magnetic memory device or above-mentioned any appropriate combination.
Computer-readable signal media may include in a base band or as carrier wave a part propagate data-signal, In carry readable computer program.The data-signal of this propagation can take various forms, and including but not limited to electromagnetism is believed Number, optical signal or above-mentioned any appropriate combination.Readable signal medium can also be other than readable storage medium storing program for executing it is any can Read medium, the readable medium can send, propagate or transmit for by instruction execution system, device or device use or Program in connection.
Program code contained on a readable medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, radio frequency (RF), or any suitable combination of the above.
Program code for carrying out the operations of the embodiments of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as C or similar languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. In scenarios involving a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
Other embodiments of the present application will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the embodiments of the present application that follow their general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the application being indicated by the appended claims.

Claims (15)

1. A recommendation model generation method for content recommendation, characterized by comprising:
obtaining a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user;
replacing one or more contents among the multiple contents included in the historical content sequence with a predetermined item, respectively, to obtain an input sequence serving as training data;
generating, according to the one or more replaced contents, an output sequence serving as a training data label of the training data, wherein the output sequence is a time series of the one or more replaced contents; and
training a neural network using multiple pieces of the training data and the corresponding training data labels, to obtain a recommendation model for content recommendation.
2. The method according to claim 1, wherein replacing one or more contents among the multiple contents included in the historical content sequence with a predetermined item respectively comprises:
replacing a predetermined quantity or a predetermined proportion of the multiple contents with the predetermined item, respectively.
3. The method according to claim 1, wherein replacing one or more contents among the multiple contents included in the historical content sequence with a predetermined item respectively comprises:
randomly selecting the one or more contents from the multiple contents and replacing them with the predetermined item, respectively.
4. The method according to claim 1, wherein the predetermined item is at least one of the following:
a meaningless entry that does not correspond to any content; or
a randomly selected content.
5. The method according to claim 1, wherein the neural network is a convolutional neural network.
6. The method according to claim 5, wherein the convolutional neural network comprises multiple convolutional layers, and the output of each convolutional layer is connected to every convolutional layer above it, serving as a residual input of each of those higher layers.
7. The method according to claim 5, wherein the convolutional neural network is a dilated convolutional neural network.
8. The method according to claim 6, wherein each convolutional layer of the convolutional neural network is configured to: concatenate the received residual inputs and the layer's own input into a vector, and use the resulting vector as the total input of the convolutional layer.
9. The method according to any one of claims 1 to 8, wherein at least one of the one or more replaced contents is not located at the tail of the historical content sequence.
10. The method according to any one of claims 1 to 8, wherein training a neural network using multiple pieces of the training data and the corresponding training data labels comprises:
adjusting the parameters of the neural network so as to maximize the probability, predicted by the neural network, of the occurrence of the one or more replaced contents.
11. A content recommendation method, characterized by comprising:
obtaining a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user;
appending a predetermined item, as a content to be predicted, to the tail of the historical content sequence to obtain an input sequence; and
inputting the input sequence into a trained recommendation model to obtain a content to be recommended as the output of the recommendation model, wherein the content to be recommended corresponds to the content to be predicted and indicates the content to recommend to the user.
12. The method according to claim 11, wherein the recommendation model is generated by the recommendation model generation method for content recommendation according to any one of claims 1 to 10.
13. A recommendation model generation apparatus for content recommendation, characterized by comprising:
an obtaining module configured to: obtain a historical content sequence of a user, wherein the historical content sequence is a time series of multiple contents used by the user;
an input generation module configured to: replace one or more contents among the multiple contents included in the historical content sequence with a predetermined item, respectively, to obtain an input sequence serving as training data;
an output generation module configured to: generate, according to the one or more replaced contents, an output sequence serving as a training data label of the training data, wherein the output sequence is a time series of the one or more replaced contents; and
a training module configured to: train a neural network using multiple pieces of the training data and the corresponding training data labels, to obtain a recommendation model for content recommendation.
14. A computing device, comprising a processor and a memory, wherein a computer program is stored on the memory, and the processor is configured to, when executing the computer program on the memory, implement the method according to any one of claims 1 to 10 or the method according to any one of claims 11 to 12.
15. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 10 or the method according to any one of claims 11 to 12.
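The masking scheme of claims 1 to 3 and 10 and the inference step of claim 11 can be illustrated compactly in code. Below is a minimal, non-authoritative Python sketch of how the masked training pairs and the inference input might be constructed; the `MASK` id, the 30% masking ratio, and the helper names `mask_sequence` and `build_inference_input` are illustrative assumptions, not part of the claimed method.

```python
import random

MASK = 0  # assumed id of the "predetermined item": a placeholder matching no real content


def mask_sequence(history, ratio=0.3, seed=None):
    """Claims 1-3: replace a predetermined proportion of randomly chosen
    contents in the historical sequence with the placeholder item.
    Returns (input_sequence, positions, labels): the masked sequence,
    the masked positions, and the replaced contents (the training labels)."""
    rng = random.Random(seed)
    n_mask = max(1, int(len(history) * ratio))
    positions = sorted(rng.sample(range(len(history)), n_mask))
    input_seq = list(history)
    labels = []
    for p in positions:
        labels.append(input_seq[p])
        input_seq[p] = MASK
    return input_seq, positions, labels


def build_inference_input(history):
    """Claim 11: append the placeholder at the tail; the model's
    prediction at that position is the content to recommend."""
    return list(history) + [MASK]


history = [3, 7, 12, 5, 9]  # time series of contents used by the user
x, pos, y = mask_sequence(history)
assert all(x[p] == MASK for p in pos)          # masked positions hold the placeholder
assert [history[p] for p in pos] == y          # labels are the replaced contents
print(build_inference_input(history))          # [3, 7, 12, 5, 9, 0]
```

At training time (claim 10), the network's parameters would be adjusted, for example via cross-entropy over the item vocabulary, to maximize the predicted probability of `y` at positions `pos`; at inference time, the model's prediction at the final placeholder position corresponds to the content to be recommended.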
CN201910498647.4A 2019-06-10 2019-06-10 Content recommendation method, recommendation model generation method, device, and storage medium Active CN110287412B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910498647.4A CN110287412B (en) 2019-06-10 2019-06-10 Content recommendation method, recommendation model generation method, device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910498647.4A CN110287412B (en) 2019-06-10 2019-06-10 Content recommendation method, recommendation model generation method, device, and storage medium

Publications (2)

Publication Number Publication Date
CN110287412A true CN110287412A (en) 2019-09-27
CN110287412B CN110287412B (en) 2023-10-24

Family

ID=68003655

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910498647.4A Active CN110287412B (en) 2019-06-10 2019-06-10 Content recommendation method, recommendation model generation method, device, and storage medium

Country Status (1)

Country Link
CN (1) CN110287412B (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765353A (en) * 2019-10-16 2020-02-07 腾讯科技(深圳)有限公司 Processing method and device of project recommendation model, computer equipment and storage medium
CN110929882A (en) * 2019-11-21 2020-03-27 腾讯科技(深圳)有限公司 Feature vector calculation method based on artificial intelligence and related device
CN110956505A (en) * 2019-12-04 2020-04-03 腾讯科技(深圳)有限公司 Advertisement inventory estimation method and related device
CN111159542A (en) * 2019-12-12 2020-05-15 中国科学院深圳先进技术研究院 Cross-domain sequence recommendation method based on self-adaptive fine-tuning strategy
CN111626831A (en) * 2020-06-02 2020-09-04 重庆智者炎麒科技有限公司 Neural network-based ticketing method and ticketing system
CN112464104A (en) * 2020-11-13 2021-03-09 中国科学院深圳先进技术研究院 Implicit recommendation method and system based on network self-cooperation
CN112528147A (en) * 2020-12-10 2021-03-19 北京百度网讯科技有限公司 Content recommendation method and apparatus, training method, computing device, and storage medium
CN112597392A (en) * 2020-12-25 2021-04-02 厦门大学 Recommendation system based on dynamic attention and hierarchical reinforcement learning
CN112989172A (en) * 2019-12-02 2021-06-18 北京达佳互联信息技术有限公司 Content recommendation method and device, computer equipment and storage medium
CN113761308A (en) * 2021-01-28 2021-12-07 北京沃东天骏信息技术有限公司 Method, device and equipment for extracting features of article identification and storage medium
CN115618098A (en) * 2022-09-08 2023-01-17 淮阴工学院 Cold-chain logistics recommendation method and device based on knowledge enhancement and hole convolution
CN116541610A (en) * 2023-07-06 2023-08-04 深圳须弥云图空间科技有限公司 Training method and device for recommendation model

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107491432A (en) * 2017-06-20 2017-12-19 北京百度网讯科技有限公司 Low quality article recognition methods and device, equipment and medium based on artificial intelligence
US20180158164A1 (en) * 2016-12-07 2018-06-07 Tata Consultancy Services Limited System and method for context and sequence aware recommendation
CN109104620A (en) * 2018-07-26 2018-12-28 腾讯科技(深圳)有限公司 A kind of short video recommendation method, device and readable medium
CN109101505A (en) * 2017-06-20 2018-12-28 北京搜狗科技发展有限公司 A kind of recommended method, recommendation apparatus and the device for recommendation

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180158164A1 (en) * 2016-12-07 2018-06-07 Tata Consultancy Services Limited System and method for context and sequence aware recommendation
CN107491432A (en) * 2017-06-20 2017-12-19 北京百度网讯科技有限公司 Low quality article recognition methods and device, equipment and medium based on artificial intelligence
CN109101505A (en) * 2017-06-20 2018-12-28 北京搜狗科技发展有限公司 A kind of recommended method, recommendation apparatus and the device for recommendation
CN109104620A (en) * 2018-07-26 2018-12-28 腾讯科技(深圳)有限公司 A kind of short video recommendation method, device and readable medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XIANGNAN HE et al.: "Adversarial Personalized Ranking for Recommendation", https://arxiv.org/pdf/1808.03908.pdf, pages 1 - 10 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765353B (en) * 2019-10-16 2022-03-08 腾讯科技(深圳)有限公司 Processing method and device of project recommendation model, computer equipment and storage medium
CN110765353A (en) * 2019-10-16 2020-02-07 腾讯科技(深圳)有限公司 Processing method and device of project recommendation model, computer equipment and storage medium
CN110929882A (en) * 2019-11-21 2020-03-27 腾讯科技(深圳)有限公司 Feature vector calculation method based on artificial intelligence and related device
CN112989172B (en) * 2019-12-02 2024-03-12 北京达佳互联信息技术有限公司 Content recommendation method, device, computer equipment and storage medium
CN112989172A (en) * 2019-12-02 2021-06-18 北京达佳互联信息技术有限公司 Content recommendation method and device, computer equipment and storage medium
CN110956505B (en) * 2019-12-04 2021-03-16 腾讯科技(深圳)有限公司 Advertisement inventory estimation method and related device
CN110956505A (en) * 2019-12-04 2020-04-03 腾讯科技(深圳)有限公司 Advertisement inventory estimation method and related device
CN111159542A (en) * 2019-12-12 2020-05-15 中国科学院深圳先进技术研究院 Cross-domain sequence recommendation method based on self-adaptive fine-tuning strategy
CN111626831A (en) * 2020-06-02 2020-09-04 重庆智者炎麒科技有限公司 Neural network-based ticketing method and ticketing system
CN112464104A (en) * 2020-11-13 2021-03-09 中国科学院深圳先进技术研究院 Implicit recommendation method and system based on network self-cooperation
CN112464104B (en) * 2020-11-13 2024-05-14 中国科学院深圳先进技术研究院 Implicit recommendation method and system based on network self-cooperation
CN112528147A (en) * 2020-12-10 2021-03-19 北京百度网讯科技有限公司 Content recommendation method and apparatus, training method, computing device, and storage medium
CN112528147B (en) * 2020-12-10 2024-04-30 北京百度网讯科技有限公司 Content recommendation method and device, training method, computing device and storage medium
CN112597392A (en) * 2020-12-25 2021-04-02 厦门大学 Recommendation system based on dynamic attention and hierarchical reinforcement learning
CN113761308A (en) * 2021-01-28 2021-12-07 北京沃东天骏信息技术有限公司 Method, device and equipment for extracting features of article identification and storage medium
CN115618098A (en) * 2022-09-08 2023-01-17 淮阴工学院 Cold-chain logistics recommendation method and device based on knowledge enhancement and hole convolution
CN116541610B (en) * 2023-07-06 2023-09-29 深圳须弥云图空间科技有限公司 Training method and device for recommendation model
CN116541610A (en) * 2023-07-06 2023-08-04 深圳须弥云图空间科技有限公司 Training method and device for recommendation model

Also Published As

Publication number Publication date
CN110287412B (en) 2023-10-24

Similar Documents

Publication Publication Date Title
CN110287412A (en) Content recommendation method, recommended models generation method, equipment and storage medium
CN111680217B (en) Content recommendation method, device, equipment and storage medium
CN111368210B (en) Information recommendation method and device based on artificial intelligence and electronic equipment
CN108446374B (en) User's Intention Anticipation method, apparatus, electronic equipment, storage medium
CN109902849B (en) User behavior prediction method and device, and behavior prediction model training method and device
WO2022161202A1 (en) Multimedia resource classification model training method and multimedia resource recommendation method
JP6419860B2 (en) Feature processing trade-off management
EP4181026A1 (en) Recommendation model training method and apparatus, recommendation method and apparatus, and computer-readable medium
US10354184B1 (en) Joint modeling of user behavior
CN108431833A (en) End-to-end depth collaborative filtering
CN111461841A (en) Article recommendation method, device, server and storage medium
US20220245424A1 (en) Microgenre-based hyper-personalization with multi-modal machine learning
CN113240127B (en) Training method and device based on federal learning, electronic equipment and storage medium
CN111178986B (en) User-commodity preference prediction method and system
CN108475256A (en) Feature insertion is generated from homologous factors
CN111241394A (en) Data processing method and device, computer readable storage medium and electronic equipment
CN115885297A (en) Differentiable user-item collaborative clustering
WO2016132588A1 (en) Data analysis device, data analysis method, and data analysis program
Yue et al. Multiple auxiliary information based deep model for collaborative filtering
Tan et al. Recommendation Based on Users’ Long‐Term and Short‐Term Interests with Attention
CN113934851A (en) Data enhancement method and device for text classification and electronic equipment
Wu et al. Saec: Similarity-aware embedding compression in recommendation systems
CN115525831A (en) Recommendation model training method, recommendation device and computer readable storage medium
CN114580652A (en) Method and system for item recommendation applied to automatic artificial intelligence
CN117251622A (en) Method, device, computer equipment and storage medium for recommending objects

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant