CN110309427B

CN110309427B - Object recommendation method and device and storage medium

Info

Publication number: CN110309427B
Application number: CN201810553549.1A
Authority: CN
Inventors: 丘志杰; 饶君; 张博; 林乐宇; 冯喆; 陈磊; 胡澜涛; 刘书凯; 刘毅; 孙振龙; 王良栋
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2018-05-31
Filing date: 2018-05-31
Publication date: 2023-03-10
Anticipated expiration: 2038-05-31
Also published as: CN110309427A

Abstract

The embodiment of the invention provides an object recommendation method, an object recommendation device, a storage medium and computer equipment. Based on the characteristics of the recurrent neural network in coding calculation, the model coding mode of the embodiment takes into account both the long-term and the short-term historical interest of the user, as well as the order in which the user accesses objects on the application platform. The interest transition and interest accumulation of the user can therefore be located more accurately, which addresses the loss of diversity and personalization in the recommended objects obtained by the conventional ItemCF object recommendation method.

Description

Object recommendation method and device and storage medium

Technical Field

The invention relates to the technical field of data processing, in particular to an object recommendation method, an object recommendation device and a storage medium.

Background

Nowadays, the popularization of the internet brings a great amount of information to users and meets their demand for information in the information age. However, with the rapid development of networks, the amount of online information has increased greatly, and users facing this volume find it difficult to obtain the information that is really useful to them, so the efficiency of information use actually decreases. For this reason, recommendation systems have been proposed: information, products and the like that are of interest to a user are recommended according to the user's information needs and interests, so as to implement personalized information recommendation. Recommendation systems are widely applied in many fields, such as news recommendation, business recommendation, entertainment recommendation, learning recommendation and life recommendation.

At present, commonly used recommendation methods are mainly based on Collaborative Filtering (CF): for an Item accessed by a user (an Item may be an object such as an article or a video), Item-Item similarity is calculated, and the K Items most similar to that Item are taken as recommendation objects.

The Item-Item similarity calculation is usually realized by the ItemCF method, namely, the similarity of two Items is calculated from their co-occurrence in users' access object sequences. When different users access the same Items within a period of time but in different orders, the ItemCF-based recommendation method yields the same recommendation objects for these users. Personalized recommendation therefore cannot be realized, the recommendation accuracy is affected, and the obtained recommendation objects cannot take both the long-term interest and the short-term interest of the users into account.

Disclosure of Invention

The embodiment of the invention provides an object recommendation method, an object recommendation device, a storage medium and computer equipment, which can accurately locate the interest transition and interest accumulation of a user, meet the personalized recommendation requirements of different users, take both the long-term interest and the short-term interest of the user into account in the obtained recommended objects, and improve the accuracy of recommending objects to the user.

In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:

a method of object recommendation, the method comprising:

acquiring a user access sequence, wherein the user access sequence is generated based on objects that are output by an application platform and accessed by a user;

inputting the user access sequence into a recommended object prediction model for coding calculation to obtain a coding vector of the user access object, wherein the recommended object prediction model is obtained by training user access sequences corresponding to a plurality of sample users on the basis of a recurrent neural network;

similarity calculation is carried out on the coding vector and each candidate word vector;

and obtaining a recommended object of the user based on the similarity calculation result.

An object recommendation device, the device comprising:

a sequence acquisition module, used for acquiring a user access sequence, wherein the user access sequence is generated based on objects that are output by an application platform and accessed by a user;

a coding calculation module, used for inputting the user access sequence into a recommended object prediction model for coding calculation to obtain a coding vector of the user access object, wherein the recommended object prediction model is obtained by training user access sequences corresponding to a plurality of sample users on the basis of a recurrent neural network;

a first similarity calculation module, used for calculating the similarity between the coding vector and each candidate word vector;

and a recommended object selection module, used for obtaining the recommended object of the user based on the similarity calculation result.

A storage medium having stored thereon a computer program for execution by a processor to implement the steps of the object recommendation method as described above.

A computer device, the computer device comprising:

a communication interface;

a memory for storing a computer program for implementing the object recommendation method described above;

a processor for loading and executing the computer program stored in the memory, the computer program being used for implementing the steps of:

and obtaining the recommended object of the user based on the similarity calculation result.

Based on the foregoing technical solutions, embodiments of the present invention provide an object recommendation method, apparatus, storage medium, and computer device. The embodiment trains, on the basis of a recurrent neural network, user access sequences corresponding to a plurality of sample users to obtain a recommended object prediction model, uses the model to perform coding calculation on the user access sequence, and then performs similarity calculation between the obtained coding vector and the candidate word vectors to obtain the recommended object of the user. Based on the characteristics of the recurrent neural network in coding calculation, the model coding mode of the embodiment takes into account both the long-term and the short-term historical interest of the user, as well as the order in which the user accesses objects on the application platform. The interest transition and interest accumulation of the user can therefore be located more accurately, which addresses the loss of diversity and personalization in the recommended objects obtained by the conventional ItemCF recommendation method.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.

FIG. 1 is a schematic diagram of a network structure of each GRU layer in a recurrent neural network;

FIG. 2 is a schematic flowchart of an object recommendation method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a network architecture of a prediction model of a recommended object according to an embodiment of the present invention;

FIG. 4 is a schematic flowchart of another object recommendation method according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart of another object recommendation method according to an embodiment of the present invention;

FIG. 6 is a flowchart illustrating another object recommendation method according to an embodiment of the present invention;

FIG. 7 is a schematic diagram of a keyword sequence generation method according to an embodiment of the present invention;

FIG. 8 is a flowchart illustrating another object recommendation method according to an embodiment of the present invention;

FIG. 9 is a flowchart illustrating another object recommendation method according to an embodiment of the present invention;

FIG. 10 is a flowchart illustrating a method for recommending an object according to another embodiment of the present invention;

FIG. 11 is a schematic application flow diagram of an object recommendation method according to an embodiment of the present invention;

FIG. 12 is a schematic structural diagram of an object recommendation apparatus according to an embodiment of the present invention;

FIG. 13 is a schematic structural diagram of another object recommendation device according to an embodiment of the present invention;

FIG. 14 is a schematic structural diagram of another object recommendation apparatus according to an embodiment of the present invention;

FIG. 15 is a schematic structural diagram of another object recommendation apparatus according to an embodiment of the present invention;

FIG. 16 is a schematic hardware structure diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The inventor of the present invention found the following. Suppose that, within a certain time, user A reads articles [X, Y, Z] and user B reads articles [Y, Z, X], where X, Y and Z may be, but are not limited to, the article IDs of the articles read. When the recommended articles of the two users are obtained, the recommended objects derived from the similarity calculation results are the same. The main reason is that the ItemCF-based object recommendation method only considers which articles were read historically and does not consider the order in which the user read them, so the respective interest transition and interest accumulation of the two users cannot be accurately reflected.
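The order-insensitivity described above can be illustrated with a minimal sketch. It uses a simplified pairwise co-occurrence count as a stand-in for ItemCF similarity statistics; the function name `cooccurrence_counts` and the toy sequences are hypothetical and are not taken from the embodiment.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sequences):
    """Count how often each unordered pair of items appears in the same sequence.

    This is a simplified stand-in for the co-occurrence statistics that an
    ItemCF-style method builds on; it is not a full ItemCF implementation.
    """
    counts = Counter()
    for seq in sequences:
        # Sorting the distinct items discards the access order on purpose.
        for a, b in combinations(sorted(set(seq)), 2):
            counts[(a, b)] += 1
    return counts


user_a = ["X", "Y", "Z"]  # reading order of user A
user_b = ["Y", "Z", "X"]  # same articles, different order

counts_a = cooccurrence_counts([user_a])
counts_b = cooccurrence_counts([user_b])

# Both users produce identical statistics, so any recommendation derived
# only from these counts cannot distinguish them.
print(counts_a == counts_b)  # prints True
```

Because the statistics depend only on which items co-occur, any ranking computed from them is identical for the two users, which is the limitation the embodiment sets out to remove.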

Moreover, the inventor also found that, when similarity calculation is carried out, it is difficult to control how many candidate objects are used. If only candidate objects from the most recent period are selected for similarity calculation, the obtained recommended objects can only represent the short-term interest of the user. If candidate objects with a long history are selected, the amount of calculation is large and very many recommended objects are obtained; these must then be sorted and screened by a sorting algorithm, which complicates the process and reduces the efficiency of obtaining the recommended objects of the user.

Based on this analysis, the inventor provides a new object recommendation method. The method considers the order in which the user accesses objects on the application platform (for example, reading articles or watching videos), so that the interest transition and interest accumulation of the user are located more accurately. It also takes into account both the long-term and the short-term historical interest of the user, improves the accuracy and efficiency of the recommendation results, and meets the personalized recommendation requirements of different users.

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.

For convenience of understanding the recommendation method provided in this embodiment, the principle of the Recurrent Neural Network (RNN) used in this embodiment is briefly described here. The RNN is an artificial neural network whose nodes are connected in a directed cycle, so its internal state can exhibit dynamic temporal behavior. Unlike a feedforward neural network, the RNN can use its internal memory to process input sequences of arbitrary length, which makes tasks such as unsegmented handwriting recognition and speech recognition easier to handle.

As a neural network capable of processing variable-length data, the RNN can encode history information of any length into a hidden layer, that is, an intermediate output of the neural network that represents some implicit expression of the input, usually a vector or a matrix. The RNN can compress high-dimensional data into a lower dimension, as shown in equation (1) below. In particular, with the widespread application of LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) in RNNs in recent years, RNNs have been successfully used to solve Natural Language Processing (NLP) problems such as machine translation, sequence prediction and speech signal processing.

$$h_t = g(W x_t + U h_{t-1}) \tag{1}$$

In formula (1), $x_t$ represents the currently input token vector (in this embodiment, the token vector may be an embedding vector of the access object or of a keyword in the access object, and this embodiment may also refer to this embedding vector as a word vector); $h_{t-1}$ represents the hidden layer output at the previous time, and at the initial time $h_{t-1}$ is a zero vector; W and U are mapping matrices. In practical applications, this formula may be used to fuse the current input with the history information to obtain a new encoding vector.

Therefore, in the application of RNN, at each moment, effective encoding calculation is performed on the history information and the current input to obtain a new data expression vector, i.e., an encoding vector. In this embodiment, the history information may be history access data (e.g., data generated by reading an article, watching a video, etc.) generated by a user accessing an object output by the application platform, and the current input may be a word vector of the access object at the current time or a word vector of a keyword of the access object, etc.

In the coding calculation process of the hidden layer of the recurrent neural network, the above formula shows that the output of the previous moment is added with the input of the current moment, the output of the next moment is obtained through an activation function such as a tanh function, and so on, and finally the output is the coding vector required by the invention.
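The recurrence of formula (1) can be sketched in a few lines. This is a minimal illustration only, assuming g is tanh and using small hand-picked matrices; the helper names `matvec`, `rnn_step` and `encode` and all numeric values are hypothetical and are not part of the embodiment.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_step(W, U, x_t, h_prev):
    """One step of h_t = g(W x_t + U h_{t-1}), with g assumed to be tanh."""
    wx = matvec(W, x_t)
    uh = matvec(U, h_prev)
    return [math.tanh(a + b) for a, b in zip(wx, uh)]


def encode(W, U, inputs):
    """Fold a sequence of input vectors into a final hidden state."""
    h = [0.0] * len(U)  # the initial hidden state is a zero vector
    for x_t in inputs:
        h = rnn_step(W, U, x_t, h)
    return h


# Toy parameters and toy "word vectors" for two accessed objects.
W = [[0.5, -0.2], [0.1, 0.3]]
U = [[0.4, 0.0], [0.0, 0.4]]
x = [1.0, 0.0]
y = [0.0, 1.0]

enc_xy = encode(W, U, [x, y])  # user accessed x, then y
enc_yx = encode(W, U, [y, x])  # user accessed y, then x
print(enc_xy != enc_yx)
```

With these toy values the two access orders yield different final hidden states, which is the property the embodiment relies on to distinguish users who accessed the same objects in different orders.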

A conventional RNN does not take into account that, as the number of unrolled layers increases, the network becomes very deep, which leads to anomalies such as vanishing gradients and exploding gradients during backpropagation. To solve this problem, the recurrent neural network used in this embodiment may be a modified recurrent neural network obtained by adding a GRU or LSTM model to a conventional RNN. The recurrent neural network in this embodiment usually includes multiple GRU layers or multiple LSTM layers, and its specific network structure is not described in detail. This embodiment only takes a recurrent neural network including a plurality of GRU layers as an example to describe the calculation process for obtaining the coding vector, and the principle of the GRU is not described in detail herein.

When the hidden layer is calculated using the GRU, as shown in FIG. 1, the current hidden layer output can be regarded as a linear weighted combination of the hidden layer output at the previous time and the current candidate hidden output, as in equation (2):

$$h_t = (1 - z_t)\, h_{t-1} + z_t\, \tilde{h}_t \tag{2}$$

In formula (2), $\tilde{h}_t$ is the candidate hidden output at the current time (namely, time t), that is, the intermediate output of the current hidden layer, which needs to be fused by weighting with the hidden layer output at the previous time to obtain the final hidden layer output. The weighting factor $z_t$ is calculated adaptively from the hidden layer output $h_{t-1}$ at the previous time and the current input $x_t$, using the following formula:

$$z_t = \sigma(W_z x_t + U_z h_{t-1}) \tag{3}$$

It can be seen that the weighting factor $z_t$ (i.e., the update gate of the GRU) changes as the hidden layer output $h_{t-1}$ at the previous time and the current input change, but $z_t$ is always mapped into the interval (0, 1). The larger $z_t$ is, the greater the weight given to the current input information; the specific value of $z_t$ is not limited. In the GRU, U and W in formula (3) are usually small, and σ may be a coefficient factor, shown as the second σ from the left in FIG. 1; this embodiment does not limit the specific values of U, W and σ.

In addition, the current candidate hidden output $\tilde{h}_t$ in formula (2) can be calculated using an activation function, such as the hyperbolic tangent function tanh, which retains the sign and magnitude information of the original input. The specific calculation is given by formula (4):

$$\tilde{h}_t = \tanh\left(W x_t + U (r_t \odot h_{t-1})\right) \tag{4}$$

In formula (4), $\odot$ denotes element-wise multiplication, $r_t$ is the reset gate, and $z_t$ in the formulas above is the update gate; these are the two gates of the GRU model. The reset gate may be calculated as shown in equation (5), but is not limited thereto.

$$r_t = \sigma(W_r x_t + U_r h_{t-1}) \tag{5}$$

σ in equation (5) may be a coefficient factor, shown as the first σ from the left in FIG. 1, and its specific value is not limited. In addition, the same letters in the above formulas have the same meaning; for their explanation, refer to the corresponding description of formula (1).

Combining the description of equations (2) to (5) with the structure of the GRU shown in FIG. 1, the GRU comprises two gates, namely the reset gate $r_t$ and the update gate $z_t$. The reset gate $r_t$ is multiplied by the hidden layer output $h_{t-1}$ at the previous time to determine whether, or how much, to reset; the result is then combined with the input $x_t$ at the current time and passed through the activation function tanh to obtain the hidden variable $\tilde{h}_t$. The output $h_{t-1}$ at the previous time and the hidden variable $\tilde{h}_t$ are then linearly combined to obtain the output at the current time, and these steps are repeated to obtain the coding vector required for prediction. The weights of the two terms in the linear combination sum to 1, and the weight of the hidden variable $\tilde{h}_t$ is the output of the update gate, which represents how strong the update is.
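The gate interaction described above can be sketched as a single GRU step. This is an illustrative sketch under stated assumptions: σ is taken to be the logistic sigmoid (the usual choice in a GRU, which the text does not fix), the candidate output uses tanh, and the parameter dictionary keys, the helper names `sigmoid`, `matvec`, `gate` and `gru_step`, and all numeric values are hypothetical.

```python
import math


def sigmoid(a):
    """Logistic function, assumed here as the gate activation sigma."""
    return 1.0 / (1.0 + math.exp(-a))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gate(W, U, x_t, h_prev):
    """Shared form of the update and reset gates: sigma(W x_t + U h_{t-1})."""
    return [sigmoid(a + b) for a, b in zip(matvec(W, x_t), matvec(U, h_prev))]


def gru_step(params, x_t, h_prev):
    """One GRU step following the verbal description of the two gates."""
    z_t = gate(params["W_z"], params["U_z"], x_t, h_prev)  # update gate
    r_t = gate(params["W_r"], params["U_r"], x_t, h_prev)  # reset gate
    # The reset gate scales the previous output before it enters tanh.
    reset_h = [r * h for r, h in zip(r_t, h_prev)]
    candidate = [
        math.tanh(a + b)
        for a, b in zip(matvec(params["W"], x_t), matvec(params["U"], reset_h))
    ]
    # Linear combination whose two weights, (1 - z) and z, sum to 1.
    return [(1.0 - z) * h + z * c for z, h, c in zip(z_t, h_prev, candidate)]


# Sanity check with all-zero weights: both gates equal 0.5 and the candidate
# is 0, so the new state is exactly half of the previous state.
zeros = [[0.0, 0.0], [0.0, 0.0]]
zero_params = {name: zeros for name in ("W_z", "U_z", "W_r", "U_r", "W", "U")}
h_next = gru_step(zero_params, [1.0, 1.0], [0.2, -0.4])
print(h_next)
```

The all-zero case is chosen only because it can be verified by hand; in a trained model the gate values vary per dimension and per time step, which is what lets the network keep long-term information in some dimensions while overwriting others with recent input.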

In the practical application of this embodiment, before the recommended object of the current user is obtained, a recommended object prediction model needs to be obtained through training. From the above description of the recurrent neural network principle, it can be seen that the recurrent neural network takes the time order of its input into account when performing coding calculation, thereby meeting the personalization requirements on the prediction results of each user. Moreover, whether the model input is a longer or a shorter sequence, its ordering is reflected during encoding, so that the obtained recommendation results can take both the long-term historical interest and the short-term historical interest of the user into account.

Based on this, in this embodiment, user access sequences corresponding to multiple sample users may be obtained, and based on the recurrent neural network, model training is performed on these user access sequences to obtain a recommendation object prediction model.

The user access sequence of the sample user may be formed by an object identifier of an access object of the sample user on the application platform, for example, according to an access time sequence, the user access sequence is generated by an access object ID extracted from historical access data, at this time, the user access sequence may be referred to as an access object sequence, and this embodiment may use the access object sequence as training data to implement model training.

In the process of continuously training and optimizing the model, when the difference between the model prediction result (such as a recommended object) and the actual access object meets a preset condition, the model obtained at that point meets the constraint condition and can be used as the recommended object prediction model. The content of the constraint condition is not limited, however; this embodiment may instead limit the number of optimization iterations or the number of pieces of training data to control the training and optimization process, and the specific method of obtaining the recommended object prediction model is not limited in this embodiment.

As an optional embodiment, in order to improve the real-time performance of the recommendation result, this embodiment may, when acquiring the training data, take the sequence within each session of each user as training data. A session may be the time window from the user's current interface to the next refresh, within which the time difference between items is usually small. When model training is carried out using such training data, the recommended objects predicted by the resulting recommended object prediction model can obtain user feedback within a shorter time interval, so that the real-time performance of the recommendation result is ensured and the short-term interest of the user is easier to grasp. Following the same idea, this embodiment can also extend the history length of the training data to capture the long-term interest of the user.

Further, in order to enrich the training data, this embodiment may construct a plurality of pieces of training data from the historical access data of one sample user. Assuming that the access object sequence obtained from the object identifiers of the objects the user accessed on the application platform is [x1, x2, x3, x4, x5, x6], the following pieces of training data may be formed in this embodiment: ([x1], x2), ([x1, x2], x3), ([x1, x2, x3], x4), ([x1, x2, x3, x4], x5), and ([x1, x2, x3, x4, x5], x6), where the first element in each pair of parentheses is the currently known access object sequence and the second element is the target object to be predicted. During model training, these pieces of training data may all be used to improve the prediction accuracy of the obtained recommended object prediction model.
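The construction of multiple training pairs from one access object sequence can be sketched as follows. The function name `prefix_training_pairs` is hypothetical, and the string identifiers stand in for whatever object IDs the application platform actually uses.

```python
def prefix_training_pairs(sequence):
    """Expand one access sequence into (known prefix, next target) pairs.

    For a sequence of length n this yields n - 1 pairs; a sequence with
    fewer than two elements yields no training data.
    """
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


access_sequence = ["x1", "x2", "x3", "x4", "x5", "x6"]
pairs = prefix_training_pairs(access_sequence)

for known, target in pairs:
    print(known, "->", target)
```

Each pair keeps the original access order inside the prefix, so a model trained on these pairs sees both short histories (one or two objects) and longer ones drawn from the same user.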

It should be noted that, for the recommendation object prediction model obtained in this embodiment, the recommendation object prediction model may be optimized by using the user access sequence updated by the sample user over time, so as to improve the prediction accuracy, and the optimization process of this embodiment is not described in detail herein.

Referring to FIG. 2, which is a flowchart of an object recommendation method provided by an embodiment of the present invention, the method may be applied to the service side, that is, the method may be implemented by a server, and specifically includes, but is not limited to, the following steps:

step S101, acquiring a user access sequence;

The user access sequence may be generated based on the objects that are output by the application platform and accessed by the user; the specific generation manner is not limited in this embodiment. The sequence elements of the user access sequence may, as needed, be object identifiers of the access objects or keywords of the access objects. User access sequences with different contents are often generated in different ways; for details, refer to the description of the corresponding embodiments below.

In the present invention, a user access sequence whose sequence element is an object identifier may be referred to as an access object sequence, and a user access sequence whose sequence element is a keyword may be referred to as a keyword sequence, and the type of the user access sequence in the embodiment is not limited to the two sequences listed in the embodiment.

Step S102, inputting the user access sequence into a recommended object prediction model for coding calculation to obtain a coding vector of the user access object;

as described above, the recommendation object prediction model may be obtained by training user access sequences corresponding to a plurality of sample users based on a recurrent neural network, and for different training data, the representation forms of the obtained recommendation object prediction model may be different, and the meanings of the model output data representations may be different, but the processing logics of the models may be the same, and the present embodiment does not describe in detail the encoding calculation process of the recommendation object prediction model on the input sequences.

When a user access sequence is input to the recommendation object prediction model, each sequence element in the user access sequence is input to the recommendation object prediction model in sequence, and at this time, one input of the recommendation object prediction model may be a sequence element at a corresponding time.

The following takes the schematic diagram of the recommended object prediction model architecture shown in FIG. 3 as an example to briefly describe how the user access sequence is processed after being input to the model. The recommended object prediction model of this embodiment may include a plurality of GRU layers, that is, the plurality of GRU cells in FIG. 3. In combination with the description of the GRU principle, the input of each GRU layer is the hidden layer output at the previous time and the current input, and its output is the hidden layer state information for the next time.

In the calculation of each GRU layer, the reset gate and the update gate of that layer are usually used to calculate the candidate hidden layer, and they control how much information from the previous hidden layer is kept and how much information from the candidate hidden layer is added to obtain the output. Therefore, when the recommended object prediction model provided by this embodiment is used to encode the user access sequence, the plurality of GRU layers can flexibly control long-distance and short-distance dependency information, which suits sequence data: the articles the user read on the application platform over a long period are retained, while recently read articles are highlighted. The coding vector obtained by this embodiment for predicting the recommended object of the user thus takes into account both the long-term interest and the short-term interest of the user as well as the order in which the user accessed the objects, which improves the prediction accuracy and meets the personalized recommendation requirements of different users.

It should be noted that the coding vector obtained in step S102 may be the coding vector output by the last hidden layer in the recommended object prediction model. The recommended object prediction model is not limited to the schematic architecture shown in FIG. 3; the middle layer may also include a plurality of LSTM layers, and the hidden layer calculation may be implemented based on the LSTM principle to obtain the coding vector of the user access object.

The LSTM is a time-recursive neural network that is generally suitable for processing and predicting important events separated by relatively long intervals and delays in a time sequence. It typically introduces three gates to handle the memory/forgetting, input degree and output degree of the memory unit, so its structure is relatively complex. The GRU introduces a Reset Gate and an Update Gate, requires fewer parameters, trains faster and has a relatively simple structure. This embodiment may select, according to actual needs, which recurrent neural network is used to train the recommended object prediction model and to encode the user access sequence of the current user; the description here only takes the architecture shown in FIG. 3 as an example.

Step S103, similarity calculation is carried out on the coding vector and each candidate word vector;

it should be noted that the dimension of the encoding vector is consistent with that of the candidate word vector, and certainly, in the encoding calculation process, the dimension of the generated word vector is the same as that of the finally obtained encoding vector, so as to ensure normal operation of similarity calculation.

In embodiments for different scenarios, the candidate word vectors may be obtained in different ways, for example, as word vectors obtained directly from the objects the user accessed historically, or as word vectors obtained from the keywords of those objects. The specific generation manner of the word vectors is not limited in this embodiment; for example, the word vectors of the accessed objects or keywords may be obtained using Word2Vec, but this embodiment is not limited thereto.

Word2Vec generates the corresponding Embedding by assigning a dense vector to each word; compared with the one-hot encoding used for discrete features, it preserves semantic information between words. This embodiment does not describe in detail the specific implementation by which the Embedding Layer in the recommended object prediction model computes the corresponding Embedding for each access object or keyword.

Optionally, in this embodiment, a similarity algorithm such as cosine similarity may be used to calculate the similarity between two vectors. This embodiment is not limited to this similarity calculation method; it is used here only as an example to describe the similarity calculation.

The cosine similarity is calculated using the following formula (6):

$$\cos(u, v) = \frac{\sum_{i=1}^{n} u_i v_i}{\sqrt{\sum_{i=1}^{n} u_i^{2}}\,\sqrt{\sum_{i=1}^{n} v_i^{2}}} \tag{6}$$

In formula (6), u and v represent the encoding vector of the user access object and the candidate word vector, respectively; the two vectors have the same dimension n; u_i represents the feature value of the i-th dimension of the encoding vector, and v_i represents the feature value of the i-th dimension of the candidate word vector.
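As a minimal sketch of the cosine similarity of formula (6), assuming the vectors are plain Python lists of floats: the function name `cosine_similarity` and the handling of zero vectors are illustrative assumptions and are not specified by the embodiment.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between an encoding vector u and a candidate word vector v."""
    if len(u) != len(v):
        # The embodiment requires both vectors to have the same dimension.
        raise ValueError("u and v must have the same dimension")
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero vector is treated as having no similarity.
        return 0.0
    return dot / (norm_u * norm_v)
```

The dimension check mirrors the requirement stated above that the encoding vector and the candidate word vector have the same dimension.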

Step S104, obtaining a recommendation object of the user based on the similarity calculation result;

Optionally, in this embodiment, for the plurality of stored candidate objects, the higher the similarity between a candidate's word vector and the encoding vector, the higher the probability that this candidate object becomes a recommendation object of the user, that is, the more likely it is to be the predicted recommendation object. In other words, candidate objects with greater similarity are more likely to be of interest to the user.

Based on this, the embodiment may select the candidate objects whose similarity reaches a preset threshold as the recommended objects of the user, or may select the several candidate objects with the highest similarity as the recommended objects of the user, and so on.
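The two selection strategies just described can be sketched as follows. This is an illustration only: the function names, the dictionary of precomputed similarities, and the choice to return results in descending order of similarity are assumptions.

```python
from typing import Dict, List


def select_by_threshold(similarities: Dict[str, float], threshold: float) -> List[str]:
    """Keep every candidate whose similarity reaches the preset threshold."""
    kept = [cid for cid, sim in similarities.items() if sim >= threshold]
    return sorted(kept, key=lambda cid: similarities[cid], reverse=True)


def select_top_n(similarities: Dict[str, float], n: int) -> List[str]:
    """Keep the n candidates with the highest similarity."""
    ranked = sorted(similarities, key=lambda cid: similarities[cid], reverse=True)
    return ranked[:n]
```

A threshold yields a variable number of recommended objects, while the top-n rule fixes the number in advance; which one suits a given platform is left open by the embodiment.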

As another embodiment, the candidate word vectors may be word vectors of keywords. Similarly, the higher the similarity, the more interested the user is in the corresponding keyword, and the more likely that keyword is to be selected as a candidate keyword for determining the recommendation object. In this case, the similarity may represent the probability that the user is interested in the corresponding keyword, and may further represent the probability that a candidate object containing that keyword becomes the recommendation object. In this embodiment, a plurality of keywords with higher similarity may be selected to determine the recommendation objects that the user is most likely to be interested in; for the specific implementation process, refer to the description of the corresponding embodiment below.

In summary, in this embodiment, a recommendation object prediction model is obtained by training, based on a recurrent neural network, on the user access sequences of a plurality of sample users, and the model performs the encoding calculation on the user access sequence. The encoding takes into account not only the long-term and short-term historical interests of the user but also the order in which the user accessed objects on the application platform. Therefore, the recommendation objects obtained from the similarity between the resulting encoding vector and each candidate word vector can accurately capture the interest transitions and interest accumulation of the user, which addresses the loss of diversity and personalization in the recommendation objects obtained by the existing ItemCF recommendation method.

As an optional embodiment of the present invention, in the training process of the recommendation object prediction model, the historical access objects (Items) of the sample users on the application platform may be used directly as access object sequences; training data are obtained from these access object sequences, and model training is performed based on the recurrent neural network to obtain the recommendation object prediction model. In this case, the object recommendation method shown in fig. 4 may be adopted to obtain the recommended objects of the current user, which serve as candidates for the subsequent primary selection logic; the sorting logic then yields the target recommended objects pushed to the user client, so that the user can access the required objects quickly and accurately.

Fig. 4 shows a flowchart of another object recommendation method, which may also be applied to the server side and specifically includes, but is not limited to, the following steps:

Step S201, acquiring a plurality of pieces of historical access data of a user;

In practical application, after a user logs in to an application platform, the platform usually outputs a plurality of access objects (such as articles, videos, pictures and the like). Because the full content of the access objects is too large to display at once, the application platform often outputs summaries or tags of the access objects, and the user needs to enter the display interface of an access object to view its specific content.

For example, on a news reading application platform, a plurality of news headlines are usually output, and a user can select interesting news according to the headlines of the news and enter the news content display interface to read the detailed content of the news. In the video playing application platform, a plurality of videos are also displayed, so that each video also has a corresponding title or brief description for the convenience of user selection, and the user can enter the video playing interface to play the video content only when selecting the video of interest.

Accordingly, the application platform in this embodiment may be various APP application platforms commonly used by a user, such as an audio and video application platform, a browser application platform, an instant messaging application platform, and other social application platforms, and correspondingly, an access object output by the application platform and provided for the user to access may be a video, an article, and the like, which may be recorded as Item.

Moreover, during the object access operation of the user on the application platform, corresponding access data is generated, and the access data can be stored in a database of the application platform as historical access data to indicate the historical access behavior of the user on the application platform. It should be noted that, during the operation of the user on the application platform, besides the historical access data, other historical behavior data may also be generated.

The historical access data includes information such as an object identifier of the access object, content and title of the access object, and the object identifier indicates which object the user accesses, such as which article the user reads, which video the user watches, and so on, i.e., to distinguish the access objects. Therefore, the object identifier in this embodiment may be an object ID, and the content included in the history access data and the content referred by the object identifier are not limited in this embodiment.

Optionally, when the application platform stores historical behavior data (which includes historical access data), the historical behavior data may be classified and stored according to user identifiers, that is, different users operate on the application platform, and the generated historical behavior data may be stored after being associated with the user identifier of the user, so as to quickly find the historical behavior data of a certain user in the following. Based on this, the present embodiment may directly query the historical access data associated with the user identifier of the target user from the database of the application platform, but is not limited to this obtaining manner.

Step S202, an access object sequence is formed by object identifications respectively contained in a plurality of historical access data;

As described above, each piece of historical access data acquired in this embodiment may include an object identifier, such as an object ID, indicating the object accessed at that time. After multiple pieces of historical access data within an effective duration are acquired, the object identifier contained in each piece may be extracted, and the corresponding access object sequence is generated according to the generation time of each piece of historical access data (i.e., the access time of the corresponding access object). That is, the elements of the access object sequence are the object identifiers of the objects the user accessed on the application platform, sorted by generation time, for example in chronological order.
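The construction of the access object sequence in step S202 can be sketched as below. The record type `AccessRecord`, its field names, and the representation of the effective duration as a closed time window are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AccessRecord:
    """One piece of historical access data (field names are assumptions)."""
    object_id: str
    access_time: float  # e.g. a Unix timestamp


def build_access_sequence(
    records: List[AccessRecord], window_start: float, window_end: float
) -> List[str]:
    """Return the object IDs accessed within the window, in chronological order."""
    in_window = [r for r in records if window_start <= r.access_time <= window_end]
    in_window.sort(key=lambda r: r.access_time)
    return [r.object_id for r in in_window]
```

Sorting by access time is what preserves the order information that the recurrent model later relies on.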

For example, if x1, x2, x3, x4, x5, x6, etc. are used to represent the object IDs of the access objects, the access object sequence generated in this embodiment may be [x1, x2, x3, x4, x5, x6]. From the sequence elements of the access object sequence, it can be determined which objects the user accessed on the application platform within the effective duration. If the access objects are articles, the articles read by the user within the effective duration are known, and the articles are sorted in reading order, so that the interest changes and interest accumulation of the user can be accurately located.

Step S203, inputting the access object sequence into a recommended object prediction model for coding calculation to obtain a coding vector of the user access object;

The recommended object prediction model may be obtained by training access object sequences corresponding to a plurality of sample users based on a recurrent neural network, and the specific training process may refer to the description of the corresponding portion of the above embodiment.

After obtaining the recommendation object prediction model, the sequence elements included in the access object sequence may be sequentially input into the recommendation object prediction model, and the process of calculating the coding of the input sequence by the recommendation object prediction model may refer to the above description process of the RNN principle, which is not described in detail herein.

Referring to the schematic diagram of the recommended object prediction model architecture shown in fig. 3, in this embodiment, the input Layer inputs an access object sequence of a user, a sequence element of the access object sequence is an access object Item that the user has accessed on an application platform, and after the dimensionality reduction calculation of an Embedding Layer, an Embedding vector (which may be referred to as a word vector in this embodiment) corresponding to each access object Item is generated. The Embedding Layer of the recurrent neural network is actually a data dimension reduction processing Layer, and this embodiment can be specifically implemented by using Word2Vec, and the specific implementation method is not limited.

After the embedding vector corresponding to each Item (that is, the embedding vector for each time step) is obtained, it may be input into an intermediate layer composed of a plurality of GRU cells. In the calculation of each layer, the encoding calculation is performed using the above formulas (2), (3), (4) and (5); that is, the hidden-layer output of the previous time step (usually an encoding vector) and the input of the current time step (the word vector corresponding to the current time step) are combined to update the hidden state iteratively, and the finally output encoding vector is the encoding vector of the user access object.
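The iterative update described above can be illustrated with a small pure-Python GRU. Formulas (2) to (5) are not reproduced in this section, so the gate equations below follow the common textbook GRU form and may differ in detail from the formulas of the embodiment (for example, in which of z or 1 - z weights the previous state). Bias terms are omitted, and `init_params` uses random weights as a stand-in for trained parameters; all names are hypothetical.

```python
import math
import random
from typing import Dict, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, x: Vector) -> Vector:
    return [sum(w * xi for w, xi in zip(row, x)) for row in m]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def init_params(dim: int, seed: int = 0) -> Dict[str, Matrix]:
    """Random dim x dim weights standing in for trained parameters."""
    rng = random.Random(seed)
    names = ["Wz", "Uz", "Wr", "Ur", "Wh", "Uh"]
    return {
        name: [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]
        for name in names
    }


def gru_step(p: Dict[str, Matrix], x: Vector, h_prev: Vector) -> Vector:
    """One update: combine the previous hidden state with the current input."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h_prev))]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    h_new = [
        math.tanh(a + b)
        for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], gated))
    ]
    return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h_prev, h_new)]


def encode_sequence(
    p: Dict[str, Matrix], embeddings: Dict[str, Vector], sequence: List[str]
) -> Vector:
    """Feed the embedding of each sequence element in order; return the last state."""
    h = [0.0] * len(p["Wz"])
    for item_id in sequence:
        h = gru_step(p, embeddings[item_id], h)
    return h
```

Because the embedding vectors and the hidden state share one dimension in this sketch, the final state can be compared directly with candidate word vectors, consistent with the dimension requirement of the embodiment.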

It should be noted that the method of encoding and calculating the access object sequence to obtain the encoding vector is not limited to the model architecture diagram shown in fig. 3, and may be implemented by constructing a recommended object prediction model using a plurality of LSTM layers, and may be implemented by referring to the LSTM principle, and the embodiment is not described in detail here.

In addition, in the encoding calculation process of this embodiment, the dimension of the embedding vector generated by each access object Item needs to be consistent with the dimension of the encoding vector to ensure that the similarity calculation can be performed subsequently, and the specific content and dimension of the vector are not limited in this embodiment.

Step S204, a plurality of candidate access objects are obtained;

Optionally, in this embodiment, high-quality access objects output by the application platform may be selected as candidate objects to form a candidate set. A high-quality object may be an object with a high access rate on the application platform, an object related to a recently trending topic on social networks, an object with a large number of accesses in a recent period, an object collected after being randomly output by the application platform, and the like.

In practical application, the candidate objects in the candidate set may be continuously updated according to the manner of selecting the candidate objects along with the change of time, so as to improve the accuracy of obtaining the recommended object of the user subsequently, and the updating manner of the candidate set is not limited in this embodiment.

Step S205, inputting each candidate access object into a language model to obtain a corresponding candidate word vector;

Optionally, the language model may be Word2Vec, but is not limited thereto. If each element (i.e., candidate object) in the candidate set is denoted Xn, where n is a positive integer whose specific value is not limited, the candidate set formed by these candidate objects may be [X1, X2, X3, …, Xn], and an embedding vector (which may be referred to as a candidate word vector) may then be calculated for each candidate object.

It should be noted that the above steps S204 and S205 may be executed at any step before the similarity calculation, and are not limited to the position described in the present embodiment. The embodiment may store the acquired candidate set, and when the similarity calculation is required, may calculate a candidate word vector corresponding to each candidate access object in the candidate set, that is, the manner described in the embodiment.

Of course, the embodiment may also pre-calculate the candidate word vector corresponding to each candidate access object in the candidate set, and directly store each candidate word vector, so that when similarity calculation with the encoding vector is required, the pre-stored candidate word vector corresponding to each candidate access object is directly obtained, online calculation is not required, and the work efficiency is improved. In this case, step S204 and step S205 may be combined as: the candidate word vectors corresponding to the multiple candidate access objects are obtained, other steps are the same, and this embodiment is not separately illustrated.

Step S206, similarity calculation is carried out on the coding vector and candidate word vectors of each candidate object;

It should be noted that this embodiment does not limit the method for calculating the similarity between two vectors. The cosine similarity calculation described above may be adopted, as may the reciprocal of a distance, the Pearson correlation coefficient, and the like.
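The two alternatives to cosine similarity named above can be sketched as follows. The embodiment does not state which distance is used or how a zero distance is handled; Euclidean distance and the form 1 / (1 + d) are assumptions chosen here to avoid division by zero, and the function names are illustrative.

```python
import math
from typing import Sequence


def reciprocal_distance_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Similarity as 1 / (1 + Euclidean distance); identical vectors score 1.0."""
    distance = math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))
    return 1.0 / (1.0 + distance)


def pearson_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Pearson correlation coefficient between the components of u and v."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((ui - mean_u) * (vi - mean_v) for ui, vi in zip(u, v))
    var_u = sum((ui - mean_u) ** 2 for ui in u)
    var_v = sum((vi - mean_v) ** 2 for vi in v)
    if var_u == 0.0 or var_v == 0.0:
        # Assumption: a constant vector has no defined correlation; return 0.
        return 0.0
    return cov / math.sqrt(var_u * var_v)
```

Unlike cosine similarity, the distance-based measure is sensitive to vector magnitude, and the Pearson coefficient is cosine similarity computed after subtracting each vector's mean.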

Step S207, based on the similarity calculation result, selects a recommended object of the user from the plurality of candidate objects.

After the similarity between each candidate access object and the prediction result is obtained, that is, the similarity between each candidate word vector and the encoding vector, the similarity may represent the probability of the user being interested in the corresponding candidate access object, and the greater the similarity, the more likely the user is interested in the corresponding candidate access object, and the corresponding access object is selected to be accessed. Therefore, in this embodiment, several candidate access objects with the highest similarity may be directly selected as recommendation objects, or candidate access objects with the similarity reaching a preset threshold may be selected as recommendation objects, and the like.

For example, suppose the sequence of articles read by user U1 on a certain application platform is [x_u1, x_u2, x_u3, x_u4, x_u5, x_u6], and the candidate set of the application platform includes 10 candidate articles, [X1, X2, X3, X4, X5, X6, X7, X8, X9, X10]. According to the above method, the article sequence of the user is encoded to obtain an encoding vector E_u1, and the cosine similarity between E_u1 and the candidate word vector E_j of each candidate article is calculated, where j denotes the j-th candidate article. If the candidate articles corresponding to the three word vectors with the highest similarity (or to the word vectors whose similarity reaches the preset threshold) are determined to be X3, X5 and X7, these may be used as the recommended articles in this embodiment. They form the recall result of the recall logic, from which the subsequent screening logic in the recommendation system finally selects the target recommended article pushed to the client of user U1.

In summary, in this embodiment, the training data of the model is formed by the access objects of the user, and the coding calculation of the access object sequence of the access objects of the user is implemented based on the recommendation object prediction model obtained by the cyclic neural network training, and the obtained coding vector takes into account the long-term historical interest and the short-term historical interest of the user, so as to improve the accuracy of the target recommendation object finally pushed to the user.

After proposing the object recommendation method described in the above embodiment, the inventor found that, although this method solves the problem of the prior-art ItemCF-based object recommendation method, the number of candidate access objects in the candidate set is often very large during similarity calculation. In particular, for current instant messaging applications, the number of access objects output by the application platform is on the order of ten million, and as the user scale of the application platform grows and the sources of access objects expand, the candidate set will only become larger, reaching the order of one hundred million or even one billion in the future. If the similarity were calculated against the candidate word vector of every candidate access object, an online system could not perform the calculation in real time at all. It can be seen that, in the object recommendation method proposed in the above embodiment, the enormous size of the candidate set is an important implementation problem.

In view of this problem, the inventor proposes that a plurality of candidate access objects are screened out according to a certain sampling strategy for the access objects that can be output by the application platform to form a candidate set.

Therefore, in order to further improve the prediction accuracy, the inventor proposes an improvement on the above scheme. Taking articles read by the user as the access objects for analysis: each article is composed of keywords, and although the number of articles output by the application platform keeps increasing, the set of keywords that make up the articles is relatively stable and usually does not change greatly as the user scale grows and article sources expand. Therefore, in this embodiment, the encoding calculation performed on the access object sequence (here, the article sequence) of the user in the foregoing embodiment may instead be performed on a keyword sequence. For the specific implementation process, refer to the method described in the following embodiment, although the implementation is not limited thereto.

Referring to fig. 5 and fig. 6, a flow chart of another object recommendation method provided in the embodiment of the present invention is schematically illustrated, and the method may still be applied to a server, and specifically may include the following steps:

Step S301, acquiring a plurality of pieces of historical access data of a user;

Step S302, obtaining a keyword cluster corresponding to a corresponding access object based on each historical access data;

In combination with the above analysis of the historical access data, this embodiment can learn from the obtained historical access data how the user accessed the objects output by the application platform at each past time, that is, which objects were accessed and in what order. After the titles or contents of the objects accessed by the user on the application platform are obtained, the corresponding keywords can be extracted from them to generate the corresponding keyword clusters.

For example, assume that the object sequence of user U2 is [x1, x2, x3], that is, within the effective duration, user U2 accessed three objects on the application platform. The object sequence may be formed by the corresponding object IDs, and the keywords extracted from the access object corresponding to each object ID may be denoted as Tagnn, where n is an integer.

After obtaining the keyword clusters corresponding to the access objects, the present embodiment may generate an access object-keyword mapping relationship shown in table 1 below, so as to facilitate subsequent direct query. As can be seen, in step S302, the object identifier of each access object can be obtained by using each historical access data, and then, the access object-keyword mapping relationship is directly queried to obtain the keywords included in the corresponding access object, so as to obtain the keyword cluster corresponding to each access object. It should be noted that the expression of the access object-keyword mapping relationship is not limited to the form shown in table 1 below.

TABLE 1 Access object-keyword mapping relationship

Object identification	Keywords
x1	Tag11, Tag12
x2	Tag21, Tag22, Tag23
x3	Tag31, Tag32

Step S303, forming a keyword sequence by the keyword cluster corresponding to each access object;

Optionally, in order to better represent the boundary information of the access objects, in this embodiment a virtual head keyword and a virtual tail keyword, recorded as head-tag and tail-tag respectively, may be added to the keyword cluster corresponding to each access object. The head-tag and tail-tag added to every keyword cluster may have the same content, which remains fixed once determined; this embodiment does not limit the content represented by the head-tag and the tail-tag.

Based on this, referring to fig. 7, and still taking the access object sequence [x1, x2, x3] of user U2 and the access object-keyword mapping relationship shown in table 1 as an example, the keyword clusters corresponding to the access objects are [Tag11, Tag12], [Tag21, Tag22, Tag23] and [Tag31, Tag32]. Each keyword cluster can be understood as the tag sequence of an article, that is, each keyword cluster can be mapped to at least one access object, such as access object 1, access object 2, etc., as shown in fig. 7. In this embodiment, the keyword clusters may be used as sequence elements, ordered by the access time of the access object corresponding to each keyword cluster, to generate the keyword sequence [head-tag, Tag11, Tag12, tail-tag, head-tag, Tag21, Tag22, Tag23, tail-tag, head-tag, Tag31, Tag32, tail-tag].

Therefore, the sequence input to the model for the historical access data of user U2 on the application platform changes from [x1, x2, x3] to [head-tag, Tag11, Tag12, tail-tag, head-tag, Tag21, Tag22, Tag23, tail-tag, head-tag, Tag31, Tag32, tail-tag].
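The conversion from an access object sequence to a keyword sequence can be sketched as follows, using the mapping of table 1. The literal strings chosen for the head-tag and tail-tag, and the function name, are assumptions; the embodiment only requires that their content be fixed once determined.

```python
from typing import Dict, List

HEAD_TAG = "head-tag"  # assumed literal for the virtual head keyword
TAIL_TAG = "tail-tag"  # assumed literal for the virtual tail keyword


def build_keyword_sequence(
    object_sequence: List[str], object_to_keywords: Dict[str, List[str]]
) -> List[str]:
    """Expand each accessed object, in access order, into its delimited keyword cluster."""
    sequence: List[str] = []
    for object_id in object_sequence:
        sequence.append(HEAD_TAG)
        sequence.extend(object_to_keywords.get(object_id, []))
        sequence.append(TAIL_TAG)
    return sequence


table_1 = {
    "x1": ["Tag11", "Tag12"],
    "x2": ["Tag21", "Tag22", "Tag23"],
    "x3": ["Tag31", "Tag32"],
}
keyword_sequence = build_keyword_sequence(["x1", "x2", "x3"], table_1)
```

The delimiters let the model distinguish where one access object ends and the next begins, even though the input is a flat list of keywords.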

Step S304, inputting the keyword sequence into a recommended object prediction model for coding calculation to obtain a coding vector of the user access object;

In this embodiment, the keyword sequences corresponding to a plurality of sample users may be obtained as the training data for model training; model training is then performed on the training data based on the recurrent neural network to obtain the recommended object prediction model used in step S304. For the model training process, refer to the description of the corresponding part of the above embodiment.

The encoding process after the keyword sequence is input into the recommended object prediction model is similar to the encoding process of the access object sequence described above; the difference is that the model input changes from the access objects of the above embodiment to the keywords of the access objects, so the encoding process is not described in detail here. Accordingly, the encoding vector obtained in this embodiment can be used to predict the keywords of the access objects that the user is interested in, but cannot directly predict the access objects themselves.

Step S305, similarity calculation is carried out on the coding vector and candidate word vectors corresponding to the candidate keywords;

Different from step S206 in the above optional embodiment, in this embodiment the similarity with the encoding vector is calculated for the candidate word vectors corresponding to candidate keywords rather than those corresponding to candidate access objects. Accordingly, the elements of the candidate set in this embodiment are no longer candidate access objects but candidate keywords.

When this embodiment generates the candidate set, the selected candidate keywords may be keywords with a high access frequency on the application platform, keywords related to hot topics on social networks, and the like; this embodiment does not limit how the candidate keywords are selected. The similarity between vectors is calculated in a manner similar to that described in the above embodiments, for example using cosine similarity, but is not limited thereto; the implementation of step S305 is not repeated here.

Step S306, obtaining candidate keywords corresponding to a first number of candidate word vectors with highest similarity;

The specific value of the first number is not limited in this embodiment and may be set empirically or experimentally. Alternatively, a similarity criterion, that is, a preset similarity threshold, may be set in advance for screening the candidate keywords, in which case step S306 becomes: obtaining the candidate keywords corresponding to the candidate word vectors whose similarity reaches the preset similarity threshold. The candidate keywords may of course also be screened in other manners, which are not limited to those given here.

Step S307, forming a candidate keyword cluster by the acquired first number of candidate keywords;

It should be noted that step S307 is included for convenience of describing the object recommendation method provided in this embodiment. In practical applications, the screened candidate keywords may be used directly in the subsequent steps, so step S307 does not necessarily need to be performed separately, or it is implicitly performed during step S306.

Step S308, using a candidate keyword contained in the candidate keyword cluster and the hidden layer output at the previous moment in the recommended object prediction model as the hidden layer input at the current moment, and continuing to perform coding calculation;

In this embodiment, the similarity calculation process may be regarded as a decoding process, and a multiple-decoding manner may be adopted to obtain a plurality of candidate keyword sequences from which the recommendation objects of the user are obtained.

Optionally, this embodiment may implement the multiple decoding by using the seq2seq technique. seq2seq is in effect a recurrent neural network with an Encoder-Decoder structure, whose input is a sequence and whose output is also a sequence. The Encoder converts a variable-length signal sequence into a fixed-length vector representation, that is, it encodes the input into a vector; the Decoder converts this fixed-length vector into a variable-length target signal sequence, that is, it predicts possible output objects from the encoding vector.

Based on this, after the candidate keyword cluster is obtained, this embodiment may select one of its candidate keywords arbitrarily, select the candidate keywords one by one in sequence, or select the candidate keyword with the highest similarity, and so on, and use it as model input to continue the encoding calculation. That is, the selected candidate keyword is used as the current input and, together with the hidden-layer output of the previous time step in the recommended object prediction model, the encoding calculation continues to obtain a new encoding vector. This is similar to the hidden-layer calculation performed as each sequence element is input in turn into the recommended object prediction model, as described in the above embodiment, and is not detailed again here.

Step S309, using the obtained new code vector as the code vector of the user access object, and returning to step S305;
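The loop formed by steps S305 to S309 can be sketched as below. This is an illustration under several assumptions: `step` is a hypothetical callable standing in for the hidden-layer update of the recommended object prediction model, cosine similarity is used for ranking, and the keyword fed back at each round is the one with the highest similarity (one of the options named above). The parameters `k` and `t` play the roles of the first number and the second number.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def decode_keyword_clusters(
    code_vector: Vector,
    candidate_vectors: Dict[str, Vector],
    step: Callable[[Vector, Vector], Vector],
    k: int,
    t: int,
) -> List[List[str]]:
    """Run t similarity rounds; each round keeps the k most similar candidate keywords."""
    clusters: List[List[str]] = []
    hidden = code_vector
    for round_index in range(t):
        ranked = sorted(
            candidate_vectors,
            key=lambda kw: cosine(hidden, candidate_vectors[kw]),
            reverse=True,
        )
        cluster = ranked[:k]
        clusters.append(cluster)
        if round_index < t - 1 and cluster:
            # Feed the top keyword back as the current input to get a new encoding vector.
            hidden = step(hidden, candidate_vectors[cluster[0]])
    return clusters
```

The result is a list of t clusters of k keywords each, which is the input expected by the sequence-generation step that follows.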

Step S310, detecting that the similarity calculation times reach a second number, and generating a first number of candidate keyword sequences by using the same-dimension candidate keywords in the formed second number of candidate keyword clusters;

In this embodiment, the number of encoding iterations, that is, the value of the second number, is not limited. It may be set according to experience or experiment, or determined according to the similarity of the candidate keywords obtained in the current iteration, and so on. Whether to continue encoding may also be decided by whether the formed candidate keyword cluster meets a preset condition, for example whether the similarity of the candidate keywords in the cluster reaches a certain threshold, or whether the similarity of the candidate keywords obtained in successive iterations tends to be stable.

As described above, in a practical application of this embodiment, assume that each decoding calculation, that is, each similarity calculation, screens out the K candidate keywords with the highest similarity, so the first number is K. After decoding T times in the above manner, so the second number is T, T candidate keyword clusters of length K are obtained. Candidate keywords of the same dimension may then be extracted in turn from these clusters to generate the corresponding candidate keyword sequences. For example, the K candidate keywords obtained each time are taken as one column of a matrix, giving T columns of data, and the candidate keywords in the same row form one predicted candidate keyword sequence, denoted [Tagm1, Tagm2, Tagm3, …, TagmT], m = 1, 2, 3, …, K. It can be seen that this embodiment obtains K candidate keyword sequences of length T.
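The row-wise extraction amounts to transposing the T clusters, as the following sketch shows. The function names are illustrative; `all_combination_sequences` corresponds to the alternative of taking one keyword from each cluster in every combination, which yields K to the power T sequences rather than K.

```python
from itertools import product
from typing import List


def same_position_sequences(clusters: List[List[str]]) -> List[List[str]]:
    """Turn T clusters of K keywords into K sequences of length T."""
    if not clusters:
        return []
    k = len(clusters[0])
    if any(len(cluster) != k for cluster in clusters):
        raise ValueError("every cluster must contain the same number of keywords")
    return [list(row) for row in zip(*clusters)]


def all_combination_sequences(clusters: List[List[str]]) -> List[List[str]]:
    """Take one keyword from each cluster in every possible combination."""
    if not clusters:
        return []
    return [list(combo) for combo in product(*clusters)]
```

The first function keeps the number of sequences small; the second trades more sequences for broader coverage of keyword combinations.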

It should be noted that the way candidate keyword sequences are generated in this embodiment is not limited to the same-dimension extraction described above. One candidate keyword may instead be extracted from each candidate keyword cluster to generate a sequence, which yields a much larger number of candidate keyword sequences. The subsequent steps are processed in the same way and are not described separately in this embodiment.
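One possible reading of this alternative is to enumerate every combination that takes exactly one keyword from each cluster, which gives up to K to the power T sequences. The sketch below illustrates that reading only; the function name and the sample keywords are hypothetical.

```python
from itertools import product
from typing import Iterator, List, Tuple


def cross_cluster_sequences(clusters: List[List[str]]) -> Iterator[Tuple[str, ...]]:
    """Yield every sequence that picks one keyword from each cluster, in cluster order."""
    return product(*clusters)


# Hypothetical data: T = 3 clusters, K = 2 keywords each.
clusters = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
cross = list(cross_cluster_sequences(clusters))
```

With K = 2 and T = 3 this produces 8 sequences, compared with 2 under same-dimension extraction, which is why the text notes that a large number of candidate keyword sequences is obtained.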

Step S311, obtaining a recommendation object of the user based on the candidate keywords included in the first number of candidate keyword sequences.

Optionally, in this embodiment, an inverted index from the keywords of the objects that the application platform can output to the access objects may be established in advance; the specific construction method is not described in detail in this embodiment. An inverted index, also called a postings file or inverted file, is used in full-text search to store a mapping from a word to its storage locations in a document or a group of documents; here it stores the mapping from a keyword to access objects.

Based on this, after the candidate keyword sequences [Tagm1, Tagm2, Tagm3, …, TagmT], m = 1, 2, 3, …, K, are obtained, the candidate access objects containing each candidate keyword are pulled in turn from the access objects that the application platform can output; that is, the at least one candidate access object to which each candidate keyword is mapped is obtained, so as to generate the inverted list of each candidate keyword. For example, a candidate access object x2 containing Tagm1 is pulled; x2 may also contain other candidate keywords, that is, several candidate keywords may be mapped to the same candidate access object. The number of times each candidate access object is mapped is then counted, and the recommended objects of the user are screened from the mapped candidate access objects according to these counts. The specific implementation process is not limited.

Of course, if the word order of the candidate keyword sequences is preserved, the candidate keyword sequences may be spliced according to that word order in this embodiment. The similarity between the spliced candidate keyword sequence and the keyword sequence corresponding to each candidate access object may then be calculated by a text similarity calculation, and a preset number of candidate access objects with the highest similarity are selected as the recommendation objects of the user.

It can be seen that the implementation manner of step S311 is not limited to a certain manner, and the recommendation object of the user can be obtained by any one manner given above, and is not limited to the two manners described above.

In summary, in this embodiment, the training data of the model is formed from the keywords of the objects accessed by users, and the keyword sequence of the user's accessed objects is encoded by the recommended object prediction model trained with a recurrent neural network, yielding an encoding vector that predicts the keywords the user is interested in. Owing to the characteristics of the recurrent neural network, this encoding takes into account both the long-term and the short-term historical interests of the user, which improves the accuracy of the target recommended objects finally pushed to the user.

In addition, because the candidate word vectors used in the similarity calculation of this embodiment are word vectors of keywords, their number is much smaller than the number of word vectors of access objects, and the set of candidate word vectors to be calculated changes little over time. This reduces the amount of calculation and improves the accuracy and stability of object recommendation.

Optionally, the following two ways are provided for the implementation method of step S311 in the above alternative embodiment, but not limited to the two ways given below:

the first method is as follows:

referring to the flow diagrams shown in fig. 6 and 8, the method may include the following steps:

Step A1, obtaining a pre-built inverted index that maps keywords to objects;

this embodiment does not limit how the inverted index that maps the keywords in the current application platform to objects is constructed.

Step A2, inquiring the inverted index to obtain an inverted list of each candidate keyword in each candidate keyword sequence;

wherein the inverted list characterizes the at least one candidate recommendation object to which the candidate keyword is mapped. As described above, for each candidate keyword, the candidate access objects that contain at least that candidate keyword are pulled back from the access objects that the application platform can output; in other words, the candidate access objects to which each candidate keyword is mapped are obtained, so as to generate the inverted list of each candidate keyword. The inverted list makes it quick to find which candidate access objects contain a given candidate keyword.

Step A3, counting the number of times each candidate recommendation object is mapped in each candidate keyword sequence, based on the obtained inverted list of each candidate keyword;

As can be seen from the above analysis, for a candidate access object pulled back in this embodiment, the more candidate keywords it contains (that is, candidate keywords in the obtained candidate keyword sequences), in other words the more times it is pulled back, the more likely it is to be recommended to the user as a recommended object. Therefore, this embodiment may determine a score for each candidate access object according to a certain rule based on the number of times it is pulled back, and then carry out the subsequent steps according to these scores. In this embodiment, the larger the score of a candidate access object, the greater the probability that the user is interested in it and the greater the probability that it is taken as a recommended object.

Optionally, in practical applications, this embodiment may also compute the keyword coverage of each candidate access object over each candidate keyword sequence; the larger the keyword coverage, the greater the probability that the corresponding candidate access object becomes a recommended object. Subsequent processing may be performed directly on this basis, or a score may be derived for each candidate access object to represent the probability that it becomes a recommended object. The specific method for calculating the keyword coverage is not limited in this embodiment.

Step A4, screening the recommended objects of the user from the obtained multiple candidate recommended objects based on the statistical result.

In this embodiment, since the probability that the candidate recommendation object with the greater number of mappings is filtered as the recommendation object is greater, that is, the greater the number of candidate keywords corresponding to the candidate access object is, the greater the probability that the candidate access object is recommended to the user is, in this embodiment, the p candidate objects with the greatest number of corresponding candidate keywords can be filtered as the recommendation object of the user.

Specifically, the present embodiment may rank the number of times that each candidate access object is mapped, and select p candidate access objects as recommendation objects according to the order of the number of times that the candidate access objects are mapped from large to small, but is not limited to this implementation.
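Steps A1 to A4 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation of this embodiment: the inverted index is assumed to be a dictionary from keyword to a list of object identifiers, every occurrence of a keyword in any sequence counts as one mapping, and ties are broken by object identifier. The name `recommend_by_mapping_count` and the sample data are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def recommend_by_mapping_count(
    inverted_index: Dict[str, List[str]],
    sequences: List[List[str]],
    p: int,
) -> List[str]:
    """Return the p objects that the candidate keywords map to most often."""
    counts: Counter = Counter()
    for sequence in sequences:
        for keyword in sequence:
            # Step A2: look up the inverted list of this candidate keyword.
            for obj in inverted_index.get(keyword, ()):
                # Step A3: count how many times each object is mapped.
                counts[obj] += 1
    # Step A4: sort by count (descending), then by identifier for determinism.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [obj for obj, _ in ranked[:p]]


# Step A1: a hypothetical pre-built inverted index.
inverted_index = {
    "nba": ["x1", "x2"],
    "finals": ["x2", "x3"],
    "trade": ["x2"],
}
sequences = [["nba", "finals"], ["nba", "trade"]]
recommended = recommend_by_mapping_count(inverted_index, sequences, p=2)
```

In this toy data x2 is mapped four times, x1 twice, and x3 once, so the two objects returned are x2 and x1.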

The second method comprises the following steps:

referring to the flow diagrams shown in fig. 6 and 9, the method may include the following steps:

step B1, splicing the candidate keywords contained in the first number of candidate keyword sequences according to the word sequence of each candidate keyword in each candidate keyword sequence;

step B2, obtaining keyword sequences corresponding to a plurality of candidate access objects respectively, wherein the candidate access objects at least comprise one candidate keyword in any candidate keyword sequence;

it should be noted that, in practical application of the present embodiment, step B2 may be executed before step B1, and is not limited to this step sequence of the present embodiment.

Optionally, when the keyword sequences of the candidate access objects are too many, in order to reduce the similarity calculation workload, the embodiment may filter the candidate access objects in an inverted list manner, for example, filter the candidate access objects that at least include the candidate keywords in one candidate keyword sequence, so as to complete the subsequent similarity calculation.

Step B3, performing text similarity calculation on the candidate keyword sequence obtained by splicing and the keyword sequence corresponding to each candidate access object;

in practical application of this embodiment, the keyword sequence obtained after the concatenation may be regarded as a text, and therefore, in this embodiment, a text similarity calculation mode may be adopted to calculate similarities between the candidate keyword sequence and the keyword sequences corresponding to the candidate access objects, so as to screen out the recommendation objects. The embodiment does not describe the specific implementation process of the text similarity calculation method in detail.

Step B4, screening the recommended objects of the user from the candidate access objects based on the text similarity calculation result.

The probability that the candidate object with higher similarity is screened as the recommended object is higher, so that in the embodiment, a preset number of candidate access objects with highest similarity may be screened as the recommended objects from the candidate access objects, and the candidate access objects with similarity reaching a preset threshold may also be screened as the recommended objects, and the like, and the specific implementation process of step B4 is not limited in the embodiment.
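Steps B1 to B4 can be sketched as follows. The embodiment leaves the text similarity measure open, so the bag-of-words cosine similarity used here is an assumption chosen for brevity; it ignores word order, and an order-aware measure could be substituted. The function names and sample data are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def cosine_text_similarity(a_tokens: List[str], b_tokens: List[str]) -> float:
    """Cosine similarity between two token lists treated as term-count vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend_by_text_similarity(
    sequences: List[List[str]],
    object_keywords: Dict[str, List[str]],
    n: int,
) -> List[str]:
    # Step B1: splice the candidate keyword sequences in their word order.
    spliced = [keyword for sequence in sequences for keyword in sequence]
    # Steps B2 and B3: score the keyword sequence of every candidate object.
    scored = [
        (cosine_text_similarity(spliced, keywords), obj)
        for obj, keywords in object_keywords.items()
    ]
    # Step B4: keep the n most similar objects (ties broken by identifier).
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [obj for _, obj in scored[:n]]


sequences = [["nba", "finals"], ["nba", "trade"]]
object_keywords = {
    "x1": ["nba", "draft"],
    "x2": ["nba", "finals", "trade"],
    "x3": ["cooking"],
}
recommended = recommend_by_text_similarity(sequences, object_keywords, n=2)
```

A threshold-based variant of step B4 would filter `scored` by a minimum similarity instead of slicing the first n entries.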

Based on the recommended objects of the user obtained in the foregoing embodiments, in practical applications these recommended objects may serve as candidates for the target recommended object; that is, after the recommended objects of the user are obtained, they may be further screened with additional logic. In practice, the steps of the object recommendation method described in the foregoing embodiments amount to pulling data according to the profile information of a specific user along dimensions such as precise personalization, general personalization, and popularity, that is, pulling the recommended objects that the user may be interested in. The number of recommended objects obtained at this point is usually large, so they can be screened further.

Fig. 10 is a schematic flow diagram of another object recommendation method provided in an embodiment of the present invention. For the process of obtaining the recommended objects of the user in this method, reference may be made to the description of the foregoing embodiments, which is not repeated here; only the processing after the recommended objects of the user are obtained is described. The method may further include the following steps:

step S401, according to a specific rule, preliminarily screening the recommended objects of the user to obtain preliminarily selected recommended objects;

the specific rule may be determined by factors in the aspects of relevance, timeliness, regions, diversity and the like of the user access object, and the specific content included in the specific rule is not limited in this embodiment.

Fig. 11 is a schematic diagram of a recommendation system in an application scenario, which may be, for example, an information presentation platform in an instant messaging client that outputs information. As shown in fig. 11, the recommendation system may include several functional modules, such as recall logic, primary selection logic, and ranking logic (i.e., Rank in fig. 11). The recall logic implements the process of obtaining the recommended objects of the user described in the foregoing embodiments, the primary selection logic implements step S401, and the ranking logic implements the subsequent ranking of the primarily selected recommended objects.

Step S402, obtaining coding vectors corresponding to all the initially selected recommended objects and coding vectors of the currently accessed objects;

In a practical application of this embodiment, besides directly calculating the similarity between each primarily selected recommended object and the currently browsed object and ranking the primarily selected recommended objects by that similarity, this embodiment may also obtain the coding vectors of the objects in the manner described above, so that the primarily selected recommended objects are ranked by the similarity between vectors.

Based on this, in this embodiment, the encoding vector corresponding to each initially selected recommended object may be obtained in the manner described above, for example, each initially selected recommended object is used as a sequence element to generate an access object sequence, and the access object sequence is sequentially input to the recommended object prediction model to perform encoding calculation, so as to obtain a corresponding encoding vector.

Step S403, similarity calculation is carried out on the coding vector corresponding to each initially selected recommended object and the coding vector of the current access object;

optionally, in this embodiment, a cosine similarity calculation method may be adopted to obtain the similarity between vectors, and the specific implementation process may refer to the description of the corresponding part in the above embodiment, but the method for calculating the similarity between vectors is not limited to this implementation method.

Step S404, selecting a preset number of primary selection recommendation objects with highest similarity as target recommendation objects;

It should be noted that the way the target recommended objects are screened from the plurality of primarily selected recommended objects is not limited to the manner described in this embodiment. Other ways are similar to the methods for screening the recommended objects of the user from the plurality of candidate access objects described in the foregoing embodiments and are not listed here.

Step S405, the target recommendation object is sent to a client of the user for displaying.
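Steps S403 and S404 can be illustrated with a plain cosine similarity ranking. The sketch assumes the coding vectors have already been produced by the recommended object prediction model and are available as lists of floats; the function names and the two-dimensional sample vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_target_objects(
    current_vec: Sequence[float],
    candidate_vecs: Dict[str, Sequence[float]],
    n: int,
) -> List[str]:
    # Step S403: similarity between each primarily selected object and the
    # currently accessed object; step S404: keep the n most similar ones.
    ranked = sorted(
        candidate_vecs,
        key=lambda obj: cosine(candidate_vecs[obj], current_vec),
        reverse=True,
    )
    return ranked[:n]


current_vec = [1.0, 0.0]
candidate_vecs = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [0.5, 0.5]}
targets = select_target_objects(current_vec, candidate_vecs, n=2)
```

In this toy data object a is almost parallel to the current vector, c lies at 45 degrees, and b is orthogonal, so a and c are selected.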

In summary, in the embodiment, according to the manner described in the above embodiment, the obtained recommendation object of the user can not only accurately locate the interest transition and interest accumulation of the user, but also give consideration to the long-term interest and the short-term interest of the user, and the target recommendation object of the user screened from such recommendation objects can better meet the current requirements of the user, so that the application server adopting the object recommendation method of the embodiment can better provide recommendation service for the user.

Taking the recommendation system shown in fig. 11 as an example, the object recommendation method provided in the foregoing embodiments is described below in a concrete scenario. Here the access objects are taken to be articles output by the application platform, and the description first uses object sequences as the training data. The server of the application may obtain the article sequences of multiple users on the application platform, use them to form training data, and perform model training based on a recurrent neural network to obtain the recommended object prediction model. The server may then treat any user of the application as a target user, obtain the article sequence of the target user in the above manner, and input it into the recommended object prediction model to obtain an encoding vector that predicts the articles the user may be interested in. Similarity calculation is performed between this encoding vector and the word vectors of multiple preset candidate articles, and the candidate articles with the highest similarity are selected as the recommended articles of the target user, that is, the recall result produced by the recall logic in fig. 11.

Certainly, in the model training process, the keyword sequences of multiple users may instead be used to form the training data. In that case, when predicting the recommended articles of the target user, the keyword sequence of the target user is obtained and input into the recommended object prediction model, and the output encoding vector predicts keywords the user may be interested in. The similarity between the encoding vector and the word vector of each candidate keyword is then calculated (which may be regarded as a decoding process), and the K candidate keywords with the highest similarity are selected. After multiple decodings, multiple keyword sequences of length T are generated, and the recommended articles of the target user, that is, the article recall result of the target user, are then obtained by either of the two text recall approaches described above, namely the inverted index or the text similarity calculation.

The recalled recommended articles are then screened by the primary selection logic to obtain a plurality of primarily selected articles, which are ranked by the Rank logic in the manner described above to obtain the target recommended articles; these are displayed on the interface of the client used by the target user.
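The three stages named in fig. 11 can be viewed as a simple composition of functions. The sketch below only shows that composition; the three callables are toy stand-ins, not the recall, primary selection, or Rank logic of this embodiment, and all names and data are hypothetical.

```python
from typing import Callable, List


def recommend_articles(
    user_id: str,
    recall: Callable[[str], List[str]],
    primary_select: Callable[[List[str]], List[str]],
    rank: Callable[[List[str]], List[str]],
    top_n: int,
) -> List[str]:
    """Chain recall -> primary selection -> ranking and keep the top_n results."""
    recalled = recall(user_id)               # recall logic
    shortlisted = primary_select(recalled)   # primary selection logic
    return rank(shortlisted)[:top_n]         # Rank logic


# Toy stand-ins used only to make the sketch executable.
def toy_recall(user_id: str) -> List[str]:
    return ["a1", "a2", "a3", "a4"]


def toy_primary_select(articles: List[str]) -> List[str]:
    # Pretend "a3" fails a rule such as timeliness or region.
    return [article for article in articles if article != "a3"]


def toy_rank(articles: List[str]) -> List[str]:
    return sorted(articles, reverse=True)


target_articles = recommend_articles(
    "u42", toy_recall, toy_primary_select, toy_rank, top_n=2
)
```

Passing the stages as callables keeps each one replaceable, which matches the text's point that the recall stage may use either article sequences or keyword sequences.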

In practical application, the above processing procedure may be implemented by a server online, and when a user uses a client to communicate with the server, the server may directly feed back a target recommendation article associated with the user identifier to the client for display according to the user identifier of the user, but is not limited to this implementation.

Fig. 12 is a schematic structural diagram of an object recommendation apparatus provided in this embodiment. The apparatus may be applied in a server and may include, but is not limited to, the following components:

a sequence obtaining module 11, configured to obtain a user access sequence;

wherein the user access sequence is generated based on the user accessing the object output by the application platform;

the coding calculation module 12 is used for inputting the user access sequence into a recommended object prediction model for coding calculation to obtain a coding vector of the user access object;

the recommended object prediction model is obtained by training, based on a recurrent neural network, on the user access sequences corresponding to a plurality of sample users. The recurrent neural network comprises a plurality of gated recurrent unit (GRU) layers or a plurality of long short-term memory (LSTM) network layers; for the specific network structures and principles, reference may be made to the descriptions of the corresponding parts of the method embodiments.

A first similarity calculation module 13, configured to perform similarity calculation on the coding vector and each candidate word vector;

and the recommended object selection module 14 is configured to obtain a recommended object of the user based on the similarity calculation result.

Optionally, as shown in fig. 13, when the content of the training data required for training the recommended object prediction model differs, the content of the sequence elements of the obtained user access sequence differs accordingly, and so does the manner of obtaining the recommended objects of the user.

Based on this, the sequence acquiring module 11 may include:

a first data acquisition unit 1110 configured to acquire pieces of historical access data of a user, the historical access data being generated based on an access operation of the user to an application platform output object;

a first sequence forming unit 1111, configured to form an access object sequence from object identifiers respectively included in the plurality of pieces of historical access data.
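The work of units 1110 and 1111 can be sketched as follows, assuming each piece of historical access data carries an object identifier and an access time and that the sequence follows access order. The field names `object_id` and `timestamp` are hypothetical.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, int]]


def build_access_object_sequence(history: List[Record]) -> List[str]:
    """Order the historical access records by time and keep the object identifiers."""
    ordered = sorted(history, key=lambda record: record["timestamp"])
    return [str(record["object_id"]) for record in ordered]


# Hypothetical historical access data, deliberately out of order.
history = [
    {"object_id": "x7", "timestamp": 30},
    {"object_id": "x2", "timestamp": 10},
    {"object_id": "x5", "timestamp": 20},
]
access_sequence = build_access_object_sequence(history)
```

The resulting list is what would be fed, element by element, into the recommended object prediction model.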

In this case, as shown in fig. 13, the apparatus provided in this embodiment may further include:

a candidate access object obtaining module 15, configured to obtain a plurality of candidate access objects;

a first word vector obtaining module 16, configured to input each candidate access object into the language model to obtain a corresponding candidate word vector.

Accordingly, the first similarity calculation module 13 may be specifically configured to perform similarity calculation on the encoding vector and a candidate word vector corresponding to a candidate access object.

For the similarity calculation method between vectors, reference may be made to the description of the corresponding parts of the above method embodiments.

The recommended object selection module 14 may be specifically configured to select a preset number of candidate access objects with the highest similarity as recommended objects of the user; or selecting a candidate access object with the similarity reaching a preset threshold as a recommendation object of the user, and the like.

As another embodiment of the present application, as shown in fig. 14, the sequence acquiring module 11 may include:

a second data obtaining unit 1120 for obtaining a plurality of pieces of historical access data of the user;

a keyword cluster obtaining unit 1121, configured to obtain, by using each historical access data, a keyword cluster corresponding to a corresponding access object;

a second sequence forming unit 1122, configured to form a keyword sequence from the keyword cluster corresponding to each access object.

In this embodiment, the implementation process of obtaining the keyword sequence may refer to the description of the corresponding part of the above method embodiment.
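One way units 1121 and 1122 might cooperate is sketched below. It assumes the keyword cluster of each accessed object is available from a lookup table and that the keyword sequence is formed by concatenating the clusters in access order; both assumptions, and the names `object_keywords` and `build_keyword_sequence`, are illustrative only.

```python
from typing import Dict, List


def build_keyword_sequence(
    access_sequence: List[str],
    object_keywords: Dict[str, List[str]],
) -> List[str]:
    """Concatenate the keyword clusters of the accessed objects in access order."""
    keyword_sequence: List[str] = []
    for object_id in access_sequence:
        # Objects without known keywords contribute nothing to the sequence.
        keyword_sequence.extend(object_keywords.get(object_id, []))
    return keyword_sequence


# Hypothetical keyword clusters for three accessed objects.
object_keywords = {
    "x2": ["nba", "finals"],
    "x5": ["trade"],
    "x7": ["draft", "nba"],
}
keyword_sequence = build_keyword_sequence(["x2", "x5", "x7"], object_keywords)
```

Repeated keywords such as "nba" are kept, since repetition across accessed objects may itself reflect accumulated interest.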

At this time, the apparatus may further include:

the candidate keyword acquisition module is used for acquiring a plurality of candidate keywords;

and the second word vector acquisition module is used for respectively inputting the candidate keywords into the language model to obtain corresponding candidate word vectors.

Accordingly, the first similarity calculation module 13 may be specifically configured to perform similarity calculation on the coding vector and the candidate word vector corresponding to each candidate keyword.

In practical application, the number of keywords of the access objects output by the application platform is relatively stable and does not grow rapidly as the users of the application platform and the sources of output objects increase, so the amount of similarity calculation in this embodiment is relatively small and the obtained recommended objects of the user are relatively stable.

Optionally, in the case where the candidates are candidate keywords, as shown in fig. 14, the recommended object selecting module 14 may include:

a candidate keyword obtaining unit 141, configured to obtain candidate keywords corresponding to a first number of candidate word vectors with highest similarity;

a keyword cluster generating unit 142, configured to form a candidate keyword cluster from the acquired first number of candidate keywords;

the coding calculation unit 143 is configured to continue coding calculation by using a candidate keyword included in the candidate keyword cluster and a previous hidden layer output in the recommended object prediction model;

a similarity calculation terminating unit 144, configured to use the obtained new coding vector as a coding vector of a user access object, and execute the step of performing similarity calculation on the coding vector and each candidate word vector until the similarity calculation frequency reaches a second number, or a formed candidate keyword cluster meets a preset condition;

a keyword sequence generating unit 145, configured to generate a first number of candidate keyword sequences by using the same-dimension candidate keywords in the second number of candidate keyword clusters;

a recommended object obtaining unit 146, configured to obtain a recommended object of the user based on the candidate keywords included in the first number of candidate keyword sequences.
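The loop formed by units 141 to 144 can be sketched as follows. The model itself is abstracted behind two hypothetical callables: `top_k` returns the first number of candidate keywords most similar to a coding vector, and `encode_step` continues the coding calculation from a candidate keyword cluster and the previous hidden layer output. Their signatures are assumptions for illustration and are not defined by this embodiment.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Vector = Sequence[float]


def decode_keyword_clusters(
    initial_vec: Vector,
    initial_hidden: Any,
    top_k: Callable[[Vector], List[str]],
    encode_step: Callable[[List[str], Any], Tuple[Vector, Any]],
    max_steps: int,
    is_done: Optional[Callable[[List[List[str]]], bool]] = None,
) -> List[List[str]]:
    """Collect candidate keyword clusters until max_steps or a preset condition."""
    if max_steps < 1:
        raise ValueError("max_steps must be at least 1")
    clusters: List[List[str]] = []
    vec, hidden = initial_vec, initial_hidden
    while True:
        # Units 141 and 142: pick the most similar keywords and form a cluster.
        clusters.append(top_k(vec))
        # Unit 144: stop after the second number of passes or on the condition.
        if len(clusters) >= max_steps or (is_done is not None and is_done(clusters)):
            return clusters
        # Unit 143: continue coding from the cluster and the previous hidden output.
        vec, hidden = encode_step(clusters[-1], hidden)


# Toy stand-ins so the sketch runs; they do not model a real network.
def toy_top_k(vec: Vector) -> List[str]:
    step = int(vec[0])
    return [f"kw{step}a", f"kw{step}b"]


def toy_encode_step(cluster: List[str], hidden: int) -> Tuple[Vector, int]:
    return [float(hidden + 1)], hidden + 1


clusters = decode_keyword_clusters([0.0], 0, toy_top_k, toy_encode_step, max_steps=3)
```

The returned clusters are the input to unit 145, which extracts the same-dimension keywords to form the candidate keyword sequences.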

Optionally, the recommended object obtaining unit 146 may specifically include:

the inverted index acquiring subunit is used for acquiring a pre-built inverted index that maps keywords to objects;

the query subunit is used for querying the inverted index to obtain an inverted list of each candidate keyword in each candidate keyword sequence;

wherein the inverted list is used for characterizing at least one candidate recommendation object mapped by the candidate keyword.

the statistics subunit is used for counting the number of times each candidate recommendation object is mapped in each candidate keyword sequence, based on the obtained inverted list of each candidate keyword;

and the first screening subunit is used for screening the recommendation object of the user from the acquired multiple candidate recommendation objects based on the statistical result.

Wherein the candidate recommended objects mapped more frequently have a higher probability of being filtered as recommended objects.

As another alternative embodiment, the recommended object obtaining unit 146 may also include:

the splicing subunit is configured to splice the candidate keywords included in the first number of candidate keyword sequences according to the word order of each candidate keyword in each candidate keyword sequence;

the keyword sequence acquisition subunit is configured to acquire keyword sequences corresponding to a plurality of candidate access objects, where each candidate access object at least includes a candidate keyword in any candidate keyword sequence;

the similarity calculation subunit is used for performing text similarity calculation on the candidate keyword sequence obtained by splicing and the keyword sequence corresponding to each candidate access object;

and the second screening subunit is used for screening the recommended objects of the user from the candidate access objects based on the text similarity calculation result.

Wherein the higher the similarity, the higher the probability that the candidate access object is screened as the recommended object.

It should be noted that, with regard to the function implementation process of each functional module or unit in the foregoing device embodiment, reference may be made to the description of the corresponding part of the foregoing method embodiment.

Optionally, on the basis of the foregoing embodiments, as shown in fig. 15, the apparatus may further include:

the preliminary screening module 17 is configured to perform preliminary screening on the recommended objects of the user according to a specific rule to obtain preliminary-selected recommended objects;

a code vector obtaining module 18, configured to obtain a code vector corresponding to each initially recommended object and a code vector of a currently accessed object;

the second similarity calculation module 19 is configured to perform similarity calculation on the coding vector corresponding to each initially selected recommended object and the coding vector of the currently accessed object;

the target recommended object selection module 120 is configured to select a preset number of primarily selected recommended objects with the highest similarity as target recommended objects;

and the target recommendation object sending module 121 is configured to send the target recommendation object to the client of the user for display.

In summary, in this embodiment, a recommended object prediction model is obtained by training, based on a recurrent neural network, on the user access sequences of a plurality of sample users, and is used to encode the user access sequence. This encoding takes into account not only the long-term and short-term historical interests of the user but also the order in which the user accessed objects on the application platform. The recommended objects obtained from the similarity between the resulting encoding vector and each candidate word vector can therefore locate the user's interest transitions and interest accumulation accurately, which addresses the loss of diversity and personalization in recommended objects caused by the existing ItemCF recommendation method.

According to different requirements, the content of training data used for model training, namely the content of a user access sequence, can be flexibly selected, and the flexibility of the object recommendation method is improved.

An embodiment of the present invention further provides a computer device, where a hardware structure of the computer device may be as shown in fig. 16, and the hardware structure of the computer device may include: a communication interface 1, a memory 2, and a processor 3;

in the embodiment of the present invention, the communication interface 1, the memory 2, and the processor 3 may implement mutual communication through a communication bus, and the number of the communication interface 1, the memory 2, the processor 3, and the communication bus may be at least one.

Optionally, the communication interface 1 may be an interface of a communication module, such as an interface of a GSM module;

the processor 3 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention.

The memory 2 may comprise a high-speed RAM memory and may also include a non-volatile memory, such as at least one disk memory.

Wherein, the memory 2 stores a computer program, and the processor 3 calls the computer program stored in the memory 2 to realize the steps of the object recommendation method applied to the computer device;

optionally, the computer program is primarily operable to:

In practical applications of this embodiment, the computer device may be an application server, such as an application server corresponding to various instant messaging clients.

The embodiment of the present invention further provides a storage medium, where the storage medium records a computer program that is suitable for being executed by a processor of a computer device to implement the steps of the object recommendation method applied to the computer device, and the implementation process of the object recommendation method may refer to the description of the corresponding parts of the above method embodiment.

The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device and the computer device disclosed in the embodiments correspond to the method disclosed in the embodiments, they are described relatively briefly; for relevant details, refer to the description of the method.

Those of skill in the art would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. An object recommendation method, characterized in that the method comprises:

performing similarity calculation between the coding vector and each candidate word vector, wherein the candidate word vectors correspond to candidate objects or candidate keywords in a candidate set;

obtaining candidate keywords corresponding to a first number of candidate word vectors with highest similarity;

forming a candidate keyword cluster by the acquired first number of candidate keywords;

taking a candidate keyword contained in the candidate keyword cluster and the hidden layer output at the previous moment in the recommended object prediction model as the hidden layer input at the current moment in the recommended object prediction model, and continuing the coding calculation;

taking the obtained new coding vector as the coding vector of the user access object, and returning to the step of calculating the similarity between the coding vector and each candidate word vector, until the number of similarity calculations reaches a second number or a formed candidate keyword cluster meets a preset condition;

generating a first number of candidate keyword sequences from the candidate keywords of the same dimension in the second number of formed candidate keyword clusters;

and obtaining the recommended object of the user based on the candidate keywords contained in the first number of candidate keyword sequences.
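The iterative steps of claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: cosine similarity, the toy recurrent step `toy_step`, the choice to feed the top-ranked keyword of each cluster back into the model, and all function names are assumptions made for illustration, and the optional "preset condition" for early termination is omitted.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def top_k(code_vec, candidates, k):
    """Return the k candidate keywords most similar to code_vec, best first."""
    ranked = sorted(candidates,
                    key=lambda kw: cosine(code_vec, candidates[kw]),
                    reverse=True)
    return ranked[:k]

def toy_step(x, h):
    """Hypothetical stand-in for one recurrent step of the prediction model."""
    return [math.tanh(0.5 * xi + 0.5 * hi) for xi, hi in zip(x, h)]

def decode(code_vec, candidates, first_number, second_number, step=toy_step):
    """Form second_number clusters of first_number keywords each, then
    read the clusters position by position to obtain keyword sequences."""
    clusters = []
    hidden = code_vec
    for _ in range(second_number):
        cluster = top_k(hidden, candidates, first_number)
        clusters.append(cluster)
        # Assumption: the best keyword of the cluster is fed back, together
        # with the previous hidden output, to continue the coding calculation.
        hidden = step(candidates[cluster[0]], hidden)
    # Keywords at the same position in every cluster form one sequence.
    return [list(seq) for seq in zip(*clusters)]

# Hypothetical candidate word vectors.
candidates = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
sequences = decode([1.0, 0.1], candidates, first_number=2, second_number=3)
```

With these inputs the sketch yields two candidate keyword sequences (the first number), each of length three (the second number).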

2. The method of claim 1, wherein the recurrent neural network comprises a plurality of layers of gated recurrent units, or a plurality of layers of long short-term memory networks.

3. The method according to claim 1 or 2, wherein the user access sequence is an access object sequence, and the obtaining the user access sequence comprises:

acquiring a plurality of pieces of historical access data of a user, wherein the historical access data are generated based on the access operation of the user to an output object of an application platform;

and forming an access object sequence by object identifications respectively contained in the plurality of pieces of historical access data.
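As a minimal sketch of the two steps of claim 3, the following assumes that each piece of historical access data is a dictionary with a hypothetical `object_id` field and a hypothetical `ts` timestamp field, and that the sequence is ordered by access time. The claim itself does not fix the record layout or the ordering.

```python
def build_access_sequence(records):
    """Order access records by time and keep only the object identifiers."""
    ordered = sorted(records, key=lambda r: r["ts"])
    return [r["object_id"] for r in ordered]

# Hypothetical historical access data.
history = [
    {"object_id": "article_7", "ts": 30},
    {"object_id": "article_2", "ts": 10},
    {"object_id": "article_5", "ts": 20},
]
access_sequence = build_access_sequence(history)
```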

4. The method of claim 3, further comprising:

acquiring a plurality of candidate access objects;

and inputting each candidate access object into a language model to obtain a corresponding candidate word vector.

5. The method according to claim 1 or 2, wherein the user access sequence is a keyword sequence, and the obtaining the user access sequence comprises:

acquiring a plurality of pieces of historical access data of a user;

using each piece of historical access data to obtain a keyword cluster of the corresponding access object;

and forming a keyword sequence by the keyword cluster corresponding to each access object.
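A keyword sequence of this kind might be assembled as sketched below. The record fields `object_id` and `ts`, the lookup table `keywords_of`, and the choice to concatenate the keyword clusters in access order are all assumptions made for illustration.

```python
def build_keyword_sequence(records, keywords_of):
    """Concatenate the keyword clusters of accessed objects in access order."""
    sequence = []
    for record in sorted(records, key=lambda r: r["ts"]):
        # Objects without a known keyword cluster contribute nothing.
        sequence.extend(keywords_of.get(record["object_id"], []))
    return sequence

# Hypothetical access data and keyword clusters.
history = [
    {"object_id": "article_2", "ts": 10},
    {"object_id": "article_5", "ts": 20},
]
keywords_of = {
    "article_2": ["nba", "finals"],
    "article_5": ["mvp"],
}
keyword_sequence = build_keyword_sequence(history, keywords_of)
```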

6. The method of claim 5, further comprising:

acquiring a plurality of candidate keywords;

and respectively inputting the candidate keywords into a language model to obtain corresponding candidate word vectors.

7. The method according to claim 1, wherein obtaining the recommended object of the user based on the candidate keywords included in the first number of candidate keyword sequences comprises:

acquiring a constructed inverted index that maps keywords to objects;

querying the inverted index to obtain an inverted list of each candidate keyword in each candidate keyword sequence, wherein the inverted list represents at least one candidate recommendation object to which the candidate keyword maps;

counting, based on the obtained inverted list of each candidate keyword, the number of times each candidate recommendation object is mapped in each candidate keyword sequence;

and screening the recommendation object of the user from the acquired plurality of candidate recommendation objects based on the statistical result, wherein a candidate recommendation object with a higher mapping count has a higher probability of being screened as the recommendation object.
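The inverted-index lookup and counting of claim 7 can be sketched as follows. The function names and sample data are hypothetical, and the sketch simplifies the final screening to a deterministic top-n selection by mapping count, whereas the claim only requires that a higher count gives a higher probability of selection.

```python
from collections import Counter

def build_inverted_index(object_keywords):
    """Map each keyword to the set of objects that contain it."""
    index = {}
    for obj, keywords in object_keywords.items():
        for kw in keywords:
            index.setdefault(kw, set()).add(obj)
    return index

def rank_by_mapping_count(sequences, index, top_n):
    """Count how often each object is hit by the candidate keywords."""
    counts = Counter()
    for seq in sequences:
        for kw in seq:
            counts.update(index.get(kw, ()))
    return [obj for obj, _ in counts.most_common(top_n)]

# Hypothetical objects and candidate keyword sequences.
object_keywords = {
    "o1": ["nba", "finals"],
    "o2": ["nba"],
    "o3": ["movie"],
}
index = build_inverted_index(object_keywords)
sequences = [["nba", "finals"], ["nba", "movie"]]
recommended = rank_by_mapping_count(sequences, index, top_n=2)
```

Here `o1` is mapped three times, `o2` twice, and `o3` once, so the two objects with the highest counts are kept.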

8. The method according to claim 1, wherein obtaining the recommended object of the user based on the candidate keywords included in the first number of candidate keyword sequences comprises:

splicing the candidate keywords contained in the first number of candidate keyword sequences according to the word sequence of each candidate keyword in each candidate keyword sequence;

acquiring keyword sequences respectively corresponding to a plurality of candidate access objects, wherein each candidate access object comprises at least one candidate keyword from any of the candidate keyword sequences;

performing text similarity calculation on the candidate keyword sequence obtained by splicing and the keyword sequence corresponding to each candidate access object;

and screening the recommended objects of the user from the plurality of candidate access objects based on the text similarity calculation result, wherein the probability that the candidate access objects with higher similarity are screened as the recommended objects is higher.
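The claim does not name a particular text similarity measure. The sketch below assumes bag-of-words cosine similarity and a deterministic top-n selection; the function names and data are hypothetical.

```python
import math
from collections import Counter

def bow_cosine(a, b):
    """Cosine similarity between two keyword lists treated as bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_by_text_similarity(sequences, object_keywords, top_n):
    # Splice the candidate keyword sequences, preserving word order.
    spliced = [kw for seq in sequences for kw in seq]
    vocab = set(spliced)
    # Keep only objects that share at least one candidate keyword.
    eligible = {o: kws for o, kws in object_keywords.items()
                if vocab & set(kws)}
    ranked = sorted(eligible,
                    key=lambda o: bow_cosine(spliced, eligible[o]),
                    reverse=True)
    return ranked[:top_n]

# Hypothetical candidate keyword sequences and candidate access objects.
sequences = [["nba", "finals"], ["nba", "mvp"]]
object_keywords = {
    "o1": ["nba", "finals"],
    "o2": ["mvp", "movie"],
    "o3": ["movie"],
}
recommended = rank_by_text_similarity(sequences, object_keywords, top_n=3)
```

Object `o3` shares no candidate keyword with the spliced sequence, so it is excluded before any similarity is computed.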

9. The method according to claim 1 or 2, characterized in that the method further comprises:

preliminarily screening the recommended objects of the user according to a specific rule to obtain preliminarily selected recommended objects;

acquiring a coding vector corresponding to each initially selected recommended object and a coding vector of a current access object;

carrying out similarity calculation on the coding vector corresponding to each initially selected recommended object and the coding vector of the current access object;

selecting a preset number of primarily selected recommendation objects with highest similarity as target recommendation objects;

and sending the target recommendation object to a client of the user for displaying.
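The re-ranking stage of claim 9 can be sketched as a nearest-neighbour selection. Cosine similarity, the dictionary layout mapping object identifiers to coding vectors, and the function names are assumptions; the preliminary rule-based screening and the delivery to the client are outside the sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def select_targets(preselected, current_vec, preset_number):
    """Keep the preselected objects closest to the current access object."""
    ranked = sorted(preselected,
                    key=lambda o: cosine(preselected[o], current_vec),
                    reverse=True)
    return ranked[:preset_number]

# Hypothetical coding vectors of preliminarily selected objects.
preselected = {"o1": [1.0, 0.0], "o2": [0.9, 0.1], "o3": [0.0, 1.0]}
targets = select_targets(preselected, current_vec=[1.0, 0.0], preset_number=2)
```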

10. An object recommendation device, the device comprising:

the first similarity calculation module is used for calculating the similarity of the coding vector and each candidate word vector;

the recommended object selection module is used for obtaining a recommended object of the user based on a similarity calculation result;

the recommended object selection module comprises: a candidate keyword acquisition unit, a keyword cluster generation unit, a code calculation unit, a similarity calculation termination unit, a keyword sequence generation unit, and a recommended object acquisition unit;

the candidate keyword acquisition unit is used for acquiring candidate keywords corresponding to a first number of candidate word vectors with the highest similarity;

the keyword cluster generating unit is used for forming a candidate keyword cluster by the acquired first number of candidate keywords;

the coding calculation unit is used for continuously performing coding calculation by utilizing a candidate keyword contained in the candidate keyword cluster and the hidden layer output at the previous moment in the recommended object prediction model;

the similarity calculation termination unit is used for taking the obtained new coding vector as the coding vector of the user access object, and returning to the step of calculating the similarity between the coding vector and each candidate word vector, until the number of similarity calculations reaches a second number or a formed candidate keyword cluster meets a preset condition;

the keyword sequence generating unit is used for generating a first number of candidate keyword sequences from the candidate keywords of the same dimension in the second number of formed candidate keyword clusters;

and the recommended object obtaining unit is used for obtaining the recommended object of the user based on the candidate keywords contained in the first number of candidate keyword sequences.

11. The apparatus of claim 10, wherein the sequence acquisition module comprises:

a first data acquisition unit for acquiring a plurality of pieces of historical access data of a user, the historical access data being generated based on an access operation of the user to an output object of an application platform;

and the first sequence forming unit is used for forming an access object sequence by the object identifications respectively contained in the plurality of pieces of historical access data.

12. The apparatus of claim 10, wherein the sequence acquisition module comprises:

a second data acquisition unit for acquiring a plurality of pieces of historical access data of the user;

a keyword cluster obtaining unit, configured to use each piece of historical access data to obtain a keyword cluster of the corresponding access object;

and a second sequence forming unit for forming a keyword sequence from the keyword cluster corresponding to each access object.

13. A storage medium having stored thereon a computer program for execution by a processor to perform the steps of the object recommendation method of any one of claims 1-9.

14. A computer device, characterized in that the computer device comprises:

a communication interface;

a memory for storing a computer program for implementing the object recommendation method of any one of claims 1-9;

using a candidate keyword contained in the candidate keyword cluster and the previous hidden layer output in the recommended object prediction model as the current hidden layer input in the recommended object prediction model, and continuing coding calculation;

taking the obtained new coding vector as the coding vector of the user access object, and returning to the step of calculating the similarity between the coding vector and each candidate word vector, until the number of similarity calculations reaches a second number or a formed candidate keyword cluster meets a preset condition;