CN111931075A - Content recommendation method and device, computer equipment and storage medium

Info

Publication number: CN111931075A
Application number: CN202011114847.4A
Authority: CN
Inventors: 钟子宏
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2020-10-19
Filing date: 2020-10-19
Publication date: 2020-11-13
Anticipated expiration: 2040-10-19
Also published as: CN111931075B

Abstract

The application provides a content recommendation method and apparatus, a computer device, and a storage medium, and belongs to the technical field of big data. The method comprises the following steps: predicting first user characteristics of a plurality of users according to a first model to obtain first prediction information; determining a first scoring matrix according to the first prediction information, wherein one element in the first scoring matrix represents the grade to which one user's score for one content to be recommended belongs; in response to the first scoring matrix having missing data, filling the first scoring matrix based on a second model; and recommending at least one content to be recommended to each of the plurality of users according to the filled first scoring matrix. With this technical solution, content that a user has not come into contact with also has a score grade, so that content can be recommended to the user based on the filled scoring matrix regardless of whether the user has come into contact with it, which achieves effective content recommendation and improves recommendation accuracy.

Description

Content recommendation method and device, computer equipment and storage medium

Technical Field

The present application relates to the field of big data technologies, and in particular, to a content recommendation method and apparatus, a computer device, and a storage medium.

Background

With the development of internet technology, users can enjoy various services through the internet. Accordingly, a service provider can use a recommendation system to recommend content that may interest the user, such as commodities, games, and gift certificates. How to make accurate recommendations to users is therefore a problem to be solved.

At present, when content is recommended to a user, multi-layer training can be performed on the user characteristics through a deep-learning-based recommendation algorithm to obtain embedded features; a recommendation model is then trained on the embedded features, and content is recommended to the user through the recommendation model. Deep-learning-based recommendation algorithms greatly improve the recommendation effect and have a clear advantage in processing large-scale features.

However, deep-learning-based recommendation algorithms focus on how to process feature information better. They cannot effectively recommend new content, that is, content that the user has not come into contact with, which results in low recommendation accuracy.

Disclosure of Invention

The embodiments of the application provide a content recommendation method and apparatus, a computer device, and a storage medium, which allow content that a user has not come into contact with to also have a score grade, so that content can be recommended to the user based on the filled scoring matrix regardless of whether the user has come into contact with it. This achieves effective content recommendation and improves recommendation accuracy. The technical solution is as follows.

In one aspect, a content recommendation method is provided, and the method includes:

predicting first user characteristics of a plurality of users according to a first model to obtain first prediction information, wherein the first prediction information is used for indicating the probability that the plurality of users are interested in the content to be recommended;

determining a first scoring matrix according to the first prediction information, wherein one element in the first scoring matrix represents the grade to which one user's score for one content to be recommended belongs;

in response to the first scoring matrix having missing data, populating the first scoring matrix based on a second model;

and recommending at least one content to be recommended to the users respectively according to the filled first scoring matrix.

In another aspect, a content recommendation apparatus is provided, the apparatus including:

the prediction module is used for predicting first user characteristics of a plurality of users according to a first model to obtain first prediction information, and the first prediction information is used for indicating the probability that the plurality of users are interested in the content to be recommended;

the determining module is used for determining a first scoring matrix according to the first prediction information, wherein one element in the first scoring matrix represents the grade to which one user's score for one content to be recommended belongs;

a filling module, configured to fill the first scoring matrix based on a second model in response to the first scoring matrix having missing data;

and the content recommending module is used for recommending at least one content to be recommended to the users according to the filled first scoring matrix.

In an optional implementation manner, the determining module is configured to convert a plurality of probabilities in the first prediction information into a plurality of scores, where the plurality of scores are positive integers; determining a plurality of scoring intervals according to the plurality of scores, wherein one scoring interval corresponds to one grade; and determining the first scoring matrix according to the grades of the scores.

In an optional implementation manner, the filling module is configured to perform matrix decomposition on the first scoring matrix based on the second model in response to the first scoring matrix having missing data, so as to obtain a user scoring matrix for representing users and a content scoring matrix for representing content; and to acquire an inner product of the user scoring matrix and the content scoring matrix and take the inner product as the filled first scoring matrix.

In an optional implementation manner, the content recommendation module is configured to, for any user, obtain, according to the filled first scoring matrix, the grades to which a plurality of scores corresponding to the user belong; rank the plurality of scores according to those grades to determine at least one score;

and recommend the content to be recommended corresponding to the at least one score to the user.

In an alternative implementation, the training step of the first model includes:

training the first original model based on the training samples;

predicting a test sample based on the trained model to obtain second prediction information, wherein the test sample comprises second user characteristics of a plurality of sample users and user labels to which the test samples belong, and the second prediction information is used for indicating the probability of the plurality of sample users interested in sample content;

and determining the trained model as the first model in response to the second prediction information meeting the evaluation index.

In an optional implementation, the apparatus further includes:

the data acquisition module is used for acquiring a sample data set, wherein the sample data set comprises user characteristics, content characteristics and a user tag to which a sample user belongs;

and the data dividing module is used for dividing the sample data set into the training samples and the test samples according to a target proportion.

In an alternative implementation, the training step of the second model includes:

acquiring third prediction information obtained by predicting the training sample based on the first model;

determining a second scoring matrix according to the third prediction information, wherein one element in the second scoring matrix represents the grade to which one sample user's score for one sample content belongs;

filling the second scoring matrix according to a collaborative filtering algorithm;

and training to obtain the second model, using the error between the second scoring matrix before and after filling as the loss function.

In another aspect, a computer device is provided, and the computer device includes a processor and a memory, where the memory is used to store at least one program code, and the at least one program code is loaded and executed by the processor to implement the operations performed in the content recommendation method in the embodiments of the present application.

In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, and the at least one program code is loaded and executed by a processor to implement the operations performed in the content recommendation method in the embodiments of the present application.

In another aspect, a computer program product or a computer program is provided, the computer program product or the computer program comprising computer program code, the computer program code being stored in a computer readable storage medium. The processor of the computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, causing the computer device to perform the content recommendation method provided in the above-described aspects or various alternative implementations of the aspects.

The technical scheme provided by the embodiment of the application has the following beneficial effects:

The embodiments of the application provide a content recommendation method. A first model predicts on user characteristics to preliminarily determine the probability that each user is interested in each content to be recommended, and a scoring matrix is then constructed from these probabilities, so that the grade to which each user's score for each content belongs can be determined from the scoring matrix. Because content that a user has not come into contact with appears as missing data in the scoring matrix, the scoring matrix is filled through a second model, so that such content also has a score grade. Based on the filled scoring matrix, content can be recommended to the user regardless of whether the user has come into contact with it, which achieves effective content recommendation and improves recommendation accuracy.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic diagram of an implementation environment of a content recommendation method provided according to an embodiment of the present application;

FIG. 2 is a flow chart of a content recommendation method provided according to an embodiment of the application;

FIG. 3 is a flow chart of another content recommendation method provided according to an embodiment of the application;

FIG. 4 is a schematic architecture diagram of a content recommendation method according to an embodiment of the present application;

FIG. 5 is a flow chart of another content recommendation method provided according to an embodiment of the application;

FIG. 6 is a block diagram of a content recommendation device provided according to an embodiment of the present application;

FIG. 7 is a block diagram of a terminal according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a server provided according to an embodiment of the present application.

Detailed Description

To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.

The following briefly describes possible techniques that may be used in embodiments of the present application.

Machine Learning (ML) is a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills, and how it can reorganize existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to give computers intelligence, and it is applied throughout the fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

Big data refers to a data set that cannot be captured, managed, and processed by conventional software tools within a certain time range. It is a massive, fast-growing, and diversified information asset that requires new processing modes to deliver stronger decision-making power, insight discovery, and process optimization capability. With the advent of the cloud era, big data has attracted more and more attention, and special technology is needed to process large amounts of data effectively within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems. In the embodiments of the application, the user characteristics of users and the content characteristics of the content to be recommended can be obtained based on big data.

Singular Value Decomposition (SVD) is an important matrix decomposition in linear algebra; it generalizes eigendecomposition to arbitrary matrices. It has important applications in fields such as signal processing and statistics.

PCA (Principal Component Analysis) is a technique that uses the idea of dimensionality reduction to convert multiple indexes into a few comprehensive indexes.

One-Hot encoding, also known as one-bit-effective encoding, uses an N-bit state register to encode N states: each state has its own independent register bit, and only one bit is active at any time. One-Hot encoding represents categorical variables as binary vectors. The categorical values are first mapped to integer values; each integer value is then represented as a binary vector that is all zeros except at the index of the integer, which is set to 1.
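The two-step mapping described above (category to integer, integer to binary vector) can be sketched as follows. This is a minimal illustration only; the function name `one_hot` and the sample category values are hypothetical and do not come from the application.

```python
def one_hot(values):
    """Encode categorical values as one-hot rows (illustrative sketch)."""
    # Step 1: map each distinct category to an integer index.
    # Sorting makes the mapping deterministic.
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}

    # Step 2: represent each integer as a vector that is all zeros
    # except for a 1 at the integer's index.
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


# Hypothetical content-type feature for four samples.
categories, encoded = one_hot(["game", "coupon", "game", "goods"])
for value_row in encoded:
    print(value_row)
```

Each row contains exactly one 1, so the vector length equals the number of distinct categories.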

CTR (Click-Through Rate) is a term commonly used for internet advertisements. It refers to the click-through rate of a web advertisement (picture, text, keyword, ranked, or video advertisement, etc.), that is, the actual number of clicks on the advertisement (strictly speaking, the number of arrivals at the target page) divided by the number of times the advertisement is shown (Show content).

ARPU (Average Revenue Per User) is an indicator by which an operator measures the revenue it obtains from each end user. It does not reflect the final profit margin.

AUC (Area Under Curve) is defined as the area enclosed by the ROC curve and the coordinate axes; this area cannot be larger than 1. Since the ROC curve generally lies above the line y = x, the AUC generally ranges between 0.5 and 1. The closer the AUC is to 1.0, the more reliable the detection method is; when the AUC equals 0.5, reliability is lowest and the method has no application value.
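As an illustration of this index, the sketch below computes the AUC through its pairwise interpretation: the fraction of (positive, negative) sample pairs in which the positive sample receives the higher predicted probability, with ties counted as one half. This value equals the area under the ROC curve. The function name `auc` and the sample data are assumptions made for the example.

```python
def auc(labels, scores):
    """AUC as the share of correctly ordered (positive, negative) pairs."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0   # positive ranked above negative
            elif p == n:
                correct += 0.5   # tie counts as half
    return correct / (len(positives) * len(negatives))


# Hypothetical user labels (1 = interested) and predicted probabilities.
labels = [1, 0, 1, 0]
scores = [0.9, 0.2, 0.4, 0.6]
value = auc(labels, scores)
print(value)
```

The double loop is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based computation would be preferable for large test sets.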

Hereinafter, an implementation environment of the content recommendation method provided in the embodiment of the present application is described. Fig. 1 is a schematic diagram of an implementation environment of a content recommendation method according to an embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102.

The terminal 101 and the server 102 can be directly or indirectly connected through wired or wireless communication, and the application is not limited herein.

Optionally, the terminal 101 is a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, or the like, but is not limited thereto. An application program supporting content recommendation is installed and runs on the terminal 101. The application program may be any one of a shopping application, an information application, a social communication application, a game application, and an application marketplace application. Illustratively, the terminal 101 is a terminal used by a user, and a user account of the user is logged in on the terminal.

Optionally, the server 102 is an independent physical server, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, and the like. The server 102 is used for providing background services for the application programs supporting content recommendation.

Optionally, in the process of implementing content recommendation, the server 102 undertakes a primary content recommendation work, and the terminal 101 undertakes a secondary content recommendation work; or, the server 102 undertakes the secondary content recommendation work, and the terminal 101 undertakes the primary content recommendation work; alternatively, a distributed computing architecture is adopted between the server 102 and the terminal 101 for collaborative content recommendation.

Optionally, the server 102 includes an access server, a content recommendation server, and a database. The access server is used for providing access services for the terminal. The content recommendation server is used for providing background services of the application program. There may be one or more content recommendation servers. When there are multiple content recommendation servers, at least two of them provide different services, and/or at least two of them provide the same service, for example in a load-balancing manner, which is not limited in the embodiment of the present application. The content recommendation server can be provided with a prediction model for predicting the probability that a user is interested in the content to be recommended and a collaborative filtering model for filling a scoring matrix that has missing data. The database is used for storing data such as user characteristics, content characteristics, user tags, prediction models, and collaborative filtering models.

Optionally, the terminal 101 generally refers to one of a plurality of terminals, and this embodiment is only illustrated by the terminal 101. Those skilled in the art will appreciate that the number of terminals 101 can be greater. For example, the number of the terminals 101 is dozens or hundreds, or more, and the environment for implementing the content recommendation method may include other terminals. The number of terminals and the type of the device are not limited in the embodiments of the present application.

Optionally, the wireless network or wired network described above uses standard communication techniques and/or protocols. The network is typically the Internet, but can be any network, including, but not limited to, a Local Area Network (LAN), a Metropolitan Area Network (MAN), a Wide Area Network (WAN), a mobile, wired, or wireless network, a private network, a virtual private network, or any combination thereof. In some embodiments, data exchanged over the network is represented using techniques and/or formats including HyperText Markup Language (HTML), Extensible Markup Language (XML), and the like. All or some of the links can also be encrypted using conventional encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), and Internet Protocol Security (IPsec). In other embodiments, custom and/or dedicated data communication techniques can also be used in place of or in addition to the data communication techniques described above.

Fig. 2 is a flowchart of a content recommendation method according to an embodiment of the present application, and as shown in fig. 2, the content recommendation method is described in the embodiment of the present application by taking an application to a server as an example. The content recommendation method includes the following steps.

201. The server predicts the first user characteristics of the users according to the first model to obtain first prediction information, wherein the first prediction information is used for indicating the probability that the users are interested in the content to be recommended.

In the embodiment of the application, the first model can be used for predicting the probability that a user is interested in the content to be recommended; the larger the probability corresponding to a content to be recommended, the more likely the user is to be interested in that content. The first user characteristic is a characteristic of the plurality of users in a first time period, and the first time period is not particularly limited in the embodiment of the present application. The first model is trained by the server according to the second user characteristics of the plurality of users in a second time period, where the second time period is a time period before the first time period.

202. The server determines a first scoring matrix according to the first prediction information, wherein one element in the first scoring matrix represents the grade of the scoring of the content to be recommended by a user.

In the embodiment of the application, after obtaining the first prediction information, the server can convert the probability that each user is interested in each content to be recommended into a score of that user for that content, the score being a positive integer. The server can then grade the scores to determine the grade to which each user's score for each content to be recommended belongs, and finally construct the first scoring matrix based on the grades.

203. In response to the first scoring matrix having missing data, the server populates the first scoring matrix based on a second model.

In the embodiment of the application, for any content to be recommended, if the user has never come into contact with it, as with a newly released product, a newly launched game, or a newly released member service, the first model cannot predict the probability that the user is interested in it. That probability is therefore missing, and so is the grade to which the user's score for the content belongs in the first scoring matrix. The server can perform collaborative filtering on the first scoring matrix based on the second model, thereby filling the first scoring matrix.

204. The server recommends at least one content to be recommended to each of the plurality of users according to the filled first scoring matrix.

In the embodiment of the application, after obtaining the filled first scoring matrix, the server can sort the contents to be recommended in descending order of the grades to which the scores belong, and then select the first N contents to be recommended for personalized recommendation to the user.
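The descending sort and first-N selection of step 204 can be sketched as follows. The function name `recommend_top_n`, the dictionary layout of the filled matrix, and the sample contents are hypothetical; the application does not prescribe a data structure.

```python
def recommend_top_n(filled, contents, n):
    """For each user, return the n contents with the highest grades."""
    recommendations = {}
    for user, grades in filled.items():
        # Sort content indices by grade, highest first.
        # Python's sort is stable, so ties keep their original order.
        order = sorted(range(len(grades)), key=lambda j: grades[j], reverse=True)
        recommendations[user] = [contents[j] for j in order[:n]]
    return recommendations


# Hypothetical filled scoring matrix: one row of grades per user,
# one column per content to be recommended.
contents = ["game_a", "coupon_b", "goods_c", "member_d"]
filled = {
    "user_1": [3, 5, 1, 4],
    "user_2": [2, 2, 5, 1],
}
top2 = recommend_top_n(filled, contents, n=2)
print(top2)
```

Because the matrix has already been filled, contents the user has not come into contact with compete in the same ranking as contents the user already knows.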

Fig. 3 is a flowchart of another content recommendation method provided in the embodiment of the present application, and as shown in fig. 3, the method is described in the embodiment of the present application by taking an application to a server as an example. The content recommendation method includes the following steps.

301. The server trains a first model which is used for predicting the probability of the user interested in the content to be recommended based on the user characteristics.

In the embodiment of the application, the first model is trained by the server according to the second user characteristics of the plurality of users in the second time period.

In an optional implementation manner, before training the first model, the server may obtain a sample data set, where the sample data set includes user characteristics, content characteristics, and the user tags to which sample users belong; the server can then divide the sample data set into training samples and test samples according to a target ratio. If the target ratio is denoted by a, then training samples : test samples = a : (1 - a). The ratio between training samples and test samples can be 4:1, 3:2, 1:1, or the like, which is not limited in the embodiments of the present application.

For example, taking the second time period as period T and the target ratio as 0.8, the server can construct a sample data set from the user characteristics, content characteristics, and user tags in period T, and then randomly split it according to the target ratio so that training samples : test samples = 4:1.
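A random split at a target ratio of 0.8 can be sketched as below. The function name `split_samples`, the fixed random seed, and the sample records are assumptions made for the example.

```python
import random


def split_samples(samples, target_ratio, seed=0):
    """Randomly split samples; target_ratio is the training share a."""
    if not 0.0 < target_ratio < 1.0:
        raise ValueError("target_ratio must be strictly between 0 and 1")
    shuffled = list(samples)                   # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)      # seeded for reproducibility
    cut = round(len(shuffled) * target_ratio)  # training : test = a : (1 - a)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical sample data set: ten sample users with a binary user tag.
samples = [{"user": u, "label": u % 2} for u in range(10)]
train, test = split_samples(samples, target_ratio=0.8)
print(len(train), len(test))
```

With ten samples and a target ratio of 0.8, eight samples go to training and two to testing, which matches the 4:1 ratio in the example above.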

After obtaining the training samples and the test samples, the server can perform model training based on them. The training step for obtaining the first model includes the following: the server trains the first original model based on the training samples, and then predicts on the test samples based on the trained model to obtain second prediction information, where the test samples include second user characteristics of a plurality of sample users and the user tags to which they belong, and the second prediction information is used for indicating the probability that the plurality of sample users are interested in the sample content. In response to the second prediction information meeting the evaluation index, the server determines the trained model as the first model; in response to the second prediction information not meeting the evaluation index, the server repeats the step of training the first original model based on the training samples until the obtained second prediction information meets the evaluation index. The evaluation index includes recall, precision, AUC (Area Under Curve), and the like, which is not limited in the embodiments of the present application.
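The train, evaluate, and repeat control flow described above can be sketched generically. Everything here is hypothetical scaffolding: `train_until_accepted`, the `max_rounds` safety limit, and the toy training and evaluation functions stand in for a real model and a real evaluation index such as AUC, neither of which the application specifies in detail.

```python
def train_until_accepted(train_step, evaluate, threshold, max_rounds=100):
    """Repeat training until the evaluation index reaches the threshold."""
    model = None
    for round_no in range(1, max_rounds + 1):
        model = train_step(model)   # train on the training samples
        score = evaluate(model)     # predict on the test samples and score
        if score >= threshold:      # evaluation index is met
            return model, score, round_no
    # max_rounds is an added safeguard, not part of the described method.
    raise RuntimeError("evaluation index not met within max_rounds")


def toy_train_step(model):
    # Stand-in for real training: each round improves a fake quality value.
    quality = 0.5 if model is None else model["quality"] + 0.1
    return {"quality": quality}


def toy_evaluate(model):
    # Stand-in for computing an evaluation index on the test samples.
    return model["quality"]


model, score, rounds = train_until_accepted(toy_train_step, toy_evaluate, threshold=0.75)
print(rounds, score)
```

In practice `train_step` would fit the first original model on the training samples and `evaluate` would compute the chosen index from the second prediction information.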

302. The server predicts the first user characteristics of the users according to the first model to obtain first prediction information, wherein the first prediction information is used for indicating the probability that the users are interested in the content to be recommended.

In the embodiment of the application, the first model can be used for predicting the probability that a user is interested in the content to be recommended; the larger the probability corresponding to a content to be recommended, the more likely the user is to be interested in that content. The first user characteristic is a characteristic of the plurality of users in a first time period, and the first time period is not particularly limited in the embodiment of the present application. The first time period is a time period after the second time period.

For example, taking the first time period as period T+1 and the content to be recommended as commodities, the server acquires the first user characteristics of the plurality of users in period T+1, inputs them into the trained first model, and performs prediction based on the first model to obtain the probability that each user is interested in each commodity.

303. The server determines a first scoring matrix according to the first prediction information, wherein one element in the first scoring matrix represents the grade of the scoring of the content to be recommended by a user.

In an alternative implementation manner, the server can determine the grade to which a score belongs by dividing the scores into scoring intervals. Accordingly, the server can convert the plurality of probabilities in the first prediction information into a plurality of scores, the plurality of scores being positive integers. The server can then determine a plurality of scoring intervals based on the plurality of scores, where one scoring interval corresponds to one grade. Finally, the server can determine the first scoring matrix according to the grades of the scores. The scoring intervals may be obtained by equidistant division or by another division manner, which is not limited in the embodiment of the present application. Optionally, the server converts the plurality of probabilities into a plurality of scores by multiplying the probabilities by a constant c. The constant c is 10, 100, 1000, or the like, which is not limited in the embodiments of the present application.

For example, taking the content to be recommended as a commodity, for each probability in the first prediction information the server multiplies the probability by the constant c and then divides the resulting scores into scoring intervals in an equidistant manner [formula omitted in the source], where the scoring interval corresponding to the i-th grade is expressed as [formula omitted in the source]. Finally, the server constructs the first scoring matrix (scoring Data matrix) with the users as rows, the commodities as columns, and, as elements, the grades to which the scores corresponding to the probabilities of the users being interested in the commodities belong.
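The conversion from probabilities to scores to grades can be sketched as follows. Because the interval formulas are not reproduced in the text, the interval boundaries used here are an assumption: with c = 100 and five grades, grade i is taken to cover the integer scores from 20(i - 1) + 1 to 20i. The function name `probabilities_to_grades`, the parameter `num_grades`, and the use of `None` for missing probabilities are likewise hypothetical.

```python
def probabilities_to_grades(probabilities, c=100, num_grades=5):
    """Convert probabilities to integer scores, then to equal-width grades."""
    matrix = []
    for row in probabilities:            # one row per user
        graded_row = []
        for p in row:                    # one column per commodity
            if p is None:                # no prediction available: keep missing
                graded_row.append(None)
                continue
            # Multiply by the constant c and round to a positive integer score.
            score = max(1, round(p * c))
            # Assumed equidistant intervals: grade i covers scores
            # ((i - 1) * c / num_grades, i * c / num_grades].
            grade = (score - 1) * num_grades // c + 1
            graded_row.append(grade)
        matrix.append(graded_row)
    return matrix


# Hypothetical first prediction information for two users and three commodities.
probabilities = [
    [0.07, 0.55, None],
    [0.93, None, 0.40],
]
first_scoring_matrix = probabilities_to_grades(probabilities)
print(first_scoring_matrix)
```

The `None` entries mark commodities for which no probability could be predicted; they are the missing data that the second model later fills.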

It should be noted that, for any content to be recommended, if the user has not come into contact with it, as with a newly released product, a newly launched game, or a newly released member service, the first model cannot predict the probability that the user is interested in it. That probability is therefore missing, and so is the grade to which the user's score for the content belongs in the first scoring matrix. If the first scoring matrix has no missing data, the server recommends at least one content to be recommended to each of the plurality of users directly based on the first scoring matrix.

304. In response to the first scoring matrix having missing data, the server populates the first scoring matrix based on a second model.

In the embodiment of the application, for the scoring matrix with missing data, the server can perform collaborative filtering on the first scoring matrix based on the second model, so as to implement filling of the first scoring matrix.

In an optional implementation, the server performs collaborative filtering on the first scoring matrix based on the second model as follows: in response to the first scoring matrix having missing data, the server performs matrix decomposition on the first scoring matrix based on the second model to obtain a user scoring matrix for representing users and a content scoring matrix for representing content. The server then obtains the inner product of the user scoring matrix and the content scoring matrix, and takes the inner product as the populated first scoring matrix.

For example, continuing to take the content to be recommended as a commodity, the server can decompose the first scoring matrix into a user scoring matrix on the user side and a commodity scoring matrix on the commodity side through SVD matrix decomposition, and then multiply the two matrices to obtain a complete first scoring matrix, that is, a first scoring matrix with no missing data.
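One way to picture the decomposition and the inner product is the sketch below. The text names SVD matrix decomposition but does not give the procedure, so this example swaps in a common stand-in: latent-factor matrix factorization fitted by stochastic gradient descent on the observed entries only (often called Funk SVD). The rank `k`, learning rate, regularization weight, seed, and all function names are assumptions chosen for illustration.

```python
import random

def factorize(R, k=2, epochs=500, lr=0.01, reg=0.02, seed=0):
    """Factor R (None = missing) into user factors P and item factors Q."""
    rng = random.Random(seed)
    n_users, n_items = len(R), len(R[0])
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    observed = [(u, i) for u in range(n_users) for i in range(n_items)
                if R[u][i] is not None]
    for _ in range(epochs):
        for u, i in observed:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = R[u][i] - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]   # keep old values for both updates
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def fill(P, Q):
    """Inner product of user and item factors: a matrix with no gaps."""
    return [[sum(pu * qi for pu, qi in zip(p, q)) for q in Q] for p in P]

def observed_rmse(R, filled):
    """Root-mean-square error measured on the observed entries only."""
    errs = [(R[u][i] - filled[u][i]) ** 2
            for u in range(len(R)) for i in range(len(R[0]))
            if R[u][i] is not None]
    return (sum(errs) / len(errs)) ** 0.5

# Hypothetical grade matrix: 4 users x 4 commodities, None = missing.
R = [[5, 3, None, 1],
     [4, None, None, 1],
     [1, 1, None, 5],
     [None, 1, 5, 4]]
P, Q = factorize(R)
filled = fill(P, Q)
```

Every cell of `filled` holds a real-valued estimate, including the cells that were `None` in `R`; a deployment would presumably round or bucket these values back into grades.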

In an optional implementation, after obtaining the first model, the server can train the second model according to third prediction information obtained by predicting the training samples with the first model. Correspondingly, the training of the second model includes the following steps: the server obtains the third prediction information obtained by predicting the training samples based on the first model. The server then determines a second scoring matrix according to the third prediction information, wherein one element in the second scoring matrix represents the grade to which the score of one sample user for one sample content belongs. The server then populates the second scoring matrix according to a collaborative filtering algorithm. Finally, the server trains the second model using the error between the second scoring matrix before and after filling as the loss function.
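The text does not spell out the loss function. One common form consistent with "the error before and after filling" is a regularized squared error over the observed entries, given here only as an assumed illustration. The symbols are not from the application: Ω is the set of observed (user, content) pairs, r_{ui} is the grade in the second scoring matrix, p_u and q_i are the rows of the user and content scoring matrices, and λ is a regularization weight.

```latex
L(P, Q) = \sum_{(u,i) \in \Omega} \left( r_{ui} - p_u^{\top} q_i \right)^2
        + \lambda \left( \sum_{u} \lVert p_u \rVert^2 + \sum_{i} \lVert q_i \rVert^2 \right)
```

Under this assumption, minimizing L makes the filled matrix agree with the grades that were already present, while the regularization term limits overfitting to the few observed entries.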

The server may further obtain the second prediction information obtained by predicting the test sample by the first model, determine a third scoring matrix based on the second prediction information, and perform test evaluation on the second model according to the third scoring matrix. In response to the second model passing the test evaluation, the server treats the second model as a final second model; if the second model fails the test evaluation, the server continues to optimize the second model.

It should be noted that, when the server trains the first model and the second model, the same data processing procedures are adopted; procedures such as PCA decorrelation, one-hot feature processing, feature discretization, conversion of probabilities to scores, and division of scoring intervals are all required to be consistent. That is, for each probability in the second prediction information, the server multiplies the probability by the constant c to obtain a score, and then divides the score range equidistantly to obtain the scoring intervals (the interval formula, including the expression for the scoring interval corresponding to the i-th grade, is not reproduced in this text). Finally, the server takes the users as rows, the commodities as columns, and the grade to which the score corresponding to the probability that a user is interested in a commodity belongs as each element, to construct the second scoring matrix (scoring data matrix).

It should be noted that, when the server trains a model, the evaluation index may be a fixed threshold or an adaptive threshold; that is, the model may be evaluated against either kind of threshold, which is not limited in this embodiment of the application.

305. The server recommends at least one content to be recommended to each of the plurality of users according to the filled first scoring matrix.

In the embodiment of the application, after the filled first scoring matrix is obtained, the server can sort the contents to be recommended in descending order of the grades to which their scores belong, and then select the top N contents to be recommended for personalized recommendation to the user. One row of elements in the first scoring matrix represents the grades to which the scores of a plurality of contents to be recommended belong for one user.

For any user, the server can obtain, from the filled first scoring matrix, the grades to which the plurality of scores corresponding to the user belong. The server can then rank the plurality of scores according to those grades and determine at least one score. Finally, the server can recommend the content to be recommended corresponding to the at least one score to the user. The filled first scoring matrix provides the grade of the score for content that the user has not contacted, so that such content can also be recommended to the user. This solves the problem that a recommendation method based on a traditional classification algorithm cannot make recommendations where data is missing; for example, such a method cannot effectively recommend new commodities.
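The ranking step can be sketched in a few lines. The function name, the content identifiers, and the example row are hypothetical; ties keep their original column order, which is an arbitrary choice made for this illustration.

```python
def recommend_top_n(filled_row, content_ids, n=3):
    """Return the n content ids with the highest grades for one user.

    filled_row is one row of the filled scoring matrix; content_ids
    labels its columns. Python's sort is stable, so equal grades keep
    their column order.
    """
    ranked = sorted(zip(content_ids, filled_row),
                    key=lambda pair: pair[1], reverse=True)
    return [content_id for content_id, _ in ranked[:n]]

# Hypothetical filled row for one user over four games.
top_two = recommend_top_n([2.1, 4.8, 3.3, 4.8], ["a", "b", "c", "d"], n=2)
```

Applying the same function to every row of the filled matrix yields one top-N list per user.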

In order to make the architecture of the content recommendation method provided by the embodiment of the present application clearer, referring to fig. 4, fig. 4 is a schematic diagram of an architecture of a content recommendation method provided by the embodiment of the present application. As shown in fig. 4, a first model 404 is trained by user labels 401, user features 402, and commodity features 403 for the T period; then, predicting the user characteristics 405 and the commodity characteristics 406 in the T +1 period through the first model 404 to obtain a probability 407; converting the probability 407 into a score based on the constant c, and further obtaining a first score matrix 408 containing missing data; decomposing the first scoring matrix 408 containing the missing data into a user scoring matrix 410 and a commodity scoring matrix 411 through a second model 409 based on a collaborative filtering algorithm; and obtaining a filled first scoring matrix 412 based on the user scoring matrix 410 and the commodity scoring matrix 411.

It should be noted that the foregoing steps 301 to 304 are an optional implementation of the content recommendation method provided in the embodiment of the present application, and the method can also be implemented in other manners. The following description takes a game download recommendation scenario as an example. Referring to fig. 5, fig. 5 is a flowchart illustrating another content recommendation method according to an embodiment of the present application.

Step 501 is the data preparation stage for personalized game recommendation. 2000 features are prepared, including basic user data (gender, age, etc.), user login data (login duration, number of logins, number of login days, etc.), recharge data (amount, number of recharges, ARPU, etc.), game performance, game CTR, game rate, game category, and the like. These features, the download label of each game (a sample with a download is a positive sample, recorded as 1; otherwise it is a negative sample, recorded as 0), and data such as the user's scoring matrix for existing games are used as data sources.

In step 502, the model training stage is divided into two training stages. In the first training stage, training samples and test samples for period T are constructed according to the user features in period T and the label data of the games downloaded by the user (step 503); a first model is obtained from the training samples (step 504); and the first model is tested and evaluated on the test samples (step 505). If the first model passes the test evaluation, the second training stage begins and the second model is trained; if not, the first training stage continues.

In the second training stage, the probabilities of users downloading games, obtained from the training samples in the first training stage, are converted into scores (step 506), a second scoring matrix with missing data is generated (step 507), and the second model is trained based on that matrix (step 508). Likewise, the probabilities obtained from the test samples in the first training stage are converted into scores (step 509), a third scoring matrix with missing data is generated (step 510), and the second model is tested and evaluated according to the third scoring matrix (step 511). If the second model passes the test evaluation, it is used as the final second model; otherwise, the second model continues to be optimized.

After the second model passes the evaluation, the first model predicts, from the user features in period T+1, the probability that each user downloads each game in period T+1, and the probabilities are converted into scores (step 512). A first scoring matrix is generated from those scores (step 513), and the second model fills the missing data in the first scoring matrix to obtain the filled first scoring matrix (step 514). Finally, in step 515, the games are sorted in descending order of the grades of their scores, and the top N games are selected for personalized recommendation.

It should be noted that, in the game downloading scenario, the results of recommendation based on the traditional classification algorithm, recommendation based on the deep learning algorithm, recommendation based on the collaborative filtering algorithm, and recommendation based on the method provided by the present application are shown in table 1.

TABLE 1

Metric	Traditional classification algorithm	Deep learning algorithm	Collaborative filtering algorithm	The method of the present application
Recall ratio	76.87%	82.61%	63.68%	87.91%
Precision ratio	71.12%	85.32%	69.11%	83.56%
AUC	0.7214	0.8641	0.6933	0.8112
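For readers less familiar with the metrics in Table 1, the sketch below shows the conventional way recall and precision are computed from binary download labels (1 = downloaded, 0 = not downloaded). The application does not describe its evaluation code, so this follows the standard definitions only, and the labels and predictions are made-up illustration values rather than data from the application.

```python
def precision_recall(y_true, y_pred):
    """Standard precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up labels and predictions for five user-game pairs.
precision, recall = precision_recall([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

Recall measures how many actual downloads were recommended, while precision measures how many recommendations led to downloads.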

Fig. 6 is a block diagram of a content recommendation device according to an embodiment of the present application. The apparatus is configured to perform the steps of the content recommendation method described above. Referring to fig. 6, the apparatus includes: a prediction module 601, a determination module 602, a population module 603, and a content recommendation module 604.

training the first original model based on the training samples;

In an optional implementation, the apparatus further includes:

It should be noted that the division into the functional modules described above is only an example of how the content recommendation device provided in the above embodiment performs content recommendation. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the content recommendation device and the content recommendation method provided by the above embodiments belong to the same concept; the specific implementation process is described in detail in the method embodiments and is not repeated here.

In this embodiment of the present application, the computer device can be configured as a terminal or a server. When the computer device is configured as a terminal, the terminal can serve as the execution subject to implement the technical solution provided in the embodiment of the present application. When the computer device is configured as a server, the server can serve as the execution subject. Alternatively, the technical solution can be implemented through interaction between the terminal and the server, which is not limited in this embodiment of the present application.

Fig. 7 is a block diagram of a terminal 700 according to an embodiment of the present application. The terminal 700 may be a portable mobile terminal such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 700 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, or desktop terminal.

In general, terminal 700 includes: a processor 701 and a memory 702.

The processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.

Memory 702 may include one or more computer-readable storage media, which may be non-transitory. Memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 702 is used to store at least one program code for execution by the processor 701 to implement the content recommendation method provided by the method embodiments herein.

In some embodiments, the terminal 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Various peripheral devices may be connected to peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, a positioning component 708, and a power source 709.

The peripheral interface 703 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 701 and the memory 702. In some embodiments, processor 701, memory 702, and peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.

The radio frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 704 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 704 converts an electrical signal into an electromagnetic signal for transmission, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: the World Wide Web, metropolitan area networks, intranets, the generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.

The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, the display screen 705 also has the ability to capture touch signals on or over its surface. The touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the terminal 700; in other embodiments, there may be at least two display screens 705, respectively disposed on different surfaces of the terminal 700 or in a folded design; in still other embodiments, the display screen 705 may be a flexible display disposed on a curved surface or a folded surface of the terminal 700. Furthermore, the display screen 705 may be arranged in a non-rectangular irregular shape, that is, an irregularly shaped screen. The display screen 705 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).

The camera assembly 706 is used to capture images or video. Optionally, the camera assembly 706 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal, and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation at different color temperatures.

The audio circuit 707 may include a microphone and a speaker. The microphone is used for collecting sound waves from the user and the environment, converting the sound waves into electrical signals, and inputting the electrical signals to the processor 701 for processing or to the radio frequency circuit 704 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones may be provided at different portions of the terminal 700. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker may be a traditional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert an electrical signal not only into a sound wave audible to humans, but also into a sound wave inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuit 707 may also include a headphone jack.

The positioning component 708 is used to locate the current geographic location of the terminal 700 for navigation or LBS (Location Based Service). The positioning component 708 can be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.

The power source 709 is used to supply power to the various components of the terminal 700. The power source 709 may be alternating current, direct current, disposable batteries, or rechargeable batteries. When the power source 709 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery. A wired rechargeable battery is charged through a wired line, and a wireless rechargeable battery is charged through a wireless coil. The rechargeable battery may also support fast charging technology.

In some embodiments, terminal 700 also includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyro sensor 712, pressure sensor 713, fingerprint sensor 714, optical sensor 715, and proximity sensor 716.

The acceleration sensor 711 can detect the magnitude of acceleration in three coordinate axes of a coordinate system established with the terminal 700. For example, the acceleration sensor 711 may be used to detect components of the gravitational acceleration in three coordinate axes. The processor 701 may control the display screen 705 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 may also be used for acquisition of motion data of a game or a user.

The gyro sensor 712 may detect a body direction and a rotation angle of the terminal 700, and the gyro sensor 712 may cooperate with the acceleration sensor 711 to acquire a 3D motion of the terminal 700 by the user. From the data collected by the gyro sensor 712, the processor 701 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.

Pressure sensors 713 may be disposed on a side frame of terminal 700 and/or underneath display 705. When the pressure sensor 713 is disposed on a side frame of the terminal 700, a user's grip signal on the terminal 700 may be detected, and the processor 701 performs right-left hand recognition or shortcut operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed at a lower layer of the display screen 705, the processor 701 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 705. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.

The fingerprint sensor 714 is used for collecting a fingerprint of a user, and the processor 701 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 identifies the identity of the user according to the collected fingerprint. When the user identity is identified as a trusted identity, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking a screen, viewing encrypted information, downloading software, paying, changing settings, and the like. The fingerprint sensor 714 may be disposed on the front, back, or side of the terminal 700. When a physical button or a vendor Logo is provided on the terminal 700, the fingerprint sensor 714 may be integrated with the physical button or the vendor Logo.

The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the ambient light intensity is high, the display brightness of the display screen 705 is increased; when the ambient light intensity is low, the display brightness of the display screen 705 is adjusted down. In another embodiment, processor 701 may also dynamically adjust the shooting parameters of camera assembly 706 based on the ambient light intensity collected by optical sensor 715.

The proximity sensor 716, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 700. The proximity sensor 716 is used to collect the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the display screen 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually increases, the processor 701 controls the display screen 705 to switch from the screen-off state to the screen-on state.

Those skilled in the art will appreciate that the configuration shown in fig. 7 does not limit the terminal 700, which may include more or fewer components than those shown, combine some components, or use a different arrangement of components.

Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application. The server 800 may vary considerably depending on its configuration or performance, and may include one or more processors (Central Processing Units, CPUs) 801 and one or more memories 802, where the memory 802 stores at least one program code, and the at least one program code is loaded and executed by the processor 801 to implement the content recommendation method provided by the method embodiments. The server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for input and output, and may include other components for implementing the functions of the device, which are not described herein again.

The embodiment of the present application further provides a computer-readable storage medium, which is applied to a computer device, and at least one program code is stored in the computer-readable storage medium, and is loaded and executed by a processor to implement the operations performed by the computer device in the content recommendation method of the above embodiment.

Embodiments of the present application also provide a computer program product or a computer program comprising computer program code stored in a computer readable storage medium. The processor of the computer device reads the computer program code from the computer-readable storage medium, and the processor executes the computer program code, so that the computer device performs the content recommendation method provided in the above-described various alternative implementations.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A method for recommending content, the method comprising:

2. The method of claim 1, wherein the determining a first scoring matrix according to the first prediction information comprises:

converting the plurality of probabilities in the first prediction information into a plurality of scores, the plurality of scores being positive integers;

determining a plurality of scoring intervals according to the plurality of scores, wherein one scoring interval corresponds to one grade;

and determining the first scoring matrix according to the grades of the scores.

3. The method of claim 1, wherein the populating the first scoring matrix based on a second model in response to the first scoring matrix having missing data comprises:

performing matrix decomposition on the first scoring matrix based on the second model in response to the first scoring matrix having missing data, so as to obtain a user scoring matrix for representing a user and a content scoring matrix for representing content;

and acquiring an inner product of the user scoring matrix and the content scoring matrix, and taking the inner product as the filled first scoring matrix.

4. The method according to claim 1, wherein the recommending at least one content to be recommended to the users according to the populated first scoring matrix comprises:

for any user, obtaining grades to which a plurality of scores corresponding to the user belong according to the filled first score matrix;

ranking according to the grades of the plurality of scores to determine at least one score;

5. The method according to any of claims 1 to 4, wherein the training step of the first model comprises:

training the first original model based on the training samples;

6. The method of claim 5, wherein prior to training the first raw model based on the training samples, the method further comprises:

acquiring a sample data set, wherein the sample data set comprises user characteristics, content characteristics and a user tag to which a sample user belongs;

and dividing the sample data set into the training samples and the test samples according to a target proportion.

7. The method of claim 5, wherein the step of training the second model comprises:

8. A content recommendation apparatus, characterized in that the apparatus comprises:

9. A computer device, characterized in that the computer device comprises a processor and a memory for storing at least one piece of program code, which is loaded by the processor and which performs the content recommendation method according to any one of claims 1 to 7.

10. A storage medium for storing at least one program code for performing the content recommendation method of any one of claims 1 to 7.