CN110162714B - Content pushing method, device, computing equipment and computer readable storage medium - Google Patents


Info

Publication number
CN110162714B
Authority
CN
China
Prior art keywords: content, data, user, matrix, known user
Prior art date
Legal status: Active (the status listed is an assumption, not a legal conclusion)
Application number
CN201910092267.0A
Other languages: Chinese (zh)
Other versions: CN110162714A
Inventor
叶娃
邹晓东
肖峥
刘智鹏
黎国鹏
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201910092267.0A
Publication of CN110162714A
Application granted
Publication of CN110162714B
Legal status: Active
Anticipated expiration


Abstract

The invention relates to a content pushing method, a content pushing apparatus, a computing device and a computer readable storage medium. The content pushing method comprises the following steps: acquiring personalized data and social relationship data of a potential user; constructing input data associated with the potential user based on the acquired personalized data and social relationship data; inputting the input data to at least one machine-learned classifier, wherein each of the at least one machine-learned classifier is configured to generate, based on the input data, output data indicating whether the potential user's preferences match target content; and selectively initiating pushing of the target content to a client of the potential user depending on the output data generated by the at least one machine-learned classifier. The method can improve the degree of matching between pushed content and audience preferences, making content delivery more targeted and efficient.

Description

Content pushing method, device, computing equipment and computer readable storage medium
Technical Field
The present invention relates to the field of machine learning technology, and in particular, to a content pushing method, a content pushing apparatus, a computing device, and a computer readable storage medium.
Background
Content pushing based on user attributes has found widespread use in the internet industry. For example, some websites provide a service called "Guess You Like": after a user clicks a link to an item of content (e.g., news, a video, merchandise) on a page of an application ("APP") or browser, links to other, similar content are presented alongside the page of that item.
Such services are typically based on machine learning techniques, in which a computer learns the user's preferences from big data and predicts likely results. However, existing content pushing methods have problems with accuracy and effectiveness, so that content pushed to a user may not be liked by the user. This results in low information delivery efficiency.
Disclosure of Invention
It would be advantageous to provide a mechanism that can increase the recognition rate of the user's preferences for the target content and thus increase the pertinence and accuracy of content pushing.
According to an aspect of the present invention, there is provided a content pushing method, including: acquiring personalized data and social relationship data of potential users; constructing input data associated with the potential user based on the acquired personalized data and social relationship data; inputting the input data to at least one machine-learned classifier, wherein each of the at least one machine-learned classifier is configured to generate output data indicating whether the potential user's preferences match target content based on the input data; and selectively initiating pushing of the target content to clients of the potential users depending on output data generated by the at least one machine-learned classifier.
In some embodiments, the selectively initiating pushing the targeted content to the client of the potential user comprises: pushing the target content to a client of the potential user in response to the output data generated by each of the at least one machine-learned classifier indicating that the potential user's preferences match the target content; and not pushing the target content to the client of the potential user in response to the output data generated by any of the at least one machine-learned classifier indicating that the preference of the potential user does not match the target content.
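The selective-push rule above is a unanimous vote: push only when every classifier reports a match, and let a single dissenting classifier veto the push. A minimal sketch of that decision rule, assuming each classifier's output data has already been reduced to a boolean "match" flag (the function name is illustrative, not from the patent):

```python
def should_push(classifier_outputs):
    # Unanimous-vote rule: push only if every classifier's output data
    # indicates the potential user's preferences match the target content.
    return all(classifier_outputs)

# A single "no match" result vetoes the push:
decision = should_push([True, False, True])  # False
```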
In some embodiments, each of the at least one machine-learned classifier is trained by: acquiring personalized data and social relationship data of each of a plurality of known users; constructing respective input data respectively associated with the plurality of known users based on the acquired respective personalized data and social relationship data; providing respective target output data associated with the plurality of known users, wherein the target output data associated with each known user indicates whether the preferences of that known user match the target content; and inputting the respective input data and the respective target output data into each machine learning classifier for training.
In some embodiments, the building respective input data associated with the plurality of known users, respectively, comprises: for each known user: generating a first matrix based on the personalized data of the known user; generating a second matrix based on the personalized data and social relationship data of the known user, wherein the second matrix describes respective similarities between a plurality of friends of the known user and the known user in terms of personal attributes and respective similarities between the plurality of friends and the known user in terms of personal preferences for n pieces of content, n being a natural number, and wherein the n pieces of content are different from the target content; physically combining the first matrix and the second matrix to form a third matrix; and reducing the dimensions of the third matrix to obtain a fourth matrix as input data associated with the known user.
In some embodiments, the personalized data for each known user includes: m attribute values describing respective ones of the m attributes of the known user; and n preference values indicating whether the preference of the known user matches the corresponding content of the n contents, m being a natural number. The social relationship data for each known user includes personalized data for each of a plurality of friends of the known user, the personalized data for each friend including: m attribute values describing a corresponding attribute of the m attributes of the friend; and n preference values indicating whether the friend's preferences match corresponding ones of the n content.
In some embodiments, the generating the first matrix based on the personalized data of the known user comprises: vectorizing the m attribute values of the personalized data of the known user into a row vector that serves as the first matrix.
In some embodiments, the generating the second matrix based on the personalized data and the social relationship data of the known user comprises: deriving respective similarities in personal attributes between the plurality of friends and the known user from respective m attribute values of the personalized data of the plurality of friends and the m attribute values of the personalized data of the known user; deriving respective similarities between the plurality of friends and the known user in terms of personal preferences for each of the n content from the respective n preference values of the personalized data of the plurality of friends and the n preference values of the personalized data of the known user; vectorizing the derived similarity between the plurality of friends and the known user in terms of personal attributes to a first column vector; vectorizing the derived respective similarity between the plurality of friends and the known user in terms of personal preferences for each of the n content into a respective second column vector; and concatenating the first column vector and each of the second column vectors in a row direction to form the second matrix.
In some embodiments, the deriving respective similarities in personal attributes between the plurality of friends and the known user comprises: comparing the m attribute values of the personalized data of each friend with corresponding ones of the m attribute values of the personalized data of the known user; counting the number of comparison results indicating equality; and determining the ratio of the number to m as a similarity between the friend and the known user in terms of personal attributes.
In some embodiments, deriving respective similarities between the plurality of friends and the known user in terms of personal preferences for each of the n content comprises: comparing the n preference values of the personalized data of each friend with corresponding ones of the n preference values of the personalized data of the known user; in response to the comparison indicating that the friend and the known user have the same preference for the same content of the n content, setting a similarity between the friend and the known user in terms of personal preferences for the content to a predetermined value; and in response to the comparison indicating that the friend and the known user have different preferences for the same content of the n content, setting a similarity between the friend and the known user in terms of personal preferences for the content to zero.
In some embodiments, the physically combining the first matrix and the second matrix comprises: the first matrix and the second matrix are concatenated in a row direction to form the third matrix.
In some embodiments, the reducing the dimension of the third matrix comprises: performing principal component analysis on the third matrix.
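Principal component analysis is a standard dimensionality-reduction technique. The following is a minimal numpy sketch, not the patent's implementation: it centers the matrix and projects it onto its top-k principal directions obtained from the singular value decomposition. The toy matrix is illustrative.

```python
import numpy as np

def pca_reduce(matrix, k):
    # Center the columns, then project onto the top-k right singular
    # vectors (the principal directions) to reduce to k dimensions.
    X = np.asarray(matrix, dtype=float)
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

# Toy 3x4 "third matrix"; reduce its 4 columns to 2.
third_matrix = [[1, 0, 1, 1],
                [0, 1, 1, 0],
                [1, 0, 0, 1]]
fourth_matrix = pca_reduce(third_matrix, 2)  # shape (3, 2)
```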
In some embodiments, the target output data for each known user includes a numerical value indicating whether the preference of the known user matches the target content.
In some embodiments, the building input data associated with the potential user comprises: generating a first matrix based on the personalized data of the potential user; generating a second matrix based on the personalized data and social relationship data of the potential user, wherein the second matrix describes respective similarities between a plurality of friends of the potential user and the potential user in terms of personal attributes and respective similarities between the plurality of friends and the potential user in terms of personal preferences for n pieces of content, n being a natural number, and wherein the n pieces of content are different from the target content; physically combining the first matrix and the second matrix to form a third matrix; and reducing the dimension of the third matrix to obtain a fourth matrix as input data associated with the potential user.
In some embodiments, the personalized data for the potential user includes: m attribute values describing respective ones of the m attributes of the potential user; and n preference values indicating whether the preference of the potential user matches the corresponding content of the n contents, m being a natural number. The social relationship data of the potential user includes personalized data for each of a plurality of friends of the potential user, the personalized data for each friend including: m attribute values describing a corresponding attribute of the m attributes of the friend; and n preference values indicating whether the friend's preferences match corresponding ones of the n content.
In some embodiments, the generating the first matrix based on the personalized data of the potential user comprises: vectorizing the m attribute values of the personalized data of the potential user into a row vector that serves as the first matrix.
In some embodiments, the generating the second matrix based on the personalized data and the social relationship data of the potential user comprises: deriving respective similarities in personal attributes between the plurality of friends and the potential user from respective m attribute values of the personalized data of the plurality of friends and m attribute values of the personalized data of the potential user; deriving respective similarities between the plurality of friends and the potential user in terms of personal preferences for each of the n content from the respective n preference values of the personalized data of the plurality of friends and the n preference values of the personalized data of the potential user; vectorizing the derived similarity between the plurality of friends and the potential user in terms of personal attributes to a first column vector; vectorizing the derived respective similarity between the plurality of friends and the potential user in terms of personal preferences for each of the n content into a respective second column vector; and concatenating the first column vector and each of the second column vectors in a row direction to form the second matrix.
In some embodiments, the deriving respective similarities in personal attributes between the plurality of friends and the potential user comprises: comparing the m attribute values of the personalized data of each friend with corresponding values of the m attribute values of the personalized data of the potential user; counting the number of comparison results indicating equality; and determining the ratio of the number to m as a similarity in personal attribute between the friend and the potential user.
In some embodiments, the deriving respective similarities between the plurality of friends and the potential user in terms of personal preferences for each of the n content comprises: comparing the n preference values of the personalized data of each friend with corresponding ones of the n preference values of the personalized data of the potential user; in response to the comparison indicating that the friend and the potential user have the same preference for the same content of the n content, setting a similarity between the friend and the potential user in terms of personal preferences for the content to a predetermined value; and in response to the comparison indicating that the friend and the potential user have different preferences for the same content of the n content, setting a similarity between the friend and the potential user in terms of personal preferences for the content to zero.
In some embodiments, the physically combining the first matrix and the second matrix comprises: the first matrix and the second matrix are concatenated in a row direction to form the third matrix.
In some embodiments, the reducing the dimension of the third matrix comprises: performing principal component analysis on the third matrix.
According to another aspect of the present invention, there is provided a content pushing apparatus including: means for obtaining personalized data and social relationship data for potential users; means for constructing input data associated with the potential user based on the acquired personalized data and social relationship data; means for inputting the input data to at least one machine-learned classifier, wherein each of the at least one machine-learned classifier is configured to generate output data indicating whether the preferences of the potential user match target content based on the input data; and means for selectively initiating pushing of the target content to the client of the potential user depending on output data generated by the at least one machine-learned classifier.
According to another aspect of the present invention, there is provided a content pushing apparatus including: the acquisition module is configured to acquire personalized data and social relationship data of potential users; a building module configured to build input data associated with the potential user based on the acquired personalized data and social relationship data; an input module configured to input the input data to at least one machine-learned classifier, wherein each of the at least one machine-learned classifier is configured to generate output data indicating whether the preferences of the potential user match target content based on the input data; and a pushing module configured to selectively push the target content to a client of the potential user depending on output data generated by the at least one machine learning classifier.
According to another aspect of the present invention there is provided a computing device comprising a memory and a processor, the memory being configured to store thereon computer program instructions that, when executed on the processor, cause the processor to perform the method described in the first aspect.
According to another aspect of the present invention there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed on a processor, cause the processor to perform the method described in the first aspect.
These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
Drawings
Further details, features and advantages of the invention are disclosed in the following description of exemplary embodiments with reference to the following drawings, in which:
FIG. 1 illustrates a flowchart of operations in a training phase for a machine learning classifier according to an embodiment of the present invention;
FIG. 2A illustrates an example of various dimensions describing basic attributes of a user according to an embodiment of the present invention;
FIG. 2B illustrates an example of various dimensions describing a user's interest attributes according to an embodiment of the present invention;
FIG. 2C shows an illustrative user interface reflecting whether a user and his friends like a certain content;
FIG. 3 illustrates an example process of constructing input data in the operations of FIG. 1;
FIG. 4 shows a schematic and exemplary illustration of a content pushing method according to an embodiment of the invention;
FIG. 5 shows a schematic and exemplary illustration of a user interface at a client of a potential user being pushed target content;
FIG. 6 shows a schematic block diagram of a content pushing apparatus according to an embodiment of the application; and
FIG. 7 illustrates an example system including an example computing device that represents one or more systems and/or devices that can implement the various techniques described herein.
Detailed Description
The inventors of the present application have recognized that the way the input data of a machine-learned classifier is constructed has a great impact on classification accuracy. Specifically, by introducing the attributes and preferences of the user's friends in a Social Networking Service (SNS), more dimensions can be provided for determining whether the user's preferences match the content to be pushed, thereby improving the classification accuracy of the classifier. In addition, making the push decision by a vote among multiple machine learning classifiers can further improve the degree of matching between pushed content and the user's preferences. Based on this insight, a solution is proposed, which will be described in detail below.
Before describing in detail embodiments of the present application, several terms used herein are defined.
1. Machine learning classifier. The term "machine learning classifier" refers to a machine learning model designed to solve classification problems. Specifically, a classifier can be trained with class-labeled training samples and then used to judge the class to which a new observation sample belongs. Examples of machine learning classifiers include, but are not limited to: neural networks (e.g., convolutional neural networks (CNNs), recurrent neural networks (RNNs), etc.) and Support Vector Machines (SVMs).
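For illustration only, here is a toy classifier in the spirit of this definition: a perceptron trained on class-labeled samples and then used to classify observations. It is a generic sketch, not a classifier from the patent, and the toy data (logical OR) is an assumption chosen to be linearly separable.

```python
def train_perceptron(samples, labels, epochs=20):
    # Train a perceptron: adjust weights whenever a sample is misclassified.
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            if pred != y:  # perceptron update rule
                w = [wi + (y - pred) * xi for wi, xi in zip(w, x)]
                b += y - pred
                errors += 1
        if errors == 0:  # converged: every training sample classified correctly
            break
    return w, b

# Class-labeled training samples (logical OR, linearly separable).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 1]
w, b = train_perceptron(X, y)
preds = [1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0 for x in X]
```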
2. Content. The term "content" may generally refer to information, such as video, audio, pictures, text, etc., distributed via a communication infrastructure, such as the internet, an intranet, a telecommunications network, or the like. The content may describe specific objects such as news events, program trailers, movie listings, merchandise introductions, promotional campaigns, coupons, and the like.
3. Target content. The term "target content" refers to content to be pushed to a user.
4. Personalized data. The term "personalized data" refers to data associated with a user that describes the user's personal attributes and the user's personal preferences for one or more pieces of content.
5. Social relationship data. The term "social relationship data" is data associated with a user that includes personalized data for one or more friends of the user in a social networking service.
6. Known user. The term "known user" refers to a user for whom it is known to the machine learning classifier whether their preferences match the target content. The personalized data and social relationship data of known users are the basis for constructing the training data of the machine learning classifier.
7. Potential user. The term "potential user" refers to a user for whom it is not known to the machine learning classifier whether their preferences match the target content. To the machine learning classifier, a potential user is a new observation sample.
8. Client. The term "client" may refer to various different types of devices, such as a desktop, server, notebook or netbook computer, mobile device (e.g., tablet or phablet device, cellular or other wireless telephone (e.g., smart phone), notepad computer, mobile station), wearable device (e.g., glasses, watch), entertainment device (e.g., entertainment appliance, set-top box communicatively coupled to a display device, game console), television or other display device, automobile computer, and so forth, or an application running on any of these devices.
The present invention proposes to utilize a machine-learning classifier to predict whether a potential user's preferences match target content and thereby selectively push the target content to the potential user. The use of machine learning classifiers includes a training phase and an execution phase, which may be independent in time and space. The training phase involves training the classifier with the labeled class of training samples, and the performing phase involves classifying the new observation samples with the trained classifier. To better understand how the machine-learned classifier is used for content pushing in the execution phase, the training phase of the machine-learned classifier is first described below.
FIG. 1 illustrates a flow chart of a method 100 of training a machine learning classifier in accordance with an embodiment of the present invention.
At step 110, personalized data and social relationship data for each of a plurality of known users is obtained.
The personalized data for each known user may include:
-m attribute values describing respective ones of the m attributes of the known user; and
-n preference values indicating whether the preferences of the known user match the corresponding content of the n contents. m and n are natural numbers.
The m attribute values describe respective ones of the m attributes of the user. By way of example and not limitation, the m attributes may include one or more of the following:
- location, e.g., city, administrative area, common geographic location over a period of time (e.g., a month), etc.;
- gender;
- age, e.g., specific age, age group, etc.;
- cell phone parameters, e.g., operating system type (e.g., iOS or Android), make, model, supported network type, etc.;
- marital status, e.g., whether married, whether there are children, etc.;
- education level, e.g., doctorate, master's, university, etc.;
- topics of interest, e.g., housing, cars, education, sports, health, games, etc.
Some of the personal attributes may be collected when a user registers with a website or social APP. Alternatively or additionally, some of them may be counted and recorded while the user uses the browser or social APP. There are various ways to obtain these personal attributes. Fig. 2A shows an example of the basic profile data of a WeChat user, and Fig. 2B shows an example of the interest tag data of a WeChat user. In the case of WeChat, basic personal attributes such as location, gender, cell phone parameters, marital status, and education level may be obtained from the user's basic profile data, and topics of interest may be obtained from the user's various interest tags.
In some embodiments, these personal attributes may already be saved in a database, and obtaining them involves retrieving data from some storage location (e.g., a local or remote database). In the embodiments described below, the m attribute values are available as, for example, integer values. For example, for gender, male is 1 and female is 0; for cell phone parameters, Android is 1 and iOS is 0; for marital status, unmarried is 0, married without children is 1, and married with children is 2; for education level, university and above is 1 and below university is 0; for a topic of interest, interested is 1 and not interested is 0; and so on.
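Such an integer encoding can be sketched as follows. The map names and category labels are illustrative assumptions consistent with the examples just given, not identifiers from the patent.

```python
# Hypothetical encoding maps (illustrative names, not from the patent).
GENDER = {"male": 1, "female": 0}
PHONE_OS = {"android": 1, "ios": 0}
MARITAL = {"unmarried": 0, "married_no_children": 1, "married_with_children": 2}
EDUCATION = {"university_or_above": 1, "below_university": 0}

def encode_attributes(gender, phone_os, marital, education, topic_flags):
    # topic_flags: one 0/1 flag per topic of interest (1 = interested).
    return [GENDER[gender], PHONE_OS[phone_os], MARITAL[marital],
            EDUCATION[education], *topic_flags]

# Reproduces user A's 8 attribute values from Table 1-1 below.
row = encode_attributes("male", "ios", "married_no_children",
                        "university_or_above", [0, 1, 1, 0])
```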
The n preference values indicate whether the user's preference matches a corresponding content of the n contents. Similar to the personal attributes of the user, the user's personal preferences for certain content may be collected when the user registers with a website or social APP. Alternatively or additionally, it may be counted and recorded during the user's use of the browser or social APP.
Still taking WeChat as an example, FIG. 2C shows an illustrative user interface for a WeChat user (hereinafter "user A") who shares a positive rating of the song "Salina". This indicates that user A likes this content (i.e., the song "Salina").
In embodiments, the n preference values may already be available as, for example, binary values. For example, for content x (x being a natural number with x ≤ n), a value of "1" indicates that the user's preference matches content x, and a value of "0" indicates that it does not.
The social relationship data for each known user may include personalized data for each of a plurality of friends of the known user. Similarly, the personalized data for each friend includes:
-m attribute values describing respective ones of the m attributes of the friend; and
-n preference values indicating whether the friend's preferences match corresponding ones of the n content.
The personalized data of a friend may be obtained in the same manner as the personalized data of the known user described above. For example, in the user interface shown in FIG. 2C, friends B, C and D of user A comment on the song "Salina" shared by user A: friends B and C give positive ratings indicating that they like this content, and friend D gives a negative rating indicating that he does not. Thus, the respective preference values of friends B, C and D for this content (the song "Salina") can be collected and recorded. The attribute values of friends may be obtained in the same manner as the attribute values of known users described above, and are not described in detail for brevity.
In an embodiment, the social relationships of a user may be obtained from a server of the social APP and its associated database. For example, in the case of WeChat, the server maintains a social relationship chain for each user, from which multiple friends of the user may be determined. As another example, in the case of Weibo™, the server also maintains for each user a social relationship chain that links together the user, the other users the user follows, and the other users who follow the user. From such a social relationship chain, the friends of a particular user in a social network may be located, and in turn the personalized data of those friends may be collected or retrieved from a database.
At step 120, respective input data associated with the plurality of known users, respectively, is constructed based on the acquired personalized data and social relationship data for each of the plurality of known users. Fig. 3 shows an example process of step 120. This process is performed for each of the plurality of known users.
Referring to FIG. 3, at step 121, a first matrix is generated based on the personalized data of the known user. In some embodiments, the m attribute values in the personalized data of the known user are vectorized into a row vector as the first matrix. For example, assuming that the above-mentioned user A has the 8 attribute values shown in Table 1-1 below, the first matrix may be expressed as [1 0 1 1 0 1 1 0]. Such a first matrix characterizes user A from the personal dimension.
TABLE 1-1
| Gender | Phone parameters | Marital status | Education level | Topic 1 | Topic 2 | Topic 3 | Topic 4 |
| 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 |
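The vectorization of Table 1-1 into the first matrix can be sketched in plain Python (illustrative):

```python
# User A's 8 attribute values from Table 1-1 (gender, phone parameters,
# marital status, education level, topics 1-4), vectorized into a
# 1 x m row vector (m = 8) that serves as the first matrix.
attrs_a = [1, 0, 1, 1, 0, 1, 1, 0]
first_matrix = [attrs_a]  # one row, m = 8 columns
```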
At step 122, a second matrix is generated based on the personalized data and social relationship data of the known user, wherein the second matrix describes respective similarities in personal attributes between a plurality of friends of the known user and the known user, and respective similarities in personal preferences for the n pieces of content between the plurality of friends and the known user. In an embodiment, this may be performed by the following operations (i) to (v).
(i) Deriving respective similarities in personal attributes between the plurality of friends and the known user from the respective m attribute values of the personalized data of the plurality of friends and the m attribute values of the personalized data of the known user. Still taking the above-mentioned user A as an example, assume that user A has four friends B, C, D and E, whose attributes are shown in Table 1-2 below.
TABLE 1-2
| | Gender | Phone parameters | Marital status | Education level | Topic 1 | Topic 2 | Topic 3 | Topic 4 |
| Friend B | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 |
| Friend C | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| Friend D | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| Friend E | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 |
In some embodiments, the respective similarities in personal attributes between friends B, C, D and E and user A may be measured by the ratio of the number of identical attribute values between each friend and user A to the total number of personal attribute values m (8 in this example). For example, for friend B, the 8 attribute values "01100001" of friend B are compared one by one with the 8 attribute values "10110110" of user A, and the number of equal attribute values is counted, which is 2. The similarity between friend B and user A in terms of personal attributes is therefore 2/8 = 0.25. In the same manner, the respective similarities between friends C, D and E and user A in terms of personal attributes may be measured as 0.5, 0.75 and 1, respectively.
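The ratio-of-equal-attributes similarity just described can be sketched as follows, using the values from Tables 1-1 and 1-2 (function and variable names are illustrative):

```python
def attribute_similarity(user_attrs, friend_attrs):
    # Ratio of equal attribute values to the total number m of attributes.
    matches = sum(u == f for u, f in zip(user_attrs, friend_attrs))
    return matches / len(user_attrs)

user_a = [1, 0, 1, 1, 0, 1, 1, 0]   # Table 1-1
friends = {                          # Table 1-2
    "B": [0, 1, 1, 0, 0, 0, 0, 1],
    "C": [1, 0, 0, 0, 0, 1, 0, 1],
    "D": [1, 0, 1, 1, 0, 0, 1, 1],
    "E": [1, 0, 1, 1, 0, 1, 1, 0],
}
sims = {name: attribute_similarity(user_a, a) for name, a in friends.items()}
# sims == {"B": 0.25, "C": 0.5, "D": 0.75, "E": 1.0}
```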
(ii) Deriving the respective similarities between the plurality of friends and the known user in terms of personal preferences for each of the n pieces of content from the respective n preference values in the personalized data of the friends and the n preference values in the personalized data of the known user. Continuing with the example of user A, assume that user A and his four friends B, C, D and E have the preferences (1 indicates like and 0 indicates dislike) for the n pieces of content (n = 5 in this example) shown in Table 1-3 below:
TABLE 1-3
Content 1 Content 2 Content 3 Content 4 Content 5
User A 1 0 1 0 1
Friend B 1 1 0 1 0
Friend C 1 0 1 0 1
Friend D 1 1 1 1 1
Friend E 1 0 1 1 0
Specifically, the n preference values (n = 5 in this example) in the personalized data of each of friends B, C, D and E are compared with the corresponding preference values in the personalized data of user A. For example, for friend B, the 5 preference values "11010" indicating whether friend B likes contents 1 to 5 are compared with the 5 preference values "10101" indicating whether user A likes contents 1 to 5, respectively.
If the comparison indicates that the friend and user A have the same preference for a given one of the 5 contents, the similarity between the friend and user A in terms of personal preference for that content is set to a predetermined value, such as 1. For example, for friend B, since both friend B and user A have the preference value 1 for content 1, the similarity between friend B and user A in terms of personal preference for content 1 may be set to 1.
If the comparison indicates that the friend and user A have different preferences for a given one of the 5 contents, the similarity between the friend and user A in terms of personal preference for that content is set to zero. For example, for friend B, since friend B likes contents 2 and 4 while user A dislikes them, the similarity between friend B and user A in terms of personal preference for contents 2 and 4 may both be set to 0. Likewise, since friend B dislikes contents 3 and 5 while user A likes them, the similarity between friend B and user A in terms of personal preference for contents 3 and 5 may be set to 0.
In the same manner, respective similarities between friends C, D and E and user a in terms of personal preferences for content 1 to 5 can be obtained.
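The per-content preference similarity described above can be sketched as follows (a minimal illustration with a hypothetical function name; the preference values are those of user A and friend B from Table 1-3):

```python
def preference_similarities(friend_prefs, user_prefs):
    """1 where the two users have the same preference value for a content, else 0."""
    return [1 if f == u else 0 for f, u in zip(friend_prefs, user_prefs)]

user_A   = [1, 0, 1, 0, 1]  # preferences for contents 1..5
friend_B = [1, 1, 0, 1, 0]
print(preference_similarities(friend_B, user_A))  # [1, 0, 0, 0, 0]
```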
(iii) Vectorizing the derived respective similarities in personal attributes between the plurality of friends and the known user into a first column vector. In the example of user A, since the respective similarities between friends B, C, D and E and user A in terms of personal attributes are 0.25, 0.5, 0.75 and 1, respectively, the following first column vector is obtained (written here as a transposed row vector):

(0.25, 0.5, 0.75, 1)ᵀ
(iv) Vectorizing the derived respective similarities between the plurality of friends and the known user in terms of personal preferences for each of the n pieces of content into respective second column vectors. In the example of user A, the respective similarities between friends B, C, D and E and user A in terms of personal preferences for content 1 are 1, 1, 1 and 1; for content 2 they are 0, 1, 0 and 1; for content 3 they are 0, 1, 1 and 1; for content 4 they are 0, 1, 0 and 0; and for content 5 they are 0, 1, 1 and 0. Thus, the following 5 second column vectors are obtained:

(1, 1, 1, 1)ᵀ, (0, 1, 0, 1)ᵀ, (0, 1, 1, 1)ᵀ, (0, 1, 0, 0)ᵀ, (0, 1, 1, 0)ᵀ
(v) Concatenating the first column vector and each of the second column vectors in the row direction to form the second matrix. In the example of user A, the first column vector and the 5 second column vectors are concatenated to obtain the second matrix shown below, in which each row corresponds to one of friends B, C, D and E, the first column holds the attribute similarities, and the remaining five columns hold the per-content preference similarities:

0.25  1  0  0  0  0
0.5   1  1  1  1  1
0.75  1  0  1  0  1
1     1  1  1  0  0

Such a second matrix characterizes user A from the social relationship dimension.
It will be appreciated that operations (i) to (v) need not be performed in the order described above. For example, operation (iii) may be performed immediately after operation (i), and operation (iv) may be performed immediately after operation (ii). For another example, operation (ii) may be performed prior to operation (i).
At step 123, the first matrix and the second matrix are physically combined to form a third matrix. The term "physical merger" is to be understood as concatenation, as opposed to a mathematical merging operation; it preserves the detailed features about similarities in personal attributes and similarities in personal preferences for content. In some embodiments, the first matrix and the second matrix are concatenated in the row direction to form the third matrix. In the example of user A, concatenating the first matrix (the row of user A's own attribute values, 1 0 1 1 0 1 1 0) with the second matrix results in the following third matrix:

1     0  1  1  0  1  1  0
0.25  1  0  0  0  0  0  0
0.5   1  1  1  1  1  0  0
0.75  1  0  1  0  1  0  0
1     1  1  1  0  0  0  0

wherein the blank positions (the trailing columns of the rows taken from the second matrix) are filled with 0. Such a third matrix characterizes user A from both the personal dimension and the social relationship dimension.
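The concatenation and zero-padding of steps 122 and 123 can be sketched with NumPy as follows (a sketch under the assumption that the first matrix is the row of user A's attribute values, as in the execution-phase example later in the text; the similarity values are derived from Tables 1-2 and 1-3):

```python
import numpy as np

first = np.array([[1, 0, 1, 1, 0, 1, 1, 0]])  # user A's attribute values

# Second matrix: one row per friend; column 1 holds the attribute similarity,
# columns 2..6 hold the per-content preference similarities.
second = np.array([
    [0.25, 1, 0, 0, 0, 0],  # friend B
    [0.50, 1, 1, 1, 1, 1],  # friend C
    [0.75, 1, 0, 1, 0, 1],  # friend D
    [1.00, 1, 1, 1, 0, 0],  # friend E
])

# Pad the second matrix with zeros on the right so both matrices have the
# same number of columns, then concatenate the rows ("physical merger").
padded = np.zeros((second.shape[0], first.shape[1]))
padded[:, :second.shape[1]] = second
third = np.vstack([first, padded])
print(third.shape)  # (5, 8)
```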
At step 124, the third matrix is reduced in dimension to obtain a fourth matrix as input data associated with the known user.
Step 124 aims to mitigate the sparsity of the third matrix caused by filling the blank positions with 0, and to reduce computational complexity. In some embodiments, the reduction in dimension may be achieved by Principal Component Analysis (PCA) of the third matrix. In other embodiments, other data dimension reduction methods may be employed. Through a linear transformation, PCA maps the original data onto a set of linearly independent components, thereby extracting the main feature components of the data. After the PCA dimension reduction operation, the fourth matrix has the same number of columns as the third matrix, but a reduced number of rows, e.g., 1 row or 2 rows. Example code for the PCA dimension reduction operation is shown below (note that scikit-learn's PCA reduces the number of feature columns, so the third matrix is transposed before fitting in order to reduce its rows instead):

    import numpy as np
    from sklearn.decomposition import PCA

    a = np.array(third_matrix)   # the third matrix from step 123
    pca = PCA(n_components=1)    # reduce to one row
    pca.fit(a.T)
    print(pca.explained_variance_ratio_)
    print(pca.transform(a.T).T)  # output the fourth matrix
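A concrete, runnable version of the pseudocode above, using a third matrix for user A reconstructed from the example tables (the exact matrix values are illustrative assumptions, not stated verbatim in the text):

```python
import numpy as np
from sklearn.decomposition import PCA

third = np.array([
    [1.00, 0, 1, 1, 0, 1, 1, 0],
    [0.25, 1, 0, 0, 0, 0, 0, 0],
    [0.50, 1, 1, 1, 1, 1, 0, 0],
    [0.75, 1, 0, 1, 0, 1, 0, 0],
    [1.00, 1, 1, 1, 0, 0, 0, 0],
])

# Transpose so each of the 8 columns becomes a sample and each of the
# 5 rows a feature; projecting onto 1 component then reduces the rows.
pca = PCA(n_components=1)
fourth = pca.fit_transform(third.T).T
print(fourth.shape)  # (1, 8)
```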
Thus, for each of the plurality of known users, a corresponding fourth matrix may be obtained as input data. In this way, the respective input data associated with the plurality of known users are constructed. These input data constitute the training data set of the machine-learned classifier.
For the purpose of training a machine learning classifier, it is also necessary to provide the classifier with target output data. The target output data is the "answer" to the classification question. In this context, the classifier is trained for target content to be pushed that is different from content 1 to n. Thus, for each known user, the target output data is the state of whether the preference of the known user matches the target content.
Referring back to FIG. 1, at step 130, respective target output data associated with the plurality of known users is provided. In some embodiments, for each of the plurality of known users, a value is provided as output data associated with the known user that indicates whether the user's preferences match the target content. For example, if the user likes the target content, the target output data is set to 1; if the user does not like the target content, the target output data is set to 0. In some embodiments, target output data indicating whether the preferences of a known user match the target content may be obtained in the same manner as described above with respect to the preference values in the personalized data.
In this way, for each known user, the corresponding training data (i.e., fourth matrix) and the corresponding target output data (i.e., answers to the classification questions) for the machine-learned classifier are obtained.
At step 140, the respective input data and the respective target output data are input to the machine learning classifier for training.
In the case of a convolutional neural network, the input data in the form of the fourth matrix may be fed directly into the network, since convolutional neural networks are well suited to processing two-dimensional matrices directly; they are widely used in image processing, where a two-dimensional matrix of pixel data is likewise fed directly into the network. In the case of a recurrent neural network, the data in the fourth matrix may be read row by row, with each row provided to a respective input node in the input layer of the recurrent neural network. In the case of a support vector machine, the data in the fourth matrix may be read row by row and concatenated in sequence to form one input vector; the support vector machine is trained to construct a hyperplane that classifies the input vectors (likes or dislikes the target content). For other machine learning models, the mapping of input data to the model may be configured as appropriate and is not described in detail here for brevity.
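The three input mappings described above can be sketched as follows (a hedged illustration with a toy fourth matrix; the exact input shapes expected by a given framework may differ):

```python
import numpy as np

fourth = np.array([[0.3, 0.1, 0.5],
                   [0.2, 0.4, 0.0]])  # toy 2-row "fourth matrix"

# CNN: feed the 2-D matrix directly (here with a leading batch axis).
cnn_input = fourth[np.newaxis, :, :]

# RNN: read the matrix row by row, one row per input node / time step.
rnn_steps = [row for row in fourth]

# SVM: read row by row and concatenate into a single input vector.
svm_input = fourth.reshape(-1)
print(svm_input.tolist())  # [0.3, 0.1, 0.5, 0.2, 0.4, 0.0]
```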
Although in the above embodiment, step 130 is described as being performed in series with steps 110 and 120, the present invention is not limited thereto. For example, step 130 may be performed prior to step 120 or in parallel with step 120.
Using the training data set and the corresponding target output data constructed as described above, the machine-learned classifier can be trained to mine the implicit associations between a user's attributes and social relationships on the one hand and whether the user's preferences match the target content on the other. Because not only the user's own attributes and preferences but also those of the user's friends are taken into account, the machine-learned classifier is expected to learn richer decision logic and thus achieve higher classification accuracy. The trained machine-learned classifier can then be used to predict whether potential users (i.e., new observation samples) other than the known users like the target content.
Once trained, the trained machine-learned classifier can be utilized to perform pushing of target content to potential users. In one specific example, assume that the machine learning classifier has been trained on target content to be pushed in accordance with the operation of the training phase described above, where content 1 through n described above are n coupons (coupons 1, 2, 3, …, n) of different categories of goods or services, respectively, and the target content to be pushed is a coupon of a certain newly-launched movie XYZ. A content pushing method based on at least one machine learning classifier according to an embodiment of the present invention is described below with reference to fig. 4 and 5.
Fig. 4 shows a schematic and exemplary illustration of a content pushing method 400 according to an embodiment of the invention.
At step 410, personalized data and social relationship data for a potential user (also referred to as "user a") is obtained. Step 410 corresponds to step 110 in the training phase described above.
As in the training phase, the personalized data of potential user a includes:
-m attribute values describing respective ones of the m attributes of the potential user a; and
-n preference values indicating whether the preferences of the potential user a match corresponding content of n contents.
Let us assume that potential user a has 8 attribute values (m=8) in table 2-1 and 5 preference values (n=5) in table 2-2.
TABLE 2-1
Sex Mobile phone parameters Marital status Education level Topic 1 Topic 2 Topic 3 Topic 4
1 0 1 1 0 1 1 0
TABLE 2-2
Content 1 (coupon 1) Content 2 (coupon 2) Content 3 (coupon 3) Content 4 (coupon 4) Content 5 (coupon 5)
Potential user a 1 0 1 0 1
Likewise, the social relationship data of potential user a includes personalized data for each of a plurality of friends of potential user a. The personalized data for each friend includes:
-m attribute values describing respective ones of the m attributes of the friend; and
-n preference values indicating whether the friend's preferences match corresponding ones of the n content.
Assume that four friends b, c, d, and e of potential user a have attribute values in tables 2-3 and preference values in tables 2-4 (m=8, n=5 in this example).
TABLE 2-3
Sex Mobile phone parameters Marital status Education level Topic 1 Topic 2 Topic 3 Topic 4
Friend b 0 1 0 1 0 1 0 1
Friend c 1 1 0 0 1 1 0 1
Friend d 1 0 1 1 0 1 1 1
Friend e 1 0 0 1 0 1 0 0
TABLE 2-4
Content 1 (coupon 1) Content 2 (coupon 2) Content 3 (coupon 3) Content 4 (coupon 4) Content 5 (coupon 5)
Friend b 1 0 0 1 0
Friend c 0 0 1 0 1
Friend d 1 1 0 1 1
Friend e 1 1 1 1 0
At step 420, input data associated with the potential user a is constructed based on the acquired personalized data and social relationship data for the potential user a. Step 420 corresponds to step 120 in the training phase described above and is described herein with reference to step 120. Step 420 may include the following operations (1) to (4).
(1) A first matrix is generated based on the personalized data of potential user a. According to the attribute values in Table 2-1, the first matrix associated with potential user a may be expressed as

1  0  1  1  0  1  1  0
(2) A second matrix is generated based on the personalized data and social relationship data for the potential user a. This can be achieved by the operations (2-1) to (2-5).
(2-1) deriving respective similarities in personal attribute between the plurality of friends b, c, d and e and the potential user a from the respective m attribute values of the personalized data of the plurality of friends b, c, d and e and the m attribute values of the personalized data of the potential user a. Specifically, the attribute values of each of friends b, c, d, and e are compared with corresponding ones of the attribute values of potential user a, respectively. The number of equal attribute values between the friend and the potential user a is then counted. The ratio of the number of equal attribute values to m (m=8 in this example) is determined as the similarity between the friend and potential user a in terms of personal attributes. Thus, the respective similarities in personal attribute between friends b, c, d and e and potential user a can be measured as 0.375, 0.25, 0.875 and 0.75, respectively.
(2-2) deriving respective similarities between the plurality of friends b, c, d and e and the potential user a in terms of personal preferences for each of the n contents from the respective n preference values of the personalized data of the plurality of friends b, c, d and e and the n preference values of the personalized data of the potential user a. Specifically, 5 preference values in the personalized data for each of the friends b, c, d, and e are compared to corresponding ones of the 5 preference values in the personalized data for the potential user a. If the comparison indicates that the friend and the potential user a have the same preference for the same content of the 5 contents, the similarity between the friend and the potential user a in terms of personal preference for the content is set to a predetermined value, for example 1. If the comparison indicates that the friend and the potential user a have different preferences for the same content of the 5 content, the similarity between the friend and the potential user a in terms of personal preferences for the content is set to zero.
(2-3) Vectorizing the derived respective similarities in personal attributes between the plurality of friends b, c, d and e and the potential user a into a first column vector. According to (2-1), since the respective similarities between friends b, c, d and e and potential user a in terms of personal attributes are 0.375, 0.25, 0.875 and 0.75, respectively, the following first column vector is obtained:

(0.375, 0.25, 0.875, 0.75)ᵀ
(2-4) Vectorizing the derived respective similarities between the plurality of friends b, c, d and e and the potential user a in terms of personal preferences for each of the n pieces of content into corresponding second column vectors. According to (2-2), the respective similarities between friends b, c, d and e and potential user a in terms of personal preferences for content 1 are 1, 0, 1 and 1; for content 2 they are 1, 1, 0 and 0; for content 3 they are 0, 1, 0 and 1; for content 4 they are 0, 1, 0 and 0; and for content 5 they are 0, 1, 1 and 0. Thus, the following 5 second column vectors are obtained:

(1, 0, 1, 1)ᵀ, (1, 1, 0, 0)ᵀ, (0, 1, 0, 1)ᵀ, (0, 1, 0, 0)ᵀ, (0, 1, 1, 0)ᵀ
(2-5) Concatenating the first column vector and each of the second column vectors in the row direction to form the second matrix. For potential user a, the first column vector and the 5 second column vectors are concatenated to obtain the second matrix shown below:

0.375  1  1  0  0  0
0.25   0  1  1  1  1
0.875  1  0  0  0  1
0.75   1  0  1  0  0
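As a cross-check, the second matrix for potential user a can be computed directly from the values in Tables 2-3 and 2-4 (a minimal sketch; the variable names are hypothetical):

```python
import numpy as np

attrs_a = [1, 0, 1, 1, 0, 1, 1, 0]   # Table 2-1
prefs_a = [1, 0, 1, 0, 1]            # Table 2-2
friends = {                          # (attribute values, preference values)
    "b": ([0, 1, 0, 1, 0, 1, 0, 1], [1, 0, 0, 1, 0]),
    "c": ([1, 1, 0, 0, 1, 1, 0, 1], [0, 0, 1, 0, 1]),
    "d": ([1, 0, 1, 1, 0, 1, 1, 1], [1, 1, 0, 1, 1]),
    "e": ([1, 0, 0, 1, 0, 1, 0, 0], [1, 1, 1, 1, 0]),
}

rows = []
for attrs, prefs in friends.values():
    attr_sim = sum(f == u for f, u in zip(attrs, attrs_a)) / len(attrs_a)
    pref_sims = [1 if f == u else 0 for f, u in zip(prefs, prefs_a)]
    rows.append([attr_sim] + pref_sims)
second = np.array(rows)
print(second[:, 0].tolist())  # [0.375, 0.25, 0.875, 0.75]
```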
It will be appreciated that operations (2-1) through (2-5) need not be performed in the order described above. For example, operation (2-3) may be performed immediately after operation (2-1), and operation (2-4) may be performed immediately after operation (2-2). As another example, operation (2-2) may be performed prior to operation (2-1).
(3) The first matrix and the second matrix are physically combined to form a third matrix. In this embodiment, the first matrix and the second matrix are concatenated in the row direction to obtain the following third matrix:

1      0  1  1  0  1  1  0
0.375  1  1  0  0  0  0  0
0.25   0  1  1  1  1  0  0
0.875  1  0  0  0  1  0  0
0.75   1  0  1  0  0  0  0

wherein the blank positions are filled with 0.
(4) The dimensions of the third matrix are reduced to obtain a fourth matrix as input data associated with the potential user a.
At step 430, the input data associated with potential user a is input to the at least one machine-learned classifier. Although three classifiers in the form of neural networks are shown in the example of Fig. 4, this is merely exemplary and illustrative. In other embodiments, more or fewer (e.g., one) classifiers, or other types of classifiers, may be used.
The method 400 involves an execution phase of a trained machine learning classifier, rather than a training phase, and therefore, no target output data is required at step 430. Instead, the machine learning classifier will generate output data based on the input data indicating whether the preferences of the potential user a match the target content (coupons for the movie XYZ). For example, output data of 1 or a value close to 1 indicates that the potential user a likes a coupon of the movie XYZ, and output data of 0 or a value close to 0 indicates that the potential user a does not like a coupon of the movie XYZ.
At step 440, pushing of coupons for the movie XYZ to the client of potential user a is selectively initiated depending on the output data generated by the at least one machine-learned classifier. Specifically, if the output data generated by every one of the at least one machine-learned classifier indicates that potential user a likes coupons for movie XYZ, the coupon for movie XYZ is pushed to the client of potential user a; if the output data generated by any of the at least one machine-learned classifier indicates that potential user a does not like coupons for movie XYZ, the coupon is not pushed. It will be appreciated that while in Fig. 4 the output of each classifier is provided to a "multiplier", this is merely exemplary and illustrative. In other embodiments, other logic may be employed to determine whether the output data of each machine-learned classifier indicates that the preferences of potential user a match the target content.
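The unanimous "multiplier" decision described above can be sketched as follows (a sketch only; the 0.5 threshold for binarizing classifier outputs is an assumption, not stated in the text):

```python
def should_push(classifier_outputs, threshold=0.5):
    """Push only if every classifier's output indicates a match: the product
    of the binarized outputs is 1 only when all exceed the threshold."""
    product = 1
    for out in classifier_outputs:
        product *= 1 if out > threshold else 0
    return product == 1

print(should_push([0.9, 0.8, 0.95]))  # True
print(should_push([0.9, 0.3, 0.95]))  # False
```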
Fig. 5 shows a schematic user interface at the client of potential user a. In this example, potential user a is predicted to like a coupon for movie XYZ, and receives the coupon on his or her client.
When classifying, the machine-learned classifier considers not only the attributes and preferences of potential user a himself or herself but also potential user a's social relationships (the attributes and preferences of his or her friends), giving it more judgment dimensions and thereby improving classification accuracy. In this way, the target content is pushed to potential users whose preferences truly match it, making content delivery more targeted and more efficient. In particular, where multiple machine-learned classifiers are employed, the accuracy of predicting whether a potential user's preferences match the target content may be further improved by the voting decisions of the multiple classifiers.
It will be appreciated that the 8 personal attributes described in the above embodiments are exemplary, and that any suitable personal attribute may be used in other embodiments. It will also be appreciated that although in the above embodiments the contents 1 to n and the target content to be pushed are described as coupons, the present invention is not limited thereto. In other embodiments, the content 1 through n and the target content to be pushed may be any type of content, such as video, audio, pictures, text, and the like. More specifically, the content may be information describing, for example, news events, program trailers, movie clips, merchandise introductions, promotional campaigns, and the like. In addition, the contents 1 to n and the target content to be pushed need not be of the same type. For example, the contents 1 to n and the target content may have different types in video, audio, picture, and text. Also, the contents 1 to n and the target content may relate to different subjects. For example, one or more of them may be about news events, one or more of them may be about television programming, and one or more of them may be about merchandise introductions.
Fig. 6 shows a schematic block diagram of a content pushing device 600 according to an embodiment of the present invention. Referring to fig. 6, the content pushing device 600 includes an acquisition module 610, a construction module 620, an input module 630, and a pushing module 640.
The acquisition module 610 is configured to acquire personalized data and social relationship data of potential users. The operation of the acquisition module 610 has been described in detail above with respect to the method embodiment illustrated in connection with fig. 4 and is not repeated here for the sake of brevity.
The construction module 620 is configured to construct input data associated with the potential user based on the acquired personalized data and social relationship data of the potential user. The operation of the build module 620 has been described in detail above with respect to the method embodiment illustrated in connection with FIG. 4 and is not repeated here for the sake of brevity.
The input module 630 is configured to input the input data associated with the potential user to at least one machine learning classifier. The operation of the input module 630 has been described in detail above with respect to the method embodiment illustrated in connection with fig. 4 and is not repeated here for the sake of brevity.
The pushing module 640 is configured to selectively push the target content to clients of the potential users depending on the output data generated by the at least one machine-learned classifier. The operation of the push module 640 has been described in detail above with respect to the method embodiment illustrated in connection with fig. 4 and is not repeated here for the sake of brevity.
It will be appreciated that the acquisition module 610, the construction module 620, the input module 630, and the push module 640 may be implemented by software, firmware, hardware, or a combination thereof, as will be further described below.
FIG. 7 illustrates an example system 700 that includes an example computing device 710, a network 740, and a plurality of clients 750 that communicate with the computing device 710 via the network 740. The computing device 710 represents one or more systems and/or devices that may implement the various techniques described herein.
The network 740 may be a variety of different networks including the Internet, a Local Area Network (LAN), a telephone network, an intranet, other public and/or proprietary networks, combinations thereof, and so forth.
The client 750 may be a variety of different types of devices, such as a desktop computer, a server computer, a notebook or netbook computer, a mobile device (e.g., a tablet or phablet device, a cellular or other wireless telephone (e.g., a smart phone), a notepad computer, a mobile station), a wearable device (e.g., glasses, a watch), an entertainment device (e.g., an entertainment appliance, a set-top box communicatively coupled to a display device, a game console), a television or other display device, an automobile computer, and so forth, or an application running on the various different types of devices.
Computing device 710 may be, for example, a server of a service provider or any other suitable computing device or computing system, ranging from full resource devices with substantial memory and processor resources to low resource devices with limited memory and/or processing resources. In some embodiments, the content pushing apparatus 600 described above with respect to fig. 6 may take the form of a computing device 710.
The example computing device 710 as illustrated includes a processing system 711, one or more computer-readable media 712, and one or more I/O interfaces 713 communicatively coupled to each other. Although not shown, computing device 710 may also include a system bus or other data and command transfer system that couples the various components to one another. A system bus may include any one or combination of different bus structures, such as a memory bus or memory controller, a peripheral bus, a universal serial bus, and/or a processor or local bus that utilizes any of a variety of bus architectures. Various other examples are also contemplated, such as control and data lines.
The processing system 711 is representative of functionality to perform one or more operations using hardware. Thus, the processing system 711 is illustrated as including hardware elements 714 that may be configured as processors, functional blocks, and the like. This may include implementation in hardware as an application specific integrated circuit or other logic device formed using one or more semiconductors. The hardware element 714 is not limited by the material from which it is formed or the processing mechanism employed therein. For example, the processor may be comprised of semiconductor(s) and/or transistors (e.g., electronic Integrated Circuits (ICs)). In such a context, the processor-executable instructions may be electronically-executable instructions.
Computer-readable medium 712 is illustrated as including memory/storage 715. Memory/storage 715 represents memory/storage capacity associated with one or more computer-readable media. The memory/storage 715 may include volatile media (such as Random Access Memory (RAM)) and/or nonvolatile media (such as Read Only Memory (ROM), flash memory, optical disks, magnetic disks, and so forth). The memory/storage 715 may include fixed media (e.g., RAM, ROM, a fixed hard drive, etc.) and removable media (e.g., flash memory, a removable hard drive, an optical disk, and so forth). The computer-readable medium 712 may be configured in a variety of other ways as described further below.
One or more input/output interfaces 713 represent functionality that allows a user to enter commands and information to computing device 710, and that also allows information to be presented to the user and/or sent to other components or devices using various input/output devices. Examples of input devices include a keyboard, a cursor control device (e.g., a mouse), a microphone (e.g., for voice input), a scanner, touch functionality (e.g., capacitive or other sensors configured to detect physical touches), a camera (e.g., motion that does not involve touches may be detected as gestures using visible or invisible wavelengths such as infrared frequencies), a network card, a receiver, and so forth. Examples of output devices include a display device (e.g., a monitor or projector), speakers, a printer, a haptic response device, a network card, a transmitter, and so forth.
Computing device 710 also includes content push policy 716. The content push policy 716 may be stored as computer program instructions in the memory/storage 715. The content push policy 716 may implement all of the functionality of the various modules of the content push device 600 described with respect to FIG. 6 in conjunction with the processing system 711 and the I/O interface 713.
Various techniques may be described herein in the general context of software, hardware elements, or program modules. Generally, these modules include routines, programs, objects, elements, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The terms "module," "functionality," and "component" as used herein generally represent software, firmware, hardware, or a combination thereof. The features of the techniques described herein are platform-independent, meaning that the techniques may be implemented on a variety of computing platforms having a variety of processors.
An implementation of the described modules and techniques may be stored on or transmitted across some form of computer readable media. Computer-readable media can include a variety of media that are accessible by computing device 710. By way of example, and not limitation, computer readable media may comprise "computer readable storage media" and "computer readable signal media".
"Computer-readable storage medium" refers to media and/or devices, and/or tangible storage, that enable persistent storage of information, as opposed to mere signal transmission, carrier waves, or signals per se. Thus, computer-readable storage media refers to non-signal bearing media. Computer-readable storage media include hardware such as volatile and nonvolatile, removable and non-removable media and/or storage devices implemented in a method or technology suitable for storage of information such as computer-readable instructions, data structures, program modules, logic elements/circuits or other data. Examples of a computer-readable storage medium may include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical storage, hard disks, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or other storage devices, tangible media, or articles of manufacture adapted to store the desired information and accessible by a computer.
"computer-readable signal medium" refers to a signal bearing medium configured to transmit instructions to hardware of computing device 710, such as via a network. Signal media may typically be embodied in computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave, data signal, or other transport mechanism. Signal media also include any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media includes wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media.
As previously described, hardware elements 714 and computer-readable media 712 represent instructions, modules, programmable device logic, and/or fixed device logic implemented in hardware that may be used in some embodiments to implement at least some aspects of the techniques described herein. The hardware elements may include components of an integrated circuit or system-on-chip, Application-Specific Integrated Circuits (ASICs), Field-Programmable Gate Arrays (FPGAs), Complex Programmable Logic Devices (CPLDs), and other implementations in silicon or other hardware devices. In this context, a hardware element may operate as a processing device that performs program tasks defined by instructions, modules, and/or logic embodied by the hardware element, as well as a hardware device utilized to store instructions for execution, e.g., the computer-readable storage media described previously.
Combinations of the foregoing may also be employed to implement the various techniques and modules described herein. Accordingly, software, hardware, or program modules and other program modules may be implemented as one or more instructions and/or logic embodied on some form of computer-readable storage media and/or by one or more hardware elements 714. Computing device 710 may be configured to implement particular instructions and/or functions corresponding to the software and/or hardware modules. Accordingly, implementation of a module executable by the computing device 710 as software may be achieved at least partially in hardware, e.g., through use of computer-readable storage media and/or hardware elements 714 of the processing system. The instructions and/or functions may be executable/operable by one or more articles of manufacture (e.g., one or more computing devices 710 and/or processing systems 711) to implement the techniques, modules, and examples described herein.
The techniques described herein may be supported by these various configurations of computing device 710 and are not limited to the specific examples of techniques described herein. The functionality of computing device 710 may also be implemented in whole or in part on "cloud" 720 using a distributed system, such as through platform 730 as described below.
Cloud 720 includes and/or represents platform 730 for resource 732. Platform 730 abstracts underlying functionality of hardware (e.g., servers) and software resources of cloud 720. Resources 732 may include applications and/or data that may be used when executing computer processing on servers remote from computing device 710. Resources 732 may also include services provided over the internet and/or over subscriber networks such as cellular or Wi-Fi networks.
Platform 730 may abstract resources and functionality to connect computing device 710 with other computing devices. Platform 730 may also serve to abstract scaling of resources to provide a corresponding level of scale to encountered demand for the resources 732 that are implemented via platform 730. Accordingly, in an interconnected-device embodiment, implementation of the functionality described herein may be distributed throughout system 700. For example, the functionality may be implemented in part on computing device 710 as well as via platform 730, which abstracts the functionality of cloud 720.
Variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed subject matter, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (13)

1. A content pushing method, comprising:
acquiring personalized data and social relationship data of potential users;
constructing input data associated with the potential user based on the acquired personalized data and social relationship data;
inputting the input data to at least one machine-learned classifier, wherein each of the at least one machine-learned classifier is configured to generate output data indicating whether the potential user's preferences match target content based on the input data; and
selectively initiating pushing of the target content to a client of the potential user depending on output data generated by the at least one machine-learned classifier;
wherein each of the at least one machine-learned classifier is trained by:
acquiring personalized data and social relationship data of each of a plurality of known users;
constructing respective input data respectively associated with the plurality of known users based on the acquired respective personalized data and social relationship data;
providing respective target output data associated with the plurality of known users, wherein the target output data associated with each known user indicates whether the preferences of that known user match the target content; and
inputting the respective input data and the respective target output data into each machine-learned classifier for training;
wherein said constructing respective input data associated with said plurality of known users, respectively, comprises:
for each known user:
generating a first matrix based on the personalized data of the known user;
generating a second matrix based on the personalized data and social relationship data of the known user, wherein the second matrix describes respective similarities between a plurality of friends of the known user and the known user in terms of personal attributes and respective similarities between the plurality of friends and the known user in terms of personal preferences for n pieces of content, n being a natural number, and wherein the n pieces of content are different from the target content;
concatenating the first matrix and the second matrix to form a third matrix; and
the third matrix is reduced in dimension to obtain a fourth matrix as input data associated with the known user.
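The per-user input construction recited in claim 1 can be sketched in code. This is a minimal illustration under stated assumptions, not the patented implementation: function and variable names are hypothetical, the data are NumPy arrays, the zero-padding used to align the two matrices for row-direction concatenation is an assumption (the claim does not fix their relative widths), and the number of retained components `k` is a free parameter.

```python
import numpy as np

def build_input(attrs, friend_attrs, prefs, friend_prefs, k=2):
    """Hypothetical sketch of claims 1 and 4-9.

    attrs: (m,) attribute values of the known user.
    friend_attrs: (f, m) attribute values of f friends.
    prefs: (n,) 0/1 preference values for n pieces of content.
    friend_prefs: (f, n) friends' 0/1 preference values.
    """
    # First matrix (claim 4): the user's attributes as a 1 x m row vector.
    first = np.asarray(attrs, dtype=float).reshape(1, -1)

    # Second matrix (claim 5): one column of attribute similarities,
    # then n columns of per-content preference similarities.
    attr_sim = (friend_attrs == attrs).mean(axis=1)          # claim 6: ratio of equal attributes
    pref_sim = np.where(friend_prefs == prefs, 1.0, 0.0)     # claim 7: 1.0 if same preference, else 0
    second = np.hstack([attr_sim.reshape(-1, 1), pref_sim])  # f x (n+1)

    # Third matrix (claim 8): concatenate in the row direction.
    # Zero-pad the narrower matrix so column counts match (assumption).
    width = max(first.shape[1], second.shape[1])
    first = np.pad(first, ((0, 0), (0, width - first.shape[1])))
    second = np.pad(second, ((0, 0), (0, width - second.shape[1])))
    third = np.vstack([first, second])

    # Fourth matrix (claim 9): principal component analysis, here done
    # by projecting the centered rows onto the top-k right singular vectors.
    centered = third - third.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T
```

The resulting fourth matrix (or a flattening of it) would then serve as the input data associated with the known user.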
2. The method of claim 1, wherein the selectively initiating pushing the targeted content to the client of the potential user comprises:
pushing the target content to a client of the potential user in response to the output data generated by each of the at least one machine-learned classifier indicating that the potential user's preferences match the target content; and
refraining from pushing the target content to the client of the potential user in response to output data generated by any of the at least one machine-learned classifier indicating that the preferences of the potential user do not match the target content.
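The decision rule of claim 2 (push only when every classifier reports a match, with any single non-match vetoing the push) reduces to a conjunction over the classifier outputs. A minimal sketch, with the function name assumed:

```python
def should_push(classifier_outputs):
    """classifier_outputs: iterable of booleans, one per trained classifier.

    Returns True only if every classifier indicates a preference match
    (claim 2); a single non-match vetoes the push.
    """
    return all(classifier_outputs)
```

For example, `should_push([True, True])` is `True`, while `should_push([True, False])` is `False`.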
3. The method according to claim 1,
wherein the personalized data for each known user comprises:
m attribute values describing respective ones of the m attributes of the known user; and
n preference values indicating whether the preferences of the known user match corresponding ones of the n pieces of content, m being a natural number; and
wherein the social relationship data for each known user includes personalized data for each of a plurality of friends of the known user, the personalized data for each friend including:
m attribute values describing a corresponding attribute of the m attributes of the friend; and
n preference values indicating whether the friend's preferences match corresponding ones of the n pieces of content.
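The data layout recited in claim 3 — m attribute values plus n preference values per user, with the social relationship data holding the same layout for each friend — might be modeled as follows. The container names are assumptions for illustration, not part of the claim:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PersonalizedData:
    # m attribute values describing the user's m attributes
    attributes: List[float]
    # n preference values: 1 if the user's preference matches
    # the corresponding piece of content, 0 otherwise
    preferences: List[int]

@dataclass
class KnownUser:
    own: PersonalizedData
    # social relationship data: personalized data of each friend,
    # in the same attribute/preference layout
    friends: List[PersonalizedData] = field(default_factory=list)
```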
4. The method of claim 3, wherein the generating a first matrix based on the personalized data of the known user comprises: vectorizing the m attribute values of the personalized data of the known user into a row vector serving as the first matrix.
5. The method of claim 3, wherein the generating a second matrix based on the personalized data and social relationship data of the known user comprises:
deriving respective similarities in personal attributes between the plurality of friends and the known user from respective m attribute values of the personalized data of the plurality of friends and the m attribute values of the personalized data of the known user;
deriving respective similarities between the plurality of friends and the known user in terms of personal preferences for each of the n pieces of content from the respective n preference values of the personalized data of the plurality of friends and the n preference values of the personalized data of the known user;
vectorizing the derived respective similarities between the plurality of friends and the known user in terms of personal attributes into a first column vector;
vectorizing the derived respective similarity between the plurality of friends and the known user in terms of personal preferences for each of the n content into a respective second column vector; and
concatenating the first column vector and each of the second column vectors in a row direction to form the second matrix.
6. The method of claim 5, wherein said deriving respective similarities in personal attributes between the plurality of friends and the known user comprises:
comparing the m attribute values of the personalized data of each friend with corresponding ones of the m attribute values of the personalized data of the known user;
counting the number of comparison results indicating equality; and
the ratio of the number to m is determined as a similarity between the friend and the known user in terms of personal attributes.
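Claim 6's attribute similarity is simply the fraction of the m attribute positions on which a friend and the known user agree. A scalar sketch, with the function name assumed:

```python
def attribute_similarity(friend_attrs, user_attrs):
    """Claim 6: ratio of equal attribute values to m."""
    m = len(user_attrs)
    # count positions where the friend's and the user's values are equal
    equal = sum(1 for a, b in zip(friend_attrs, user_attrs) if a == b)
    return equal / m
```

With m = 3 and two matching attribute values, for instance, the similarity is 2/3.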
7. The method of claim 5, wherein said deriving respective similarities between the plurality of friends and the known user in terms of personal preferences for each of the n pieces of content comprises:
comparing the n preference values of the personalized data of each friend with corresponding ones of the n preference values of the personalized data of the known user;
in response to the comparison indicating that the friend and the known user have the same preference for the same content of the n pieces of content, setting a similarity between the friend and the known user in terms of personal preferences for that content to a predetermined value; and
in response to the comparison indicating that the friend and the known user have different preferences for the same content of the n pieces of content, setting a similarity between the friend and the known user in terms of personal preferences for that content to zero.
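Claim 7's per-content preference similarity is binary: a predetermined value when the friend and the known user agree on a piece of content, and zero otherwise. The predetermined value is left open by the claim; the default of 1.0 below is an assumption, as is the function name:

```python
def preference_similarity(friend_pref, user_pref, match_value=1.0):
    """Claim 7: match_value if the preferences for a piece of content
    agree, zero otherwise."""
    return match_value if friend_pref == user_pref else 0.0
```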
8. The method of claim 5, wherein the concatenating the first matrix and the second matrix comprises: concatenating the first matrix and the second matrix in a row direction to form the third matrix.
9. The method of claim 1, wherein the reducing the dimension of the third matrix comprises: performing principal component analysis on the third matrix.
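The dimension reduction of claim 9 — principal component analysis on the third matrix — can be sketched with NumPy alone via the singular value decomposition of the centered matrix. The function name and the number of retained components `k` are assumptions:

```python
import numpy as np

def pca_reduce(third, k):
    """Claim 9 sketch: project the rows of `third` onto the
    top-k principal components, yielding the fourth matrix."""
    centered = third - third.mean(axis=0)                    # center each column
    _, _, vt = np.linalg.svd(centered, full_matrices=False)  # rows of vt = principal axes
    return centered @ vt[:k].T                               # shape: (rows, k)
```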
10. The method of claim 1, wherein the target output data for each known user comprises a numerical value indicating whether the preference of the known user matches the target content.
11. A content pushing apparatus comprising:
the acquisition module is configured to acquire personalized data and social relationship data of potential users;
a building module configured to build input data associated with the potential user based on the acquired personalized data and social relationship data;
an input module configured to input the input data to at least one machine-learned classifier, wherein each of the at least one machine-learned classifier is configured to generate output data indicating whether the preferences of the potential user match target content based on the input data; and
a pushing module configured to selectively initiate pushing of the target content to a client of the potential user depending on output data generated by the at least one machine-learned classifier;
wherein each of the at least one machine-learned classifier is trained by:
acquiring personalized data and social relationship data of each of a plurality of known users;
constructing respective input data respectively associated with the plurality of known users based on the acquired respective personalized data and social relationship data;
providing respective target output data associated with the plurality of known users, wherein the target output data associated with each known user indicates whether the preferences of that known user match the target content; and
inputting the respective input data and the respective target output data into each machine-learned classifier for training;
wherein said constructing respective input data associated with said plurality of known users, respectively, comprises:
for each known user:
generating a first matrix based on the personalized data of the known user;
generating a second matrix based on the personalized data and social relationship data of the known user, wherein the second matrix describes respective similarities between a plurality of friends of the known user and the known user in terms of personal attributes and respective similarities between the plurality of friends and the known user in terms of personal preferences for n pieces of content, n being a natural number, and wherein the n pieces of content are different from the target content;
concatenating the first matrix and the second matrix to form a third matrix; and
the third matrix is reduced in dimension to obtain a fourth matrix as input data associated with the known user.
12. A computing device comprising a memory and a processor, the memory configured to store thereon computer program instructions that, when executed on the processor, cause the processor to perform the method of any of claims 1-10.
13. A computer readable storage medium having stored thereon computer program instructions which, when executed on a processor, cause the processor to perform the method of any of claims 1-10.
CN201910092267.0A 2019-01-30 2019-01-30 Content pushing method, device, computing equipment and computer readable storage medium Active CN110162714B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910092267.0A CN110162714B (en) 2019-01-30 2019-01-30 Content pushing method, device, computing equipment and computer readable storage medium


Publications (2)

Publication Number Publication Date
CN110162714A CN110162714A (en) 2019-08-23
CN110162714B true CN110162714B (en) 2023-11-14

Family

ID=67644811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910092267.0A Active CN110162714B (en) 2019-01-30 2019-01-30 Content pushing method, device, computing equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110162714B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112820416A (en) * 2021-02-26 2021-05-18 重庆市公共卫生医疗救治中心 Major infectious disease queue data typing method, typing model and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776928A (en) * 2016-12-01 2017-05-31 重庆大学 Recommend method in position based on internal memory Computational frame, fusion social environment and space-time data
CN108711075A (en) * 2018-05-22 2018-10-26 阿里巴巴集团控股有限公司 A kind of Products Show method and apparatus
CN109087138A (en) * 2018-07-26 2018-12-25 北京京东金融科技控股有限公司 Data processing method and system, computer system and readable storage medium storing program for executing




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant