CN115048504A - Information pushing method and device, computer equipment and computer readable storage medium - Google Patents
Information pushing method and device, computer equipment and computer readable storage medium
- Publication number
- CN115048504A
- Authority
- CN
- China
- Prior art keywords
- content
- recommended
- target user
- probability
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/335—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiment of the invention relates to the technical field of artificial intelligence and discloses an information pushing method comprising the following steps: classifying the watched content of all users to obtain a plurality of content classifications; calculating a first probability that a target user watches each content to be recommended, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user; calculating a second probability that each content to be recommended is recommended to the target user, according to the comment information on the content to be recommended from the user category to which the target user belongs; obtaining a popularity score for each content to be recommended according to the first probability and the second probability; and recommending the content to be recommended to the target user according to the popularity score. In this way, the accuracy of content recommendation is improved.
Description
Technical Field
The embodiment of the invention relates to the technical field of artificial intelligence, in particular to an information pushing method, an information pushing device, computer equipment and a computer readable storage medium.
Background
Currently, content is generally pushed either by collaborative filtering, which selects related content by analyzing the similarity of users' relationships and content or by using content that a user's friends have watched, or simply according to hot content.
The inventor found that, when the probability of a user watching content is analyzed, the relationship between users and content, the relationship between users, and the users' effectiveness evaluation of the content they have watched are not considered together, so the accuracy of information recommendation is low.
Disclosure of Invention
In view of the foregoing problems, embodiments of the present invention provide an information pushing method, an information pushing apparatus, a computer device, and a computer-readable storage medium, which are used to solve the technical problem in the prior art that the accuracy of information recommendation is low.
According to an aspect of the embodiments of the present invention, there is provided an information pushing method, including:
classifying the watched contents of all users to obtain a plurality of content classifications;
calculating a first probability of the target user watching each content to be recommended according to the preference degree of the target user to each content classification and the similarity between the content to be recommended and the watched content of the target user; the content to be recommended is any one of the content categories except the content watched by the target user;
calculating a second probability that each content to be recommended is recommended to the target user according to comment information of the user category to which the target user belongs on the content to be recommended;
according to the first probability and the second probability, obtaining the heat score of each content to be recommended;
and recommending the content to be recommended to a target user according to the popularity score.
In an alternative manner, the classifying the watched content of all users to obtain a plurality of content classifications includes: selecting K contents from all watched contents as a classification center; calculating the correlation of all the viewed contents with each classification center; obtaining a correlation matrix among the watched contents according to the correlation; and iteratively updating the classification center and the correlation matrix until an iteration threshold is met, and obtaining a plurality of content classifications.
In an optional manner, the calculating, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, a first probability that the target user watches each content to be recommended includes: determining each viewed content of a target user and a classification center of each content classification; and calculating the preference degree of the target user to each content classification according to each watched content of the target user and each classification center of the content classification.
In an optional manner, calculating a first probability that the target user watches each piece of content to be recommended according to the preference degree of the target user for each piece of content and the similarity between the piece of content to be recommended and the watched content of the target user includes: determining a first feature vector of viewed content of the target user; determining a second feature vector of the content to be recommended; and determining the similarity between the content to be recommended and the watched content of the target user according to the first feature vector and the second feature vector.
In an optional manner, the calculating, according to comment information of the user category to which the target user belongs about the content to be recommended, a second probability that each piece of the content to be recommended is recommended to the target user includes: acquiring the watching information of all users; obtaining a feature vector of each user according to the viewing information and the word vector model; and clustering according to the characteristic vectors to obtain the user category to which each user belongs.
In an optional manner, the calculating, according to comment information of the user category to which the target user belongs about the content to be recommended, a second probability that each piece of the content to be recommended is recommended to the target user includes: according to the comment information of the user category to which the target user belongs to the content to be recommended, determining the emotion classification of other users in the user category to which the target user belongs to the content to be recommended; calculating Euclidean distances between the target user and other users in the user category to which the target user belongs; and calculating a second probability that each content to be recommended is recommended to the target user according to the Euclidean distance and the emotion classification.
In an optional manner, the determining, according to comment information of the user category to which the target user belongs about the content to be recommended, an emotional classification of other users in the user category to which the target user belongs about the content to be recommended includes: obtaining comment information of the user category to which the target user belongs to the content to be recommended; inputting the comment information into an emotion classification model to obtain emotion classification of other users in the user category to which the target user belongs on the content to be recommended; the emotion classification model is obtained by training according to emotion classification samples in advance.
According to another aspect of the embodiments of the present invention, there is provided an information pushing apparatus, including:
the content classification module is used for classifying the watched contents of all the users to obtain a plurality of content classifications;
the first probability calculation module is used for calculating first probabilities of the target users for watching the contents to be recommended according to the preference degrees of the target users for the contents in each category and the similarity between the contents to be recommended and the watched contents of the target users; the content to be recommended is any one of the content categories except the content watched by the target user;
the second probability calculation module is used for calculating a second probability that each content to be recommended is recommended to the target user according to comment information of the user category to which the target user belongs on the content to be recommended;
the popularity scoring module is used for obtaining popularity scores of the contents to be recommended according to the first probability and the second probability;
and the recommending module is used for recommending the content to be recommended to a target user according to the popularity score.
According to another aspect of embodiments of the present invention, there is provided a computer apparatus including: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation of the information pushing method.
According to another aspect of the embodiments of the present invention, there is provided a computer-readable storage medium, in which at least one executable instruction is stored, and when the executable instruction is executed on a computer device, the computer device is caused to perform the operations of the information pushing method described above.
The method and the device have the advantages that the watched contents of all users are classified to obtain a plurality of content classifications, then the first probability of watching each content to be recommended by the target user is calculated according to the preference degree of the user to each content classification and the similarity of the content to be recommended and the watched contents of the target user, the second probability of recommending each content to be recommended to the target user is calculated according to the comment information of the user classification to which the target user belongs to the content to be recommended, the heat degree score of each content to be recommended is obtained according to the first probability and the second probability, and finally the content to be recommended is recommended to the target user according to the heat degree score, so that the accuracy of information recommendation can be effectively improved.
The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that these technical solutions can be understood more clearly and implemented according to the description, and so that the above and other objects, features, and advantages of the embodiments are more apparent, specific embodiments of the present invention are described below.
Drawings
The drawings are only for purposes of illustrating embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 shows a flow chart of an information pushing method provided by an embodiment of the present invention;
Fig. 2 shows a schematic structural diagram of an information pushing apparatus provided by an embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of a computer device provided by an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein.
At present, content is pushed either by collaborative filtering, which selects related content by analyzing the similarity of users' relationships and content or by using content that a user's friends have watched, or simply according to hot content. In existing schemes, the feature vector of a content is generated mainly from user scoring, and characteristics such as content playing time, playing training and the like are not considered. When hot content is obtained, the number of hot words in the content is simply counted, so content without hot words cannot be counted; alternatively, hot content is calculated from users' search behavior, which is also a count-based statistic. Hot content generated from count-based statistics cannot comprehensively reflect how hot a content is, whereas the watching behavior of users can. In addition, the relationship between users and content, the relationship between users, and the users' effectiveness evaluation of the viewed content are not taken into account when the probability of a user viewing the content is analyzed.
Fig. 1 shows a flowchart of an information pushing method provided by an embodiment of the present invention, where the method is executed by a computer device. The computer device may be a computer, a tablet computer, a mobile phone, a watch, an audio/video playing device, a wearable device, and the like, and the embodiment of the present invention is not particularly limited. As shown in fig. 1, the method comprises the steps of:
Step 110: The watched content of all users is classified to obtain a plurality of content classifications.
Before the watched contents of all users are classified, the embodiment of the invention obtains user behavior data in advance and processes it to obtain the watched contents corresponding to all users. Specifically, the service background sends user behavior logs to Kafka (a distributed streaming platform); Flume (a log collection system) collects the user behavior data of those logs from Kafka; and the user behavior data is then filtered and extracted. An interceptor may be set in Flume to filter abnormal data out of the user behavior data, for example records in which the user name is null or a field value is abnormal. After filtering, data is extracted with Spark Streaming (a real-time computing framework built on Spark): the data of user watching behavior is obtained first, then viewing information such as the user id, the watched content id and the content playing time is extracted from it, and finally the viewing information is stored in a Hive (data warehouse tool) database.
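The filtering and extraction described above can be illustrated with a minimal Python sketch. It does not use Kafka, Flume or Spark; it only shows the kind of rule an interceptor might apply. The record layout and the field names `user_id`, `content_id`, `event` and `play_seconds` are hypothetical and do not come from the embodiment.

```python
from typing import Iterable, List, Optional


def extract_viewing_info(record: dict) -> Optional[dict]:
    """Return the viewing fields of a valid watching-behavior record, else None.

    All field names are hypothetical; the embodiment only names user id,
    watched content id and content playing time as extracted information.
    """
    # Drop abnormal data: null user name or missing content id.
    if not record.get("user_id") or not record.get("content_id"):
        return None
    # Keep only watching behavior (assumed to be tagged with event == "view").
    if record.get("event") != "view":
        return None
    # Drop abnormal field values: playing time must be a non-negative number.
    play_seconds = record.get("play_seconds")
    if not isinstance(play_seconds, (int, float)) or play_seconds < 0:
        return None
    return {
        "user_id": record["user_id"],
        "content_id": record["content_id"],
        "play_seconds": play_seconds,
    }


def clean_log(records: Iterable[dict]) -> List[dict]:
    """Filter a stream of behavior records and keep only the viewing information."""
    return [info for info in map(extract_viewing_info, records) if info is not None]
```

In a deployment of the kind described, the same rule would more likely live in the log collector's interceptor or in the streaming job; the sketch only makes the filtering criteria concrete.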
After the watched contents of all users are obtained, the watched contents are converted into the feature vectors V of the watched contents through a word vector model, and after the feature vectors V of all the watched contents are obtained, all the watched contents are classified through the feature vectors V, so that a plurality of content classifications are obtained. The method comprises the following steps:
a. Select k contents from all the viewed contents as classification centers.
b. Calculate the correlation of all viewed contents with each of the classification centers. Specifically, according to the selected classification centers, the correlation between the i-th viewed content c_i among the viewed contents of all users and each classification center v_j is calculated:
where p_ij is the correlation between the i-th viewed content c_i of all users and the j-th classification center v_j; d(c_i, v_j) is the Euclidean distance between the i-th viewed content c_i among the viewed contents of all users and the j-th classification center v_j; and t is a variable that takes the values 0 to k.
c. Obtain a correlation matrix among the watched contents according to the correlation. The correlation matrix is a k × n matrix with the viewed contents as columns and the classification centers as rows, where n is the number of viewed contents of all users.
d. Iteratively update the classification centers and the correlation matrix until an iteration threshold is met, obtaining a plurality of content classifications.
After the correlation matrix is obtained, the classification centers are updated as follows:
where p_ij represents the degree of correlation between the i-th viewed content and the j-th classification center; d_j is the feature vector of viewed content j; n represents the number of viewed contents of all users; and i denotes the i-th viewed content.
The classification centers are updated iteratively. When the Euclidean distance between the classification center obtained in the current iteration and the classification center obtained in the previous iteration is smaller than a threshold e, the iteration stops and k content classifications are obtained.
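Steps a to d can be sketched as follows. The correlation and center-update formulas are not reproduced in this text, so the sketch assumes a fuzzy c-means style membership, p_ij = 1 / Σ_t (d(c_i, v_j) / d(c_i, v_t))^(2/(m-1)), and a membership-weighted mean for the center update. The fuzzifier `m`, the function names and the default parameters are all assumptions for illustration, not the formulas of the embodiment.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def memberships(vectors, centers, m=2.0):
    """Build the k x n correlation matrix (rows: centers, columns: contents).

    Assumed fuzzy c-means membership; m is a hypothetical fuzzifier.
    """
    k, n = len(centers), len(vectors)
    p = [[0.0] * n for _ in range(k)]
    for i, c in enumerate(vectors):
        dists = [euclidean(c, v) for v in centers]
        zeros = [j for j, d in enumerate(dists) if d == 0.0]
        if zeros:
            # The content coincides with a center: give it full membership there.
            for j in zeros:
                p[j][i] = 1.0 / len(zeros)
            continue
        for j in range(k):
            p[j][i] = 1.0 / sum(
                (dists[j] / dists[t]) ** (2.0 / (m - 1.0)) for t in range(k)
            )
    return p


def update_centers(vectors, p, m=2.0):
    """Recompute each center as the membership-weighted mean of all contents."""
    dim = len(vectors[0])
    centers = []
    for row in p:
        weights = [w ** m for w in row]
        total = sum(weights)
        centers.append(
            [sum(w * v[d] for w, v in zip(weights, vectors)) / total for d in range(dim)]
        )
    return centers


def classify(vectors, k, eps=1e-4, max_iter=100, seed=0):
    """Steps a-d: pick k centers, iterate until every center moves less than eps."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]  # step a
    for _ in range(max_iter):
        p = memberships(vectors, centers)                # steps b and c
        new_centers = update_centers(vectors, p)         # step d
        shift = max(euclidean(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < eps:
            break
    p = memberships(vectors, centers)
    # Assign each content to the center with which it is most correlated.
    labels = [max(range(k), key=lambda j: p[j][i]) for i in range(len(vectors))]
    return centers, p, labels
```

Under these assumptions each column of the correlation matrix sums to 1, so a content's row values can be read as its degree of belonging to each classification.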
Step 120: A first probability that the target user watches each content to be recommended is calculated according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user.
The content to be recommended is any content in the content classifications other than the content already watched by the target user.
After k content classifications are obtained, determining each watched content of a target user and a classification center of each content classification; and calculating the preference degree of the target user to each content classification according to each watched content of the target user and the classification center of each content classification. Specifically, the calculation method may be:
wherein, I (u, c) i ) Represents the target user u to the ith content classification c in the k content classifications i The degree of preference of; s represents the total number of viewed contents corresponding to the target user u; x j Represents the jth viewed content of the target user u.
In the embodiment of the present invention, the calculating the similarity between the content to be recommended and each watched content of the target user u further includes: determining a first feature vector of viewed content of the target user; determining a second feature vector of the content to be recommended; and determining the similarity between the content to be recommended and the watched content of the target user according to the first feature vector and the second feature vector. Specifically, the calculation formula may be:
where W_ij represents the similarity between the i-th watched content among the s watched contents of the target user u and the j-th content to be recommended; V_im represents the feature vector of the i-th viewed content of the target user u; and V_jm represents the feature vector of the j-th content to be recommended. After the preference degree of the user and the similarity between the content to be recommended and the watched content of the target user are respectively obtained, the first probability that the target user watches each content to be recommended is calculated from them. Specifically:
where p_uj1 is the first probability that the target user u watches the j-th content to be recommended; s is the number of watched contents corresponding to the target user u; W_ij is the similarity between the i-th watched content of the target user u and the j-th content to be recommended; and c_j is the classification center of the content classification to which the j-th content to be recommended belongs. When the content to be recommended is a watched content of the target user u, p_uj1 = 1, i.e. the viewing probability is 1.
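The preference, similarity and first-probability formulas are not reproduced in this text. The sketch below therefore rests on three labeled assumptions: the similarity W_ij is taken to be the cosine similarity of the two feature vectors; the preference degree is taken to be the mean of 1 / (1 + distance) between the user's watched contents and a classification center; and the first probability is taken to be the preference degree for the candidate's classification multiplied by the mean similarity. None of these forms is confirmed by the embodiment.

```python
import math


def cosine_similarity(a, b):
    """Assumed form of W_ij: cosine similarity of two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def preference_degree(watched_vectors, center):
    """Hypothetical I(u, c_i): mean inverse distance from watched contents to a center."""
    s = len(watched_vectors)
    return sum(1.0 / (1.0 + math.dist(x, center)) for x in watched_vectors) / s


def first_probability(watched_vectors, candidate, candidate_center):
    """Hypothetical p_uj1 for one candidate content.

    candidate_center is the classification center c_j of the classification
    to which the candidate belongs.
    """
    s = len(watched_vectors)
    mean_similarity = sum(cosine_similarity(v, candidate) for v in watched_vectors) / s
    return preference_degree(watched_vectors, candidate_center) * mean_similarity
```

With non-negative feature vectors this value stays between 0 and 1; with signed embeddings the cosine term can be negative, so a real system would need to clip or rescale it.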
Step 130: A second probability that each content to be recommended is recommended to the target user is calculated according to the comment information on the content to be recommended from the user category to which the target user belongs.
In the embodiment of the present invention, users are classified in advance as follows: the viewing information of all users is acquired; the feature vector of each user is obtained according to the viewing information and the word vector model; and clustering is performed on the feature vectors to obtain the user category to which each user belongs. Specifically, the feature vectors W of all users may be obtained by training word2vec. After the user feature vectors W are obtained, they are clustered using steps a to d in the same way as in the content classification above, so that the users are grouped into c user categories.
After the user category to which the target user belongs is obtained, the emotion classification of the other users in that category with respect to the content to be recommended is determined from the comment information of that category on the content to be recommended. The Euclidean distances between the target user and the other users in the category are then calculated, and the second probability that each content to be recommended is recommended to the target user is calculated according to the Euclidean distances and the emotion classification. The emotion classification is determined as follows: the comment information on the content to be recommended from the user category to which the target user belongs is obtained; the comment information is input into an emotion classification model to obtain the emotion classification of the other users in the category with respect to the content to be recommended; and the emotion classification model is trained in advance on emotion classification samples.
In the embodiment of the invention, the emotion classification model is trained using BERT (an open-source pre-training model for natural language understanding). In the pre-training process of BERT, the Masked LM task, which normally erases one or more words at random, is modified to mask specific words, so that the Masked LM task learns representations of those words in context. The procedure is as follows:
Substep a: First, a set of seed words is selected and the polarity (positive or negative) of each word is labeled. For example, positive words include "liked" and "good", and negative words include "disliked", "not good" and "embarrassed".
Substep b: More emotional attribute words are mined from the selected seed words. The mining proceeds as follows. All comment contents are first segmented into words with an open-source word segmentation tool (Stanford CoreNLP), which also gives the part of speech of each word. After the full word set and the part of speech of each word are obtained, all adjectives are collected. Then the relevance of every adjective to the selected seed words is calculated. The calculation formula is as follows:
where P(w_1, w_2) denotes the probability that the words w_1 and w_2 occur together; P(w_1) denotes the probability that the word w_1 occurs; and P(w_2) denotes the probability that the word w_2 occurs. The relevance of each adjective to the selected seed words is calculated with this formula, giving its relevance PP to the positive words and its relevance PN to the negative words. The difference PP - PN is then calculated: if the difference is positive the adjective is positive, and if the difference is negative the adjective is negative. In this way all positive and negative adjectives are mined from the word set.
Substep c: According to the parts of speech of all the words obtained in substep b, the nouns and adjectives of each comment sentence are selected and combined into noun-adjective word pairs.
Substep d: The sample comment information in the training data, with special symbols removed, is pre-trained with the BERT pre-training model. BERT is a multi-task model whose pre-training consists of two self-supervised tasks, MLM and NSP. In the MLM task, the adjectives with polarity obtained in substep b and the noun-adjective word pairs obtained in substep c are masked; the remaining training steps are the same as in the standard BERT training process. The trained emotion classification model is thereby obtained.
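The seed-word mining of substep b can be sketched as follows. The relevance formula is not reproduced in this text; because it is defined through P(w_1, w_2), P(w_1) and P(w_2), the sketch assumes pointwise mutual information, log2(P(w_1, w_2) / (P(w_1) P(w_2))), estimated from comment-level co-occurrence. The function name, the base-2 logarithm, the treatment of unseen pairs and the "neutral" label for a zero difference are assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_polarity(comments, positive_seeds, negative_seeds, candidates):
    """Label each candidate adjective by comparing PP and PN.

    comments: list of token lists, one list per already-segmented comment.
    Returns a dict mapping each candidate to 'positive', 'negative' or 'neutral'.
    """
    n = len(comments)
    word_count = Counter()  # number of comments containing each word
    pair_count = Counter()  # number of comments containing both words of a pair
    for tokens in comments:
        unique = set(tokens)
        word_count.update(unique)
        pair_count.update(frozenset(pair) for pair in combinations(sorted(unique), 2))

    def pmi(w1, w2):
        joint = pair_count[frozenset((w1, w2))]
        if joint == 0:
            return 0.0  # assumption: a pair never seen together adds no evidence
        # log2( (joint/n) / ((count1/n) * (count2/n)) )
        return math.log2(joint * n / (word_count[w1] * word_count[w2]))

    result = {}
    for word in candidates:
        pp = sum(pmi(word, seed) for seed in positive_seeds)  # relevance to positive seeds
        pn = sum(pmi(word, seed) for seed in negative_seeds)  # relevance to negative seeds
        diff = pp - pn
        result[word] = "positive" if diff > 0 else "negative" if diff < 0 else "neutral"
    return result
```

For example, an adjective that appears only in comments that also contain a positive seed word gets PP > 0 and PN = 0, and is therefore labeled positive.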
After the emotion classification model is obtained, it is used to calculate the emotion classification of each user's comment content on each watched content (namely, the content to be recommended). A positive emotion classification is treated as a recommendation, and a negative emotion classification is treated as a non-recommendation.
After the emotion classification is determined, the Euclidean distance U_ub between the target user u and each other user b in the user category to which the target user u belongs is calculated, and the second probability that each content to be recommended is recommended to the target user is calculated according to the Euclidean distance and the emotion classification. The specific calculation formula can be expressed as:
where p_uj2 is the second probability that the j-th content to be recommended is recommended to the target user u; U_ub is the Euclidean distance between the target user u and the user b; r is the emotion classification given by the user b to the content j to be recommended, where recommendation is 1, non-recommendation is -1 and no comment is 0; and q is the number of users in the user category to which the target user u belongs.
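The second-probability formula is not reproduced in this text. As an illustration only, the sketch below assumes that each other user b contributes a vote r weighted by 1 / (1 + U_ub), so that closer users count more, and that the sum is averaged over q. Both the weighting and the choice of q as the number of other users in the category are assumptions.

```python
import math


def second_probability(target_vec, neighbors):
    """Hypothetical p_uj2 for one content to be recommended.

    target_vec: feature vector of the target user u.
    neighbors: list of (feature_vector, r) pairs for the other users b in the
        same user category, where r is 1 (recommend), -1 (do not recommend)
        or 0 (no comment).
    """
    q = len(neighbors)
    if q == 0:
        return 0.0
    total = 0.0
    for vec, r in neighbors:
        u_ub = math.dist(target_vec, vec)  # Euclidean distance U_ub
        total += r / (1.0 + u_ub)          # assumed inverse-distance weighting
    return total / q
```

Under this assumed form the value lies in [-1, 1] rather than [0, 1], so it behaves as a signed score; a system that needs a true probability would have to rescale it.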
Step 140: The heat score of each content to be recommended is obtained according to the first probability and the second probability.
After the first probability and the second probability are obtained, determining the target probability of the user u watching the jth content to be recommended according to the following calculation formula:
After the target probability is obtained, the heat score of each content to be recommended is determined according to the target probability. Specifically, the heat score can be calculated according to the following formula:
where Score_j is the heat score of the j-th content to be recommended, and C is the total number of all users.
Step 150: The content to be recommended is recommended to the target user according to the popularity score.
After the popularity scores of the contents to be recommended are obtained, the contents with high popularity can be determined from the scores and recommended to the corresponding target user u. For example, the contents to be recommended can be ranked by popularity score, and the top N contents recommended to the corresponding target user.
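The ranking in step 150 can be sketched as follows, assuming the popularity scores have already been computed and are held in a dictionary keyed by content id. The function name and the explicit exclusion of already watched content are illustrative choices.

```python
import heapq


def recommend_top_n(scores, watched, n):
    """Return the ids of the n highest-scoring contents the user has not watched.

    scores: dict mapping content id to popularity score.
    watched: set of content ids already watched by the target user.
    """
    candidates = {cid: s for cid, s in scores.items() if cid not in watched}
    return heapq.nlargest(n, candidates, key=candidates.get)
```

`heapq.nlargest` avoids sorting the whole candidate set, which is convenient when N is much smaller than the number of contents to be recommended.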
According to the method and the device, a plurality of content classifications are obtained by classifying the watched content of all users, then the first probability of watching each content to be recommended by a target user is calculated according to the preference degree of the target user to each content classification and the similarity of the content to be recommended and the watched content of the target user, the second probability of recommending each content to be recommended to the target user is calculated according to the comment information of the user class to which the target user belongs to the content to be recommended, the heat score of each content to be recommended is obtained according to the first probability and the second probability, and finally the content to be recommended is recommended to the target user according to the heat score, so that the accuracy of information recommendation can be effectively improved.
Fig. 2 shows a schematic structural diagram of an information pushing apparatus provided in an embodiment of the present invention. As shown in fig. 2, the apparatus 200 includes:
a content classification module 210, configured to classify the watched content of all users to obtain a plurality of content classifications;
a first probability calculation module 220, configured to calculate a first probability that the target user watches each content to be recommended according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, where the content to be recommended is any content in the content classifications other than the content already watched by the target user;
a second probability calculation module 230, configured to calculate, according to the comment information of the user category to which the target user belongs on the content to be recommended, a second probability that each content to be recommended is recommended to the target user;
a popularity scoring module 240, configured to obtain the popularity score of each content to be recommended according to the first probability and the second probability; and
a recommending module 250, configured to recommend the content to be recommended to the target user according to the popularity score.
In an optional manner, classifying the watched content of all users to obtain a plurality of content classifications includes: selecting K contents from all watched contents as classification centers; calculating the correlation between each watched content and each classification center; obtaining a correlation matrix among the watched contents according to the correlations; and iteratively updating the classification centers and the correlation matrix until an iteration threshold is met, thereby obtaining the plurality of content classifications.
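The classification steps above can be sketched as a small medoid-style clustering loop. This is only an illustration under stated assumptions: cosine similarity stands in for the "correlation", the first K contents seed the classification centers, and a center is updated to the member most correlated with the rest of its classification. None of these choices, nor the names `classify_contents` and `max_iter`, come from the embodiment.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, used here as an assumed correlation measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify_contents(features: List[Sequence[float]], k: int,
                      max_iter: int = 20) -> Tuple[List[int], List[int]]:
    """Return (classification label per content, index of each classification center)."""
    centers = list(range(k))  # assumption: the first K contents seed the centers
    labels: List[int] = []
    for _ in range(max_iter):  # max_iter plays the role of the iteration threshold
        # Correlation matrix: one row per content, one column per center.
        corr = [[cosine(f, features[c]) for c in centers] for f in features]
        labels = [row.index(max(row)) for row in corr]
        new_centers = []
        for j in range(k):
            members = [i for i, lab in enumerate(labels) if lab == j]
            if not members:
                new_centers.append(centers[j])  # keep the old center if a class is empty
                continue
            # New center: the member with the highest total correlation to its class.
            best = max(members, key=lambda i: sum(
                cosine(features[i], features[m]) for m in members))
            new_centers.append(best)
        if new_centers == centers:  # converged before reaching the threshold
            break
        centers = new_centers
    return labels, centers


# Hypothetical two-dimensional content features.
contents = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, centers = classify_contents(contents, k=2)
print(labels, centers)
```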
In an optional manner, calculating the first probability that the target user watches each content to be recommended, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, includes: determining each watched content of the target user and the classification center of each content classification; and calculating the preference degree of the target user for each content classification according to each watched content of the target user and the classification center of each content classification.
In an optional manner, calculating the first probability that the target user watches each content to be recommended, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, includes: determining a first feature vector of the watched content of the target user; determining a second feature vector of the content to be recommended; and determining the similarity between the content to be recommended and the watched content of the target user according to the first feature vector and the second feature vector.
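The similarity step above can be illustrated with a short sketch. It assumes cosine similarity as the measure and averages over all of the target user's watched contents; the passage here does not fix either choice, and the function names and sample vectors are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(first: Sequence[float], second: Sequence[float]) -> float:
    """Cosine similarity between a first and a second feature vector."""
    dot = sum(x * y for x, y in zip(first, second))
    norm = math.sqrt(sum(x * x for x in first)) * math.sqrt(sum(y * y for y in second))
    return dot / norm if norm else 0.0


def similarity_to_watched(watched: List[Sequence[float]],
                          candidate: Sequence[float]) -> float:
    """Mean similarity between a content to be recommended and the watched contents."""
    if not watched:
        return 0.0
    return sum(cosine_similarity(w, candidate) for w in watched) / len(watched)


# Hypothetical feature vectors: two watched contents and one candidate.
watched_vectors = [[1.0, 0.0], [0.0, 1.0]]
candidate_vector = [1.0, 0.0]
print(similarity_to_watched(watched_vectors, candidate_vector))
```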
In an optional manner, calculating the second probability that each content to be recommended is recommended to the target user, according to the comment information of the user category to which the target user belongs on the content to be recommended, includes: acquiring the viewing information of all users; obtaining a feature vector of each user according to the viewing information and a word vector model; and clustering the users according to the feature vectors to obtain the user category to which each user belongs.
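The user-clustering steps above might look like the sketch below. It assumes that a user's feature vector is the average of the word vectors of the words in that user's viewing information and that k-means is the clustering method; the toy `WORD_VECTORS` table stands in for a trained word vector model, and all names and data are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical toy word vectors; a real system would load a trained word vector model.
WORD_VECTORS: Dict[str, List[float]] = {
    "football": [1.0, 0.0],
    "basketball": [0.9, 0.1],
    "opera": [0.0, 1.0],
    "ballet": [0.1, 0.9],
}


def user_vector(words: Sequence[str], vectors: Dict[str, List[float]]) -> List[float]:
    """Average the word vectors of the known words in a user's viewing information."""
    dim = len(next(iter(vectors.values())))
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[List[float]], k: int, max_iter: int = 50) -> List[int]:
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical viewing information, reduced to keywords per user.
viewing = {"u1": ["football", "basketball"], "u2": ["opera"],
           "u3": ["football"], "u4": ["ballet", "opera"]}
vectors = [user_vector(words, WORD_VECTORS) for words in viewing.values()]
user_category = dict(zip(viewing, kmeans(vectors, k=2)))
print(user_category)
```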
In an optional manner, calculating the second probability that each content to be recommended is recommended to the target user, according to the comment information of the user category to which the target user belongs on the content to be recommended, includes: determining, according to the comment information of the user category to which the target user belongs on the content to be recommended, the emotion classification of the content to be recommended by other users in that user category; calculating the Euclidean distances between the target user and the other users in the user category to which the target user belongs; and calculating the second probability that each content to be recommended is recommended to the target user according to the Euclidean distances and the emotion classifications.
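One way to combine the Euclidean distances and emotion classifications described above is a distance-weighted vote. The weighting 1 / (1 + distance) and the encoding of the emotion classification as 1 (positive) or 0 (negative) are assumptions made purely for illustration, as is the function name `second_probability`; the passage does not state the combination rule.

```python
import math
from typing import List, Sequence, Tuple


def second_probability(target: Sequence[float],
                       others: List[Tuple[Sequence[float], int]]) -> float:
    """Distance-weighted share of positive emotion classifications.

    `others` holds (feature vector, emotion) pairs for the other users in the
    target user's category, with emotion 1 for positive and 0 for negative.
    """
    weighted = 0.0
    total = 0.0
    for vector, emotion in others:
        # Closer users get larger weights (assumed weighting scheme).
        weight = 1.0 / (1.0 + math.dist(target, vector))
        weighted += weight * emotion
        total += weight
    return weighted / total if total else 0.0


# Hypothetical data: one nearby user liked the content, one distant user did not.
target_user = [0.0, 0.0]
same_category = [([0.0, 0.0], 1), ([3.0, 4.0], 0)]
print(second_probability(target_user, same_category))
```

With this weighting, the opinion of a user at distance 0 counts six times as much as that of a user at distance 5.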
In an optional manner, determining the emotion classification of the content to be recommended by other users in the user category to which the target user belongs, according to the comment information of that user category on the content to be recommended, includes: obtaining the comment information of the user category to which the target user belongs on the content to be recommended; and inputting the comment information into an emotion classification model to obtain the emotion classification of the content to be recommended by the other users in that user category, where the emotion classification model is trained in advance on emotion classification samples.
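The embodiment does not say which kind of emotion classification model is used. As a stand-in, the sketch below trains a small multinomial naive Bayes classifier on labelled comment samples and then classifies new comment information. The model choice, the whitespace tokenisation, and the sample comments are all assumptions.

```python
import math
from collections import Counter, defaultdict
from typing import List, Tuple


def train(samples: List[Tuple[str, str]]):
    """Count words per emotion label from (comment, label) training samples."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab


def classify(model, text: str) -> str:
    """Return the label with the highest log-probability (Laplace smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical emotion classification samples.
samples = [
    ("great plot loved it", "positive"),
    ("wonderful acting great fun", "positive"),
    ("boring plot hated it", "negative"),
    ("terrible acting boring", "negative"),
]
model = train(samples)
print(classify(model, "great fun"), classify(model, "boring"))
```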
The specific working process of the information pushing apparatus according to the embodiment of the present invention is substantially the same as the specific flow steps of the above method embodiments, and details are not repeated here.
As in the method embodiment, the apparatus classifies the watched content of all users, calculates the first probability from the target user's preference degree for each content classification and the similarity to the target user's watched content, calculates the second probability from the comment information of the user category to which the target user belongs, and combines the two probabilities into a heat score for each content to be recommended. Recommending content to the target user according to this heat score can effectively improve the accuracy of information recommendation.
Fig. 3 is a schematic structural diagram of a computer device according to an embodiment of the present invention, and the specific embodiment of the present invention does not limit the specific implementation of the computer device.
As shown in fig. 3, the computer device may include: a processor 402, a communication interface 404, a memory 406, and a communication bus 408.
The processor 402, the communication interface 404, and the memory 406 communicate with each other via the communication bus 408. The communication interface 404 is used for communicating with network elements of other devices, such as clients or other servers. The processor 402 is configured to execute the program 410, and may specifically perform the relevant steps in the above embodiments of the information pushing method.
In particular, the program 410 may include program code comprising computer-executable instructions.
The processor 402 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computer device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 406 is used for storing the program 410. The memory 406 may comprise high-speed RAM, and may also include non-volatile memory, such as at least one disk memory.
The program 410 may specifically be invoked by the processor 402 to cause the computer device to perform the following operations:
classifying the watched contents of all users to obtain a plurality of content classifications;
calculating a first probability that the target user watches each content to be recommended according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, where the content to be recommended is any content in the content classifications other than the content already watched by the target user;
calculating a second probability that each content to be recommended is recommended to the target user according to the comment information of the user category to which the target user belongs on the content to be recommended;
according to the first probability and the second probability, obtaining the heat score of each content to be recommended;
and recommending the content to be recommended to a target user according to the popularity score.
In an optional manner, classifying the watched content of all users to obtain a plurality of content classifications includes: selecting K contents from all watched contents as classification centers; calculating the correlation between each watched content and each classification center; obtaining a correlation matrix among the watched contents according to the correlations; and iteratively updating the classification centers and the correlation matrix until an iteration threshold is met, thereby obtaining the plurality of content classifications.
In an optional manner, calculating the first probability that the target user watches each content to be recommended, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, includes: determining each watched content of the target user and the classification center of each content classification; and calculating the preference degree of the target user for each content classification according to each watched content of the target user and the classification center of each content classification.
In an optional manner, calculating the first probability that the target user watches each content to be recommended, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, includes: determining a first feature vector of the watched content of the target user; determining a second feature vector of the content to be recommended; and determining the similarity between the content to be recommended and the watched content of the target user according to the first feature vector and the second feature vector.
In an optional manner, calculating the second probability that each content to be recommended is recommended to the target user, according to the comment information of the user category to which the target user belongs on the content to be recommended, includes: acquiring the viewing information of all users; obtaining a feature vector of each user according to the viewing information and a word vector model; and clustering the users according to the feature vectors to obtain the user category to which each user belongs.
In an optional manner, calculating the second probability that each content to be recommended is recommended to the target user, according to the comment information of the user category to which the target user belongs on the content to be recommended, includes: determining, according to the comment information of the user category to which the target user belongs on the content to be recommended, the emotion classification of the content to be recommended by other users in that user category; calculating the Euclidean distances between the target user and the other users in the user category to which the target user belongs; and calculating the second probability that each content to be recommended is recommended to the target user according to the Euclidean distances and the emotion classifications.
In an optional manner, determining the emotion classification of the content to be recommended by other users in the user category to which the target user belongs, according to the comment information of that user category on the content to be recommended, includes: obtaining the comment information of the user category to which the target user belongs on the content to be recommended; and inputting the comment information into an emotion classification model to obtain the emotion classification of the content to be recommended by the other users in that user category, where the emotion classification model is trained in advance on emotion classification samples.
As in the method embodiment, the computer device classifies the watched content of all users, calculates the first probability from the target user's preference degree for each content classification and the similarity to the target user's watched content, calculates the second probability from the comment information of the user category to which the target user belongs, and combines the two probabilities into a heat score for each content to be recommended. Recommending content to the target user according to this heat score can effectively improve the accuracy of information recommendation.
An embodiment of the present invention provides a computer-readable storage medium, where the storage medium stores at least one executable instruction, and when the executable instruction runs on a computer device, the computer device is enabled to execute an information pushing method in any method embodiment described above.
The executable instructions may be specifically configured to cause the computer device to perform the following:
classifying the watched contents of all users to obtain a plurality of content classifications;
calculating a first probability that the target user watches each content to be recommended according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, where the content to be recommended is any content in the content classifications other than the content already watched by the target user;
calculating a second probability that each content to be recommended is recommended to the target user according to comment information of the user category to which the target user belongs on the content to be recommended;
according to the first probability and the second probability, obtaining the heat score of each content to be recommended;
and recommending the content to be recommended to a target user according to the popularity score.
In an optional manner, classifying the watched content of all users to obtain a plurality of content classifications includes: selecting K contents from all watched contents as classification centers; calculating the correlation between each watched content and each classification center; obtaining a correlation matrix among the watched contents according to the correlations; and iteratively updating the classification centers and the correlation matrix until an iteration threshold is met, thereby obtaining the plurality of content classifications.
In an optional manner, calculating the first probability that the target user watches each content to be recommended, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, includes: determining each watched content of the target user and the classification center of each content classification; and calculating the preference degree of the target user for each content classification according to each watched content of the target user and the classification center of each content classification.
In an optional manner, calculating the first probability that the target user watches each content to be recommended, according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user, includes: determining a first feature vector of the watched content of the target user; determining a second feature vector of the content to be recommended; and determining the similarity between the content to be recommended and the watched content of the target user according to the first feature vector and the second feature vector.
In an optional manner, calculating the second probability that each content to be recommended is recommended to the target user, according to the comment information of the user category to which the target user belongs on the content to be recommended, includes: acquiring the viewing information of all users; obtaining a feature vector of each user according to the viewing information and a word vector model; and clustering the users according to the feature vectors to obtain the user category to which each user belongs.
In an optional manner, calculating the second probability that each content to be recommended is recommended to the target user, according to the comment information of the user category to which the target user belongs on the content to be recommended, includes: determining, according to the comment information of the user category to which the target user belongs on the content to be recommended, the emotion classification of the content to be recommended by other users in that user category; calculating the Euclidean distances between the target user and the other users in the user category to which the target user belongs; and calculating the second probability that each content to be recommended is recommended to the target user according to the Euclidean distances and the emotion classifications.
In an optional manner, determining the emotion classification of the content to be recommended by other users in the user category to which the target user belongs, according to the comment information of that user category on the content to be recommended, includes: obtaining the comment information of the user category to which the target user belongs on the content to be recommended; and inputting the comment information into an emotion classification model to obtain the emotion classification of the content to be recommended by the other users in that user category, where the emotion classification model is trained in advance on emotion classification samples.
As in the method embodiment, the instructions stored on the storage medium classify the watched content of all users, calculate the first probability from the target user's preference degree for each content classification and the similarity to the target user's watched content, calculate the second probability from the comment information of the user category to which the target user belongs, and combine the two probabilities into a heat score for each content to be recommended. Recommending content to the target user according to this heat score can effectively improve the accuracy of information recommendation.
The embodiment of the invention provides an information pushing device, which is used for executing the information pushing method.
Embodiments of the present invention provide a computer program, where the computer program can be called by a processor to enable a computer device to execute an information push method in any of the above method embodiments.
The embodiment of the present invention provides a computer program product, where the computer program product includes a computer program stored on a computer-readable storage medium, and the computer program includes program instructions, when the program instructions are run on a computer, the computer is caused to execute the information push method in any of the above method embodiments.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not indicate any ordering. These words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.
Claims (10)
1. An information pushing method, characterized in that the method comprises:
classifying the watched contents of all users to obtain a plurality of content classifications;
calculating a first probability that the target user watches each content to be recommended according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the content watched by the target user; the content to be recommended is any content in the content classifications other than the content watched by the target user;
calculating a second probability that each content to be recommended is recommended to the target user according to comment information of the user category to which the target user belongs on the content to be recommended;
according to the first probability and the second probability, obtaining the heat score of each content to be recommended;
and recommending the content to be recommended to a target user according to the popularity score.
2. The method of claim 1, wherein the classifying the watched contents of all users to obtain a plurality of content classifications comprises:
selecting K contents from all watched contents as classification centers;
calculating the correlation of all the viewed contents with each classification center;
obtaining a correlation matrix among the watched contents according to the correlation;
and iteratively updating the classification center and the correlation matrix until an iteration threshold is met, and obtaining a plurality of content classifications.
3. The method according to claim 1, wherein the calculating a first probability that the target user watches each content to be recommended according to the preference degree of the target user for each content category and the similarity between the content to be recommended and the watched content of the target user comprises:
determining each viewed content of a target user and a classification center of each content classification;
and calculating the preference degree of the target user to each content classification according to each watched content of the target user and the classification center of each content classification.
4. The method according to any one of claims 1 to 3, wherein calculating the first probability that the target user watches each of the contents to be recommended according to the preference degree of the target user for each of the content categories and the similarity between the contents to be recommended and the watched contents of the target user comprises:
determining a first feature vector of viewed content of the target user;
determining a second feature vector of the content to be recommended;
and determining the similarity between the content to be recommended and the watched content of the target user according to the first feature vector and the second feature vector.
5. The method according to any one of claims 1 to 3, wherein the calculating of the second probability that each content to be recommended is recommended to the target user according to the comment information of the user category to which the target user belongs for the content to be recommended comprises:
acquiring the watching information of all users;
obtaining a feature vector of each user according to the viewing information and the word vector model;
and clustering according to the feature vectors to obtain the user category to which each user belongs.
6. The method according to claim 5, wherein the calculating a second probability that each content to be recommended is recommended to the target user according to comment information of the user category to which the target user belongs on the content to be recommended comprises:
determining, according to the comment information of the user category to which the target user belongs on the content to be recommended, the emotion classification of the content to be recommended by other users in the user category to which the target user belongs;
calculating Euclidean distances between the target user and other users in the user category to which the target user belongs;
and calculating a second probability that each content to be recommended is recommended to the target user according to the Euclidean distance and the emotion classification.
7. The method according to claim 6, wherein the determining the emotion classification of the content to be recommended by other users in the user category to which the target user belongs, according to the comment information of the user category to which the target user belongs on the content to be recommended, comprises:
obtaining the comment information of the user category to which the target user belongs on the content to be recommended;
inputting the comment information into an emotion classification model to obtain the emotion classification of the content to be recommended by other users in the user category to which the target user belongs; the emotion classification model is obtained in advance by training on emotion classification samples.
8. An information pushing apparatus, characterized in that the apparatus comprises:
the content classification module is used for classifying the watched contents of all the users to obtain a plurality of content classifications;
the first probability calculation module is used for calculating a first probability that the target user watches each content to be recommended according to the preference degree of the target user for each content classification and the similarity between the content to be recommended and the watched content of the target user; the content to be recommended is any content in the content classifications other than the content watched by the target user;
the second probability calculation module is used for calculating a second probability that each content to be recommended is recommended to the target user according to comment information of the user category to which the target user belongs on the content to be recommended;
the popularity scoring module is used for obtaining popularity scores of the contents to be recommended according to the first probability and the second probability;
and the recommending module is used for recommending the content to be recommended to the target user according to the popularity score.
9. A computer device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to perform the operations of the information pushing method according to any one of claims 1-7.
10. A computer-readable storage medium, having at least one executable instruction stored therein, which when executed on a computer device, causes the computer device to perform the operations of the information push method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210535910.4A CN115048504A (en) | 2022-05-17 | 2022-05-17 | Information pushing method and device, computer equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210535910.4A CN115048504A (en) | 2022-05-17 | 2022-05-17 | Information pushing method and device, computer equipment and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115048504A (en) | 2022-09-13 |
Family
ID=83159804
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210535910.4A Pending CN115048504A (en) | 2022-05-17 | 2022-05-17 | Information pushing method and device, computer equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115048504A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116628179A (en) * | 2023-05-30 | 2023-08-22 | 道有道科技集团股份公司 | User operation data visualization and man-machine interaction recommendation method |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334640A (en) * | 2018-03-21 | 2018-07-27 | 北京奇艺世纪科技有限公司 | A kind of video recommendation method and device |
CN111782968A (en) * | 2020-07-02 | 2020-10-16 | 北京字节跳动网络技术有限公司 | Content recommendation method and device, readable medium and electronic equipment |
CN112115321A (en) * | 2019-06-03 | 2020-12-22 | 阿里巴巴集团控股有限公司 | Training method and device of content recommendation model, electronic equipment and storage medium |
CN112860989A (en) * | 2021-01-20 | 2021-05-28 | 平安科技(深圳)有限公司 | Course recommendation method and device, computer equipment and storage medium |
WO2021251806A1 (en) * | 2020-06-10 | 2021-12-16 | Samsung Electronics Co., Ltd. | Content recommendation method and system |
CN114090891A (en) * | 2021-11-24 | 2022-02-25 | 土巴兔集团股份有限公司 | Personalized content recommendation method, device, equipment and storage medium |
CN114461893A (en) * | 2020-11-09 | 2022-05-10 | 腾讯科技(深圳)有限公司 | Information recommendation method, related device, equipment and storage medium |
- 2022-05-17: CN application CN202210535910.4A, publication CN115048504A (en), status: active, Pending
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334640A (en) * | 2018-03-21 | 2018-07-27 | 北京奇艺世纪科技有限公司 | A kind of video recommendation method and device |
CN112115321A (en) * | 2019-06-03 | 2020-12-22 | 阿里巴巴集团控股有限公司 | Training method and device of content recommendation model, electronic equipment and storage medium |
WO2021251806A1 (en) * | 2020-06-10 | 2021-12-16 | Samsung Electronics Co., Ltd. | Content recommendation method and system |
CN111782968A (en) * | 2020-07-02 | 2020-10-16 | 北京字节跳动网络技术有限公司 | Content recommendation method and device, readable medium and electronic equipment |
CN114461893A (en) * | 2020-11-09 | 2022-05-10 | 腾讯科技(深圳)有限公司 | Information recommendation method, related device, equipment and storage medium |
CN112860989A (en) * | 2021-01-20 | 2021-05-28 | 平安科技(深圳)有限公司 | Course recommendation method and device, computer equipment and storage medium |
CN114090891A (en) * | 2021-11-24 | 2022-02-25 | 土巴兔集团股份有限公司 | Personalized content recommendation method, device, equipment and storage medium |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116628179A (en) * | 2023-05-30 | 2023-08-22 | 道有道科技集团股份公司 | User operation data visualization and man-machine interaction recommendation method |
CN116628179B (en) * | 2023-05-30 | 2023-12-22 | 道有道科技集团股份公司 | User operation data visualization and man-machine interaction recommendation method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110162593B (en) | Search result processing and similarity model training method and device | |
US11816888B2 (en) | Accurate tag relevance prediction for image search | |
CN108073568B (en) | Keyword extraction method and device | |
CN107515877B (en) | Sensitive subject word set generation method and device | |
CN110968684B (en) | Information processing method, device, equipment and storage medium | |
Meng et al. | Leveraging concept association network for multimedia rare concept mining and retrieval | |
US8499008B2 (en) | Mixing knowledge sources with auto learning for improved entity extraction | |
US9218364B1 (en) | Monitoring an any-image labeling engine | |
US20170236055A1 (en) | Accurate tag relevance prediction for image search | |
WO2015165372A1 (en) | Method and apparatus for classifying object based on social networking service, and storage medium | |
CN110083729B (en) | Image searching method and system | |
US20180046721A1 (en) | Systems and Methods for Automatic Customization of Content Filtering | |
CN107943792B (en) | Statement analysis method and device, terminal device and storage medium | |
CN113254643B (en) | Text classification method and device, electronic equipment and text classification program | |
US11803971B2 (en) | Generating improved panoptic segmented digital images based on panoptic segmentation neural networks that utilize exemplar unknown object classes | |
Le et al. | NII-HITACHI-UIT at TRECVID 2016. | |
CN107239564B (en) | Text label recommendation method based on supervision topic model | |
CN107292349A (en) | The zero sample classification method based on encyclopaedic knowledge semantically enhancement, device | |
CN110751027B (en) | Pedestrian re-identification method based on deep multi-instance learning | |
Estevez-Velarde et al. | AutoML strategy based on grammatical evolution: A case study about knowledge discovery from text | |
WO2022148108A1 (en) | Systems, devices and methods for distributed hierarchical video analysis | |
CN110008365A (en) | A kind of image processing method, device, equipment and readable storage medium storing program for executing | |
CN113704623A (en) | Data recommendation method, device, equipment and storage medium | |
CN111708890A (en) | Search term determining method and related device | |
CN115048504A (en) | Information pushing method and device, computer equipment and computer readable storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||