CN115617944A - Content recommendation method and device, storage medium and electronic equipment

Info

Publication number: CN115617944A
Application number: CN202211243259.XA
Authority: CN
Inventors: 穆学锋; 谈雪娇; 徐若易
Original assignee: Hangzhou Netease Cloud Music Technology Co Ltd
Current assignee: Hangzhou Netease Cloud Music Technology Co Ltd
Priority date: 2022-10-11
Filing date: 2022-10-11
Publication date: 2023-01-17

Abstract

The present disclosure relates to the field of computer technologies, and in particular, to a content recommendation method, a content recommendation apparatus, a storage medium, and an electronic device. The content recommendation method comprises the following steps: acquiring published target user original content of a target user who sends a request; acquiring target text content in the target user original content and generating a target keyword set and a target semantic vector according to the target text content; recalling first user original content according to the target keyword set and the target semantic vector; calculating a composite score between each first user original content and the target user original content; and, according to the composite score, determining user original content to be recommended from the first user original content and recommending it to the target user. The method and apparatus can recommend to the user the highest-scoring user original content that expresses the user's sentiment, thereby improving the user's social ability.

Description

Content recommendation method and device, storage medium and electronic equipment

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a content recommendation method, a content recommendation apparatus, a storage medium, and an electronic device.

Background

Currently, when a user is moved and wants to express that feeling while playing music on a music platform, the user has to edit and share user original content (User Generated Content, UGC) in order to find others who share the sentiment. However, content that the user edits alone is often too monotonous to show the user's social ability. Therefore, a content recommendation method is needed that recommends sentiment-expressing user original content to the user, so as to improve the user's social ability.

It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.

Disclosure of Invention

The present disclosure is directed to a content recommendation method, device, storage medium, and electronic device, so as to overcome at least some of the problems of low data acquisition efficiency and low computational efficiency due to the limitations and disadvantages of the related art.

Additional features and advantages of the disclosure will be set forth in the detailed description which follows, or in part will be obvious from the description, or may be learned by practice of the disclosure.

According to a first aspect of the present disclosure, there is provided a content recommendation method including:

acquiring published target user original content of a target user who sends a request, wherein the request is used for requesting to acquire the user original content;

acquiring target text content in the original content of the target user and generating a target keyword set and a target semantic vector according to the target text content;

recalling original content of each first user according to the target keyword set and the target semantic vector;

calculating a composite score between the first user original content and the target user original content;

and, according to the composite score, determining user original content to be recommended from the first user original content, and recommending the user original content to be recommended to the target user.

In an exemplary embodiment of the present disclosure, the recalling each first user original content according to the keyword set and the semantic vector comprises:

acquiring second text content in second user original content, and generating a second keyword set and a second semantic vector according to the second text content, wherein the second user original content is other user original content except the target user original content;

calculating keyword matching scores of the target keyword set and the second keyword set;

calculating semantic matching scores of the target semantic vector and the second semantic vector;

recalling the original content of the first user according to the keyword matching score and the semantic matching score; the first user original content is user original content with the keyword matching score being larger than or equal to a first preset threshold value in the second user original content or user original content with the semantic matching score being larger than or equal to a second preset threshold value in the second user original content.
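The recall rule above can be sketched as follows. This is a minimal illustration only: the function name `recall_candidates`, the dictionary layout of the scores, and the threshold values are assumptions made for this example and are not specified in the disclosure.

```python
from typing import Dict, List


def recall_candidates(
    scores: Dict[str, Dict[str, float]],
    keyword_threshold: float,
    semantic_threshold: float,
) -> List[str]:
    """Return IDs of second user original content recalled as first user original content.

    An item is recalled when its keyword matching score reaches the first
    preset threshold OR its semantic matching score reaches the second one.
    """
    recalled = []
    for content_id, s in scores.items():
        if s["keyword"] >= keyword_threshold or s["semantic"] >= semantic_threshold:
            recalled.append(content_id)
    return recalled


# Hypothetical scores for three pieces of second user original content.
example_scores = {
    "ugc_a": {"keyword": 0.8, "semantic": 0.2},
    "ugc_b": {"keyword": 0.1, "semantic": 0.9},
    "ugc_c": {"keyword": 0.1, "semantic": 0.3},
}
recalled_ids = recall_candidates(
    example_scores, keyword_threshold=0.5, semantic_threshold=0.7
)
```

Because the two conditions are joined by "or", an item that matches on keywords alone or on semantics alone is still recalled, which widens the candidate pool before the composite score is computed.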

In an exemplary embodiment of the present disclosure, the calculating the keyword match scores of the target keyword set and the second keyword set comprises:

acquiring the same keywords in the target keyword set and the second keyword set;

obtaining the number of the same keywords and/or obtaining the weight of the same keywords;

and calculating the keyword matching scores of the target keyword set and the second keyword set according to the number of the same keywords and/or the weights of the same keywords.
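One possible reading of this embodiment is sketched below. The disclosure does not give a formula, so the choices here are assumptions: the score is the count of shared keywords when no weights are supplied, and the sum of the shared keywords' weights otherwise. The name `keyword_match_score` and the default weight of 1.0 are hypothetical.

```python
from typing import Dict, Optional, Set


def keyword_match_score(
    target_keywords: Set[str],
    second_keywords: Set[str],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Score two keyword sets by the keywords they have in common.

    Without weights the score is the number of shared keywords; with weights
    it is the sum of the weights of the shared keywords (assumed default 1.0
    for a keyword that has no weight entry).
    """
    shared = target_keywords & second_keywords
    if weights is None:
        return float(len(shared))
    return sum(weights.get(keyword, 1.0) for keyword in shared)


# Hypothetical keyword sets extracted from two pieces of user original content.
target = {"rain", "night", "Jay Chou"}
second = {"rain", "Jay Chou", "summer"}

count_score = keyword_match_score(target, second)
weighted_score = keyword_match_score(target, second, {"Jay Chou": 2.0, "rain": 0.5})
```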

In an exemplary embodiment of the present disclosure, the calculating the semantic matching score of the target semantic vector and the second semantic vector includes:

calculating the distance between the target semantic vector and the second semantic vector;

and determining semantic matching scores of the target semantic vector and the second semantic vector according to the distance between the target semantic vector and the second semantic vector.
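The disclosure does not name a distance metric or the mapping from distance to score. The sketch below assumes cosine distance and takes the score as one minus the distance, so that closer vectors score higher; both choices, and the function names, are assumptions for illustration.

```python
import math
from typing import Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine distance (1 - cosine similarity); 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def semantic_match_score(u: Sequence[float], v: Sequence[float]) -> float:
    """Assumed mapping: a smaller distance gives a larger matching score."""
    return 1.0 - cosine_distance(u, v)


# Parallel vectors score close to 1, orthogonal vectors score close to 0.
score_parallel = semantic_match_score([1.0, 2.0], [2.0, 4.0])
score_orthogonal = semantic_match_score([1.0, 0.0], [0.0, 1.0])
```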

In an exemplary embodiment of the present disclosure, the method further comprises:

acquiring a target media object of the original content of the target user;

acquiring a second media object of the second user original content;

calculating a media object matching score for the target media object and the second media object;

recalling the first user original content according to the media object matching score.

In an exemplary embodiment of the present disclosure, the calculating a media object matching score of the target media object and the second media object comprises:

acquiring a target user behavior vector of the target media object;

obtaining a second user behavior vector of the second media object;

and calculating the similarity between the target media object and the second media object according to the target user behavior vector and the second user behavior vector, and taking the similarity as the matching score of the media object.
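As a rough sketch of this step, a user behavior vector can be pictured as a sparse mapping from user ID to an interaction count for the media object, with cosine similarity as the similarity measure. Both the representation and the measure are assumptions for illustration; the disclosure only states that a similarity is computed from the two vectors.

```python
import math
from typing import Dict


def media_object_match_score(
    target_behavior: Dict[str, float],
    second_behavior: Dict[str, float],
) -> float:
    """Cosine similarity between two sparse user behavior vectors.

    Each vector maps a (hypothetical) user ID to how often that user
    interacted with the media object.
    """
    shared_users = target_behavior.keys() & second_behavior.keys()
    dot = sum(target_behavior[u] * second_behavior[u] for u in shared_users)
    norm_t = math.sqrt(sum(x * x for x in target_behavior.values()))
    norm_s = math.sqrt(sum(x * x for x in second_behavior.values()))
    if norm_t == 0.0 or norm_s == 0.0:
        return 0.0
    return dot / (norm_t * norm_s)


# Two songs played by the same users match strongly; disjoint audiences do not.
song_a = {"user_1": 3.0, "user_2": 1.0}
song_b = {"user_1": 6.0, "user_2": 2.0}
song_c = {"user_9": 4.0}

score_ab = media_object_match_score(song_a, song_b)
score_ac = media_object_match_score(song_a, song_c)
```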

In an exemplary embodiment of the present disclosure, the recalling the first user original content according to the media object matching score comprises:

determining designated user original content with the media object matching score greater than or equal to a third preset threshold from the second user original content;

recalling the designated user original content as the first user original content.

In an exemplary embodiment of the present disclosure, the generating a target keyword set and a target semantic vector according to the target text content includes:

acquiring an ID list of the target text, and inputting the ID list into a Named Entity Recognition (NER) model to acquire a probability matrix of the target text;

determining each target keyword in the target text according to the probability matrix and generating the target keyword set;

acquiring a word vector and a weight value of the target text content;

and performing weighted operation on each word vector according to the weight value to obtain the target semantic vector.
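The two halves of this embodiment can be sketched as below. The NER model itself is out of scope here, so the probability matrix is written by hand; the BIO label set, the whitespace joining of tokens, and the normalisation of the weighted sum by the total weight are all assumptions, since the disclosure does not specify them.

```python
from typing import List, Sequence, Set

LABELS = ("O", "B", "I")  # assumed BIO tag set; the disclosure does not name one


def extract_keywords(tokens: Sequence[str], prob_matrix: Sequence[Sequence[float]]) -> Set[str]:
    """Take the most probable label per token and collect B/I spans as keywords."""
    keywords: Set[str] = set()
    current: List[str] = []
    for token, probs in zip(tokens, prob_matrix):
        label = LABELS[max(range(len(probs)), key=probs.__getitem__)]
        if label == "B":
            if current:
                keywords.add(" ".join(current))
            current = [token]
        elif label == "I" and current:
            current.append(token)
        else:
            if current:
                keywords.add(" ".join(current))
            current = []
    if current:
        keywords.add(" ".join(current))
    return keywords


def weighted_semantic_vector(
    word_vectors: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of word vectors (normalising by the weight sum is an assumption)."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(word_vectors[0])
    return [
        sum(w * vec[i] for vec, w in zip(word_vectors, weights)) / total
        for i in range(dim)
    ]


# Hand-written probabilities standing in for the NER model output (rows: tokens).
tokens = ["listening", "to", "Jay", "Chou", "tonight"]
probabilities = [
    [0.90, 0.05, 0.05],
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.10, 0.10, 0.80],
    [0.90, 0.05, 0.05],
]
keyword_set = extract_keywords(tokens, probabilities)
semantic_vector = weighted_semantic_vector([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```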

In an exemplary embodiment of the present disclosure, the calculating a composite score between the first user original content and the target user original content comprises:

determining a keyword matching score and a semantic matching score between the first user original content and the target user original content;

and determining a composite score between the first user original content and the target user original content according to the keyword matching score and the semantic matching score.

In an exemplary embodiment of the present disclosure, the calculating a composite score between the first user original content and the target user original content further comprises:

determining a media object match score between the first user original content and the target user original content;

the determining a composite score between the first user original content and the target user original content according to the keyword matching score and the semantic matching score further comprises:

determining a composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, and the media object matching score.

In an exemplary embodiment of the present disclosure, the calculating a composite score between the first user original content and the target user original content further comprises:

acquiring a consumption preference value of the target user within a preset time period and a sharing heat value of the first user original content;

the determining a composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, and the media object matching score further comprises:

determining a composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing heat value.

In an exemplary embodiment of the present disclosure, the acquiring a consumption preference value of the target user within a preset time period and a sharing heat value of the first user original content comprises:

acquiring a first number of times that an in-platform user performs a first operation on the first user original content within the preset time period, wherein the first operation comprises at least one of the following operations: play, like, and comment;

determining the consumption preference value of the target user within the preset time period according to the first number of times;

acquiring a second number of times that a second operation is performed on the first user original content, wherein the second operation comprises a forwarding operation and/or a sharing operation;

and calculating the sharing heat value of the first user original content according to the second number of times.
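The disclosure states only that both values are derived from operation counts, not how. The sketch below counts events inside a time window and applies a logarithmic damping; the event layout, the operation names, and the `log(1 + count)` mapping are all assumptions chosen for illustration.

```python
import math
from typing import Iterable, Set, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, operation name); hypothetical layout

FIRST_OPERATIONS = {"play", "like", "comment"}
SECOND_OPERATIONS = {"forward", "share"}


def count_operations(
    events: Iterable[Event],
    operations: Set[str],
    start: float = float("-inf"),
    end: float = float("inf"),
) -> int:
    """Count events whose operation is in `operations` and whose time lies in [start, end]."""
    return sum(1 for ts, op in events if op in operations and start <= ts <= end)


def damped_value(count: int) -> float:
    """Assumed mapping from a raw count to a value: log(1 + count)."""
    return math.log1p(count)


# Hypothetical operation log for one piece of first user original content.
events = [(10.0, "play"), (20.0, "like"), (30.0, "share"), (500.0, "comment"), (40.0, "forward")]

first_count = count_operations(events, FIRST_OPERATIONS, start=0.0, end=100.0)
second_count = count_operations(events, SECOND_OPERATIONS)
consumption_preference_value = damped_value(first_count)
sharing_heat_value = damped_value(second_count)
```

The damping keeps a handful of very heavily shared items from dominating the later scoring; any monotonic mapping would serve the same illustrative purpose.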

In an exemplary embodiment of the present disclosure, the determining a composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing heat value comprises:

performing feature bucketing on the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing heat value respectively, to obtain a discrete value corresponding to the keyword matching score, a discrete value corresponding to the semantic matching score, a discrete value corresponding to the media object matching score, a discrete value corresponding to the consumption preference value, and a discrete value corresponding to the sharing heat value;

converting the discrete value corresponding to the keyword matching score, the discrete value corresponding to the semantic matching score, the discrete value corresponding to the media object matching score, the discrete value corresponding to the consumption preference value, and the discrete value corresponding to the sharing heat value into a keyword matching vector, a semantic matching vector, a media object matching vector, a consumption preference value vector, and a sharing heat value vector, respectively;

performing feature extraction on the keyword matching vector, the semantic matching vector, the media object matching vector, the consumption preference value vector, and the sharing heat value vector by using a deep model to obtain a first vector;

performing a feature cross operation on the keyword matching vector, the semantic matching vector, the media object matching vector, the consumption preference value vector, and the sharing heat value vector by using a cross model to obtain a second vector;

splicing (concatenating) the first vector and the second vector to obtain a splicing result;

and performing a sigmoid operation on the splicing result after it has been processed by a fully connected layer to obtain an operation result, and taking the operation result as the composite score.
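The scoring steps above resemble a Deep & Cross style network. The sketch below walks through them in plain Python with untrained random parameters, so the number it produces is meaningless except for lying between 0 and 1. The bucket boundaries, the dimensions, the single ReLU layer standing in for the deep model, and the single cross layer of the form x1 = x0 (w · x0) + b + x0 are all assumptions; the disclosure does not fix any of them.

```python
import bisect
import math
import random
from typing import List, Sequence

random.seed(0)  # untrained, random parameters: the output is illustrative only

BOUNDARIES = [0.25, 0.5, 0.75]  # assumed bucket edges, giving 4 buckets per feature
NUM_FEATURES, EMBED_DIM, HIDDEN_DIM = 5, 2, 4
INPUT_DIM = NUM_FEATURES * EMBED_DIM


def rand_matrix(rows: int, cols: int) -> List[List[float]]:
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# One embedding table per feature: bucket index -> vector.
EMBEDDINGS = [rand_matrix(len(BOUNDARIES) + 1, EMBED_DIM) for _ in range(NUM_FEATURES)]
DEEP_W = rand_matrix(HIDDEN_DIM, INPUT_DIM)
CROSS_W = [random.uniform(-0.5, 0.5) for _ in range(INPUT_DIM)]
CROSS_B = [0.0] * INPUT_DIM
OUT_W = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM + INPUT_DIM)]


def embed(features: Sequence[float]) -> List[float]:
    """Bucket each continuous feature, look up its vector, and flatten the result."""
    x: List[float] = []
    for table, value in zip(EMBEDDINGS, features):
        bucket = bisect.bisect_right(BOUNDARIES, value)  # discrete value
        x.extend(table[bucket])
    return x


def deep_layer(x: Sequence[float]) -> List[float]:
    """One ReLU layer standing in for the deep model (yields the first vector)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in DEEP_W]


def cross_layer(x0: Sequence[float]) -> List[float]:
    """One cross layer, x1 = x0 * (w . x0) + b + x0 (yields the second vector)."""
    dot = sum(w * xi for w, xi in zip(CROSS_W, x0))
    return [xi * dot + b + xi for xi, b in zip(x0, CROSS_B)]


def composite_score(features: Sequence[float]) -> float:
    """Splice the two vectors, apply a fully connected layer, then a sigmoid."""
    x0 = embed(features)
    spliced = deep_layer(x0) + cross_layer(x0)
    logit = sum(w * v for w, v in zip(OUT_W, spliced))
    return 1.0 / (1.0 + math.exp(-logit))


# Order: keyword score, semantic score, media object score, consumption preference, sharing heat.
example_score = composite_score([0.8, 0.6, 0.3, 0.9, 0.1])
```

In a real system the embedding tables and layer weights would be learned from interaction data; only the order of operations is meant to carry over from this sketch.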

In an exemplary embodiment of the disclosure, the determining, according to the composite score, user original content to be recommended from the respective first user original contents includes:

screening the first user original content according to the composite score and preset conditions to obtain screened first user original content;

and taking the screened original content of the first user as the original content of the user to be recommended.
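The preset conditions are not enumerated in the disclosure. As one assumed example, the sketch below keeps items whose composite score reaches a minimum value and returns at most `top_k` of them in descending score order; `select_recommendations`, `min_score`, and `top_k` are hypothetical names.

```python
from typing import Dict, List


def select_recommendations(
    composite_scores: Dict[str, float], min_score: float, top_k: int
) -> List[str]:
    """Screen first user original content by composite score.

    Assumed preset conditions: score >= min_score, and at most top_k items.
    """
    eligible = [(cid, s) for cid, s in composite_scores.items() if s >= min_score]
    eligible.sort(key=lambda item: item[1], reverse=True)
    return [cid for cid, _ in eligible[:top_k]]


# Hypothetical composite scores for four recalled items.
to_recommend = select_recommendations(
    {"ugc_a": 0.9, "ugc_b": 0.4, "ugc_c": 0.7, "ugc_d": 0.8}, min_score=0.5, top_k=2
)
```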

According to a second aspect of the present disclosure, there is provided a content recommendation apparatus including:

a user original content acquisition module, configured to acquire published target user original content of a target user who sends a request, wherein the request is used for requesting to acquire user original content;

the keyword generation module is used for acquiring target text content in the original content of the target user and generating a target keyword set and a target semantic vector according to the target text content;

the user original content recalling module is used for recalling each first user original content according to the target keyword set and the target semantic vector;

a composite score calculating module, configured to calculate a composite score between the first user original content and the target user original content;

and a user original content recommending module, configured to determine user original content to be recommended from the first user original content according to the composite score, and to recommend the user original content to be recommended to the target user.

In an exemplary embodiment of the present disclosure, the user original content recall module includes:

the keyword set generating unit is used for acquiring second text content in the original content of a second user and generating a second keyword set and a second semantic vector according to the second text content, wherein the original content of the second user is original content of other users except the original content of the target user;

a keyword matching score calculating unit for calculating keyword matching scores of the target keyword set and the second keyword set;

a semantic matching score calculating unit for calculating semantic matching scores of the target semantic vector and the second semantic vector;

the user original content recalling unit is used for recalling the first user original content according to the keyword matching score and the semantic matching score; the first user original content is user original content with the keyword matching score being larger than or equal to a first preset threshold value in the second user original content or user original content with the semantic matching score being larger than or equal to a second preset threshold value in the second user original content.

In an exemplary embodiment of the present disclosure, the keyword matching score calculating unit includes:

a keyword acquisition unit, configured to acquire the same keyword in the target keyword set and the second keyword set;

a keyword number obtaining unit, configured to obtain the number of the same keywords and/or obtain weights of the same keywords;

and the keyword matching score determining unit is used for calculating the keyword matching scores of the target keyword set and the second keyword set according to the number of the same keywords and/or the weights of the same keywords.

In an exemplary embodiment of the present disclosure, the semantic matching score calculating unit includes:

a distance calculation unit for calculating a distance between the target semantic vector and the second semantic vector;

and the semantic matching score determining unit is used for determining the semantic matching scores of the target semantic vector and the second semantic vector according to the distance between the target semantic vector and the second semantic vector.

In an exemplary embodiment of the present disclosure, the user original content recall module further includes:

the target media object acquisition unit is used for acquiring a target media object of the original content of the target user;

the second media object acquisition unit is used for acquiring a second media object of the original content of the second user;

a media object matching score calculating unit for calculating a media object matching score of the target media object and the second media object;

the user original content recalling unit is further used for recalling the first user original content according to the media object matching score.

In an exemplary embodiment of the present disclosure, the media object matching score calculating unit includes:

a target user behavior vector obtaining unit, configured to obtain a target user behavior vector of the target media object;

a second user behavior vector obtaining unit, configured to obtain a second user behavior vector of the second media object;

and the media object similarity calculation unit is used for calculating the similarity between the target media object and the second media object according to the target user behavior vector and the second user behavior vector, and taking the similarity as the media object matching score.

In an exemplary embodiment of the present disclosure, the user original content recall unit includes:

the designated user original content determining unit is used for determining designated user original content of which the media object matching score is greater than or equal to a third preset threshold value from the second user original content;

a first user original content recall unit, configured to recall the designated user original content as the first user original content.

In an exemplary embodiment of the present disclosure, the keyword generation module includes:

an ID list acquisition unit, configured to acquire an ID list of the target text, and input the ID list into a Named Entity Recognition (NER) model to acquire a probability matrix of the target text;

the target keyword determining unit is used for determining each target keyword in the target text according to the probability matrix and generating the target keyword set;

a word vector acquiring unit, configured to acquire a word vector and a weight value of the target text content;

and the target semantic vector acquisition unit is used for performing weighted operation on each word vector according to the weight value to obtain the target semantic vector.

In an exemplary embodiment of the present disclosure, the composite score calculating module includes:

a matching score calculation unit, configured to determine a keyword matching score and a semantic matching score between the first user original content and the target user original content;

and a first composite score determining unit, configured to determine a composite score between the first user original content and the target user original content according to the keyword matching score and the semantic matching score.

In an exemplary embodiment of the disclosure, the composite score calculating module further includes:

a media object matching score determining unit for determining a media object matching score between the first user original content and the target user original content;

the first composite score determining unit further includes:

a second composite score determining unit, configured to determine a composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, and the media object matching score.

In an exemplary embodiment of the present disclosure, the composite score calculating module further includes:

a sharing heat value calculating unit, configured to acquire a consumption preference value of the target user within a preset time period and a sharing heat value of the first user original content;

the second composite score determining unit further includes:

a third composite score determining unit, configured to determine a composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing heat value.

In an exemplary embodiment of the present disclosure, the sharing heat value calculating unit includes:

a first number-of-times acquiring unit, configured to acquire a first number of times that an in-platform user performs a first operation on the first user original content within the preset time period, where the first operation includes at least one of: play, like, and comment;

a consumption preference value determining unit, configured to determine the consumption preference value of the target user within the preset time period according to the first number of times;

a second number-of-times acquiring unit, configured to acquire a second number of times that a second operation is performed on the first user original content, where the second operation includes a forwarding operation and/or a sharing operation;

and a sharing heat value calculating unit, configured to calculate the sharing heat value of the first user original content according to the second number of times.

In an exemplary embodiment of the present disclosure, the third composite score determining unit includes:

a discrete value obtaining unit, configured to perform feature bucketing on the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing heat value respectively, to obtain a discrete value corresponding to the keyword matching score, a discrete value corresponding to the semantic matching score, a discrete value corresponding to the media object matching score, a discrete value corresponding to the consumption preference value, and a discrete value corresponding to the sharing heat value;

a vector obtaining unit, configured to convert the discrete value corresponding to the keyword matching score, the discrete value corresponding to the semantic matching score, the discrete value corresponding to the media object matching score, the discrete value corresponding to the consumption preference value, and the discrete value corresponding to the sharing heat value into a keyword matching vector, a semantic matching vector, a media object matching vector, a consumption preference value vector, and a sharing heat value vector, respectively;

a feature extraction unit, configured to perform feature extraction on the keyword matching vector, the semantic matching vector, the media object matching vector, the consumption preference value vector, and the sharing heat value vector by using a deep model to obtain a first vector;

a cross operation unit, configured to perform a feature cross operation on the keyword matching vector, the semantic matching vector, the media object matching vector, the consumption preference value vector, and the sharing heat value vector by using a cross model to obtain a second vector;

a splicing processing unit, configured to splice (concatenate) the first vector and the second vector to obtain a splicing result;

and a sigmoid operation unit, configured to perform a sigmoid operation on the splicing result after it has been processed by a fully connected layer to obtain an operation result, and to take the operation result as the composite score.

In an exemplary embodiment of the present disclosure, the user original content recommending module includes:

a user original content screening unit, configured to screen the first user original content according to the composite score and preset conditions to obtain screened first user original content;

and the to-be-recommended user original content determining unit is used for taking the screened first user original content as the to-be-recommended user original content.

According to a third aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any embodiment of the first aspect.

According to a fourth aspect of the present disclosure, there is provided an electronic apparatus comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the steps of the method of any one of the first aspect via execution of the executable instructions.

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

In summary, in the method provided by the present disclosure, published target user original content of a target user who sends a request is acquired, where the request is used for requesting to acquire user original content; target text content in the target user original content is acquired, and a target keyword set and a target semantic vector are generated according to the target text content; first user original content is recalled according to the target keyword set and the target semantic vector; a composite score between each first user original content and the target user original content is calculated; and, according to the composite score, user original content to be recommended is determined from the first user original content and recommended to the target user. In this way, the highest-scoring user original content that expresses the user's sentiment can be recommended to the user, thereby improving the user's social ability.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.

FIG. 1 schematically illustrates a flow chart of a method of content recommendation in an exemplary embodiment of the disclosure;

FIG. 2 schematically illustrates a block diagram of a content recommendation system in an exemplary embodiment of the disclosure;

FIG. 3 schematically illustrates a flow chart of a semantic vector retrieval method in an exemplary embodiment of the disclosure;

FIG. 4 schematically illustrates a flowchart of a method for recalling user original content in an exemplary embodiment of the disclosure;

FIG. 5 is a diagram that schematically illustrates a method for content recommendation, in an exemplary embodiment of the disclosure;

FIG. 6 schematically shows a block diagram of a content recommendation device in an exemplary embodiment of the disclosure;

FIG. 7 schematically illustrates a schematic diagram of a storage medium in an exemplary embodiment of the disclosure;

FIG. 8 schematically illustrates a block diagram of an electronic device in an exemplary embodiment of the present disclosure.

In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.

Detailed Description

The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the invention, and are not intended to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

As will be appreciated by one skilled in the art, embodiments of the present invention may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.

The data related to the present disclosure may be data authorized by a user or fully authorized by each party, and the collection, transmission, use, and the like of the data all meet the requirements of relevant national laws and regulations, and the embodiments/examples of the present disclosure may be combined with each other.

In view of the defects in the prior art, the exemplary embodiment provides a content recommendation method that can recommend to the user the user original content that has the highest composite score and expresses the user's sentiment, so as to improve the user's social ability. Referring to FIG. 1, the content recommendation method may include the following steps:

S11, acquiring published target user original content of a target user who sends a request, wherein the request is used for requesting to acquire user original content;

S12, acquiring target text content in the target user original content and generating a target keyword set and a target semantic vector according to the target text content;

S13, recalling first user original content according to the target keyword set and the target semantic vector;

S14, calculating a composite score between the first user original content and the target user original content;

S15, according to the composite score, determining user original content to be recommended from the first user original content and recommending the user original content to be recommended to the target user.
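The control flow of these steps can be sketched as a small pipeline. This is an illustration under stated assumptions, not the disclosed implementation: the scoring functions are injected as parameters, and the toy `word_overlap` scorer (Jaccard overlap of whitespace-separated words) merely stands in for the keyword, semantic, and composite scoring that the later embodiments describe.

```python
from typing import Callable, Dict, List


def recommend(
    target_text: str,
    candidates: Dict[str, str],
    recall_score: Callable[[str, str], float],
    composite_score: Callable[[str, str], float],
    recall_threshold: float,
    top_k: int,
) -> List[str]:
    """Recall candidates, rank them by composite score, and return the top items.

    `target_text` plays the role of the target text content obtained in the
    first two steps; `candidates` maps content IDs to other users' text.
    """
    # Recall: keep candidates whose recall score reaches the threshold.
    recalled = [
        cid for cid, text in candidates.items()
        if recall_score(target_text, text) >= recall_threshold
    ]
    # Score: rank recalled candidates by composite score, highest first.
    ranked = sorted(
        recalled,
        key=lambda cid: composite_score(target_text, candidates[cid]),
        reverse=True,
    )
    # Recommend: return at most top_k items.
    return ranked[:top_k]


def word_overlap(a: str, b: str) -> float:
    """Toy stand-in scorer: Jaccard overlap of whitespace-separated words."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


# Hypothetical published content of other users.
pool = {"c1": "rainy night walk", "c2": "sunny beach", "c3": "night song rainy"}
result = recommend("rainy night song", pool, word_overlap, word_overlap, 0.4, 2)
```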

In summary, the method provided by the present disclosure generates a target keyword set and a target semantic vector from the target user original content published by the requesting target user, recalls first user original content on that basis, calculates a composite score between each first user original content and the target user original content, and, according to the composite score, determines the user original content to be recommended and recommends it to the target user. The highest-scoring user original content that expresses the user's sentiment can thus be recommended to the user, thereby improving the user's social ability.

Hereinafter, each step in the content recommendation method in the present exemplary embodiment will be described in more detail with reference to the drawings and examples.

In step S11, the published target user original content of the target user who sends the request is obtained, where the request is used to request user original content.

In an exemplary embodiment of the present disclosure, referring to fig. 2, the system architecture may include a user-side mobile terminal device 201, a user-side intelligent terminal device 204, a server 203, and the like. The user-side mobile terminal device 201, the user-side intelligent terminal device 204 and the server 203 can all transmit data through the network 202. The network may include various connection types, such as wired communication links and wireless communication links. The content recommendation method can be executed on the server side, on a user-side terminal device, or cooperatively by the user-side terminal device and the server. Taking execution of the method on the server side as an example, the server may obtain the published target user original content of the target user who sends the request, where the request is used to request user original content; acquire the target text content in the target user original content and generate a target keyword set and a target semantic vector according to the target text content; recall each first user original content according to the target keyword set and the target semantic vector; calculate a comprehensive score between the first user original content and the target user original content; and, according to the comprehensive score, determine the user original content to be recommended from the first user original contents and recommend it to the terminal device used by the target user, which then displays it to the user.

In an exemplary embodiment of the present disclosure, the user original content may include text, media objects, and pictures, and the media objects include music media, video media, and the like; the user original content to be recommended and the media objects are not particularly limited herein.

In step S12, a target text content in the original content of the target user is obtained, and a target keyword set and a target semantic vector are generated according to the target text content.

In an exemplary embodiment of the disclosure, the generating a target keyword set and a target semantic vector according to the target text content includes:

acquiring an identifier (ID) list of the target text content, and inputting the ID list into a Named Entity Recognition (NER) model to obtain a probability matrix of the target text content; determining each target keyword in the target text content according to the probability matrix and generating the target keyword set; acquiring word vectors and weight values of the target text content; and performing a weighting operation on the word vectors according to the weight values to obtain the target semantic vector.

Illustratively, the probability matrix of the target text content includes the probabilities of a plurality of keywords to be determined. For example, when the target text content is "you meet are me little fortunes", an ID list of the target text content is obtained and input into the NER model, and the resulting probability matrix may include the probabilities of keywords to be determined such as "meet", "see you", "you are", "me", and "little fortunes", which are 0.43, 0.21, 0.27, 0.25, and 0.38, respectively. "See" appears in both "meet" and "see you", so these two keywords to be determined overlap; because the probability of "meet" is greater than that of "see you", "meet" is a keyword of the target text content. In addition, the probability of "little fortunes" is greater than that of the remaining keywords to be determined, so "little fortunes" is also a keyword of the target text content. After each target keyword of the target text content is determined, the target keyword set of the target text content is generated.
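The selection described above can be sketched as follows. This is only an illustration under stated assumptions: the character offsets of each candidate, the greedy highest-probability-first rule, the `min_prob` cut-off of 0.3, and the function name `select_keywords` are all hypothetical and are not specified by the disclosure; only the candidate labels and probabilities come from the example.

```python
from typing import List, Tuple

# (start, end, text, probability); the offsets are hypothetical span positions
Candidate = Tuple[int, int, str, float]

def select_keywords(candidates: List[Candidate], min_prob: float = 0.3) -> List[str]:
    """Greedy pick: highest probability first, skipping spans that overlap a pick."""
    chosen: List[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c[3], reverse=True):
        start, end, _, prob = cand
        if prob < min_prob:
            continue  # assumed cut-off; the disclosure does not state a threshold
        overlaps = any(start < e and s < end for s, e, _, _ in chosen)
        if not overlaps:
            chosen.append(cand)
    chosen.sort(key=lambda c: c[0])  # restore reading order
    return [c[2] for c in chosen]

# Candidates and probabilities from the example; offsets are assumed.
candidates = [
    (0, 2, "meet", 0.43),
    (1, 3, "see you", 0.21),   # overlaps "meet"
    (2, 4, "you are", 0.27),
    (4, 5, "me", 0.25),
    (6, 9, "little fortunes", 0.38),
]
keywords = select_keywords(candidates)
```

With the assumed cut-off, the result contains "meet" and "little fortunes", matching the keywords named in the example.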

In an exemplary embodiment of the present disclosure, the training set of the NER model includes a general data set and a data set of media-related user original content. Because the NER model is trained on both data sets, it can recognize keywords not only in general text but also in media-related user original content.

In an exemplary embodiment of the present disclosure, the NER model is a Global Pointer model based on Bidirectional Encoder Representations from Transformers (BERT). BERT is a very large-scale semantic pre-training model open-sourced by Google, and it covers more than 200 languages. The Global Pointer model uses the idea of global normalization to recognize named entities and handles nested and non-nested entities uniformly. When recognizing non-nested entities, its accuracy is comparable to that of a conditional random field (CRF), and its accuracy when recognizing nested entities is also high. In addition, training the Global Pointer model does not require recursively computing a denominator as a CRF does, and prediction does not require dynamic programming; the computation is fully parallel, so under ideal conditions its time complexity is far lower than that of a CRF. In summary, an NER model based on BERT and Global Pointer can recognize entities in many languages, and can accurately and quickly recognize both nested and non-nested entities in those languages.

In an exemplary embodiment of the disclosure, a word segmentation technique may be applied to segment the target text content into a plurality of words. For each of these words, a multi-dimensional word vector, such as a 128-dimensional word vector, is generated with the word-vector model Word2vec, and a weight value of the word is calculated with term frequency-inverse document frequency (TF-IDF). The vectors of all words in the target text content are then weighted by these weight values, so that the resulting semantic vector of the target text content expresses both the word-level information of the text and the semantic information of the sentence.

Word2vec is a group of related models used to generate word vectors. These models are shallow, two-layer neural networks trained to reconstruct the linguistic contexts of words. The network takes a word as input and predicts the words in adjacent positions; under the bag-of-words assumption in word2vec, the order of those words is unimportant. After training is complete, the word2vec model can be used to map each word to a vector. TF-IDF is a weighting technique commonly used in information retrieval and data mining to evaluate how important a word is to a document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to the frequency with which it appears across the corpus. Various forms of TF-IDF weighting are often applied by search engines as a measure or rating of the relevance between a document and a user query. If a word or phrase appears with a high term frequency (TF) in one article and rarely appears in other articles, it is considered to have good discriminating power and to be suitable for classification. TF in TF-IDF is the frequency with which a term appears in a document d. The main idea of inverse document frequency (IDF) is that the fewer documents contain a term t, that is, the smaller the number n of such documents, the larger the IDF, and the better the term t distinguishes between categories.

How to obtain the target semantic vector of the target text content is described below with reference to fig. 3. For example, as shown in fig. 3, when the target text content is "it is raining today, the examination went badly, and the mood is utterly bad", a word segmentation technique is applied to the target text content to obtain a word set of eight words: "today", "raining", "examination", "failure", "mood", "bad", "utterly", and a sentence-final particle. Word2vec is then used to obtain the word vector of each word in the word set, and these word vectors together form a word vector set. For example, the word vector of "today" is [0.11, 0.02, 0.39, ..., 0.23], and the word vector of another word in the set is [0.87, -0.3, 0.58, ..., 0.24]. Meanwhile, TF-IDF is used to obtain the TF-IDF weight of each word in the word set. For example, the TF-IDF weights of the eight words, in the order listed above, are 0.12, 0.21, 0.57, 0.62, 0.33, 0.36, 0.18, and 0.01, respectively. After the word vector and the TF-IDF weight of each word are obtained, a weighting operation is performed on the word vectors to obtain the target semantic vector of the target text content. For example, the target semantic vector is [0.36, 0.58, -0.17, 0.36, ..., 0.38].
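The weighting operation can be sketched in a few lines. This is a minimal illustration, assuming the "weighting operation" is a weight-normalized average of the word vectors; the disclosure does not say whether the sum is normalized, and the function name and the toy 2-dimensional vectors are hypothetical.

```python
from typing import Dict, List

def weighted_semantic_vector(word_vectors: Dict[str, List[float]],
                             weights: Dict[str, float]) -> List[float]:
    """Combine word vectors into one sentence vector using per-word weights.

    Assumption: the result is divided by the total weight (a weighted average).
    """
    dim = len(next(iter(word_vectors.values())))
    total = sum(weights[w] for w in word_vectors)
    if total == 0:
        return [0.0] * dim
    acc = [0.0] * dim
    for word, vec in word_vectors.items():
        for i, x in enumerate(vec):
            acc[i] += weights[word] * x
    return [x / total for x in acc]

# Toy 2-dimensional vectors (real Word2vec vectors would have e.g. 128 dimensions).
toy_vectors = {"today": [1.0, 0.0], "raining": [0.0, 1.0]}
toy_weights = {"today": 3.0, "raining": 1.0}   # hypothetical TF-IDF weights
sentence_vector = weighted_semantic_vector(toy_vectors, toy_weights)
```

Words with a larger TF-IDF weight pull the sentence vector toward their own word vector, which is how the weighting lets distinctive words dominate the representation.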

In step S13, recall the original content of each first user according to the target keyword set and the target semantic vector.

Based on the above, as shown in fig. 4, in an exemplary embodiment of the present disclosure, recalling each first user original content according to the target keyword set and the target semantic vector includes:

S131, acquiring second text content in second user original content, and generating a second keyword set and a second semantic vector according to the second text content;

In an exemplary embodiment of the present disclosure, the second user original content is user original content other than the target user original content. For example, the second user original content may be other user original content published by the target user, or user original content published by other users; this embodiment is not limited in this respect.

Exemplarily, the second keyword set is the keyword set of the second text content, and the second semantic vector is the semantic vector of the second text content. The second keyword set and the second semantic vector are generated in the same way as the target keyword set and the target semantic vector, respectively, which is not repeated here.

S132, calculating keyword matching scores of the target keyword set and the second keyword set;

In an exemplary embodiment of the present disclosure, calculating the keyword matching score of the target keyword set and the second keyword set includes:

acquiring the same keywords in the target keyword set and the second keyword set; acquiring the number of the same keywords and/or acquiring the weight of the same keywords; and calculating the keyword matching scores of the target keyword set and the second keyword set according to the number of the same keywords and/or the weights of the same keywords.

Illustratively, if the keywords shared by the target keyword set and the second keyword set are keywords A, B, and C, the number of shared keywords is 3, and the keyword matching score of the two sets may be determined from this number, with the score proportional to it. Alternatively, the weights of the shared keywords A, B, and C may be obtained and a weighting operation performed on them, and the keyword matching score is then determined from the result of the weighting operation, with the score proportional to that result. The number and the weighting result may also be used jointly: for example, the product of the number 3 and the result of the weighting operation may be calculated, and the keyword matching score determined from this product, with the score proportional to the product.
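The three variants above (count, weighted result, and their product) can be sketched as follows. The function name, the `mode` parameter, the use of a plain sum as the weighting operation, and a proportionality constant of 1 are assumptions made for illustration only.

```python
from typing import Dict, Iterable, Optional

def keyword_match_score(target: Iterable[str], second: Iterable[str],
                        weights: Optional[Dict[str, float]] = None,
                        mode: str = "count") -> float:
    """Score two keyword sets by their shared keywords.

    mode="count":   number of shared keywords
    mode="weight":  sum of the shared keywords' weights (assumed weighting operation)
    mode="product": count multiplied by the weighted sum
    """
    shared = set(target) & set(second)
    count = float(len(shared))
    weighted = sum((weights or {}).get(k, 0.0) for k in shared)
    if mode == "count":
        return count
    if mode == "weight":
        return weighted
    if mode == "product":
        return count * weighted
    raise ValueError(f"unknown mode: {mode}")

target_set = {"A", "B", "C", "D"}
second_set = {"A", "B", "C", "E"}
kw_weights = {"A": 0.5, "B": 0.25, "C": 0.25}   # hypothetical keyword weights

by_count = keyword_match_score(target_set, second_set)
by_weight = keyword_match_score(target_set, second_set, kw_weights, mode="weight")
by_product = keyword_match_score(target_set, second_set, kw_weights, mode="product")
```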

S133, calculating semantic matching scores of the target semantic vector and the second semantic vector;

In an exemplary embodiment of the disclosure, calculating the semantic matching score of the target semantic vector and the second semantic vector includes:

calculating the distance between the target semantic vector and the second semantic vector; and determining semantic matching scores of the target semantic vector and the second semantic vector according to the distance between the target semantic vector and the second semantic vector.

Illustratively, the semantic matching score of the target semantic vector and the second semantic vector is inversely proportional to the distance between them: the greater the distance, the smaller the semantic matching score, and the smaller the distance, the greater the semantic matching score.
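One way to turn a distance into a score that shrinks as the distance grows is sketched below. Both the Euclidean distance and the mapping 1 / (1 + d) are assumptions: the disclosure only says that the score decreases with distance, and the added 1 is there to avoid division by zero for identical vectors.

```python
import math
from typing import Sequence

def semantic_match_score(a: Sequence[float], b: Sequence[float]) -> float:
    """Map the Euclidean distance d between two vectors to a score 1 / (1 + d)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    distance = math.dist(a, b)      # Euclidean distance (Python 3.8+)
    return 1.0 / (1.0 + distance)

same = semantic_match_score([0.2, 0.4], [0.2, 0.4])   # distance 0
far = semantic_match_score([0.0, 0.0], [3.0, 4.0])    # distance 5
```

Identical vectors receive the maximum score of 1, and the score falls toward 0 as the vectors move apart.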

S134, recalling the original content of the first user according to the keyword matching score and the semantic matching score.

In an exemplary embodiment of the disclosure, the first user original content is second user original content whose keyword matching score is greater than or equal to a first preset threshold, or second user original content whose semantic matching score is greater than or equal to a second preset threshold.
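The threshold rule can be sketched as a simple filter. The dictionary layout of each candidate, the function name, and the threshold values 0.5 and 0.8 are hypothetical; the disclosure only states that either score meeting its own preset threshold is sufficient.

```python
from typing import Dict, List

def recall_by_text(candidates: List[Dict], kw_threshold: float = 0.5,
                   sem_threshold: float = 0.8) -> List[str]:
    """Keep candidates whose keyword score OR semantic score reaches its threshold."""
    return [
        c["id"] for c in candidates
        if c["keyword_score"] >= kw_threshold or c["semantic_score"] >= sem_threshold
    ]

second_contents = [
    {"id": "ugc-1", "keyword_score": 0.7, "semantic_score": 0.2},  # keyword match only
    {"id": "ugc-2", "keyword_score": 0.1, "semantic_score": 0.9},  # semantic match only
    {"id": "ugc-3", "keyword_score": 0.1, "semantic_score": 0.3},  # neither
]
recalled = recall_by_text(second_contents)
```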

In an exemplary embodiment of the present disclosure, when the target user original content and the second user original content both include a media object, the first user original content may also be recalled according to a media object matching score between the target media object and the second media object. How to do so is described below.

S16, acquiring a target media object of the original content of the target user;

S17, acquiring a second media object of the second user original content;

Illustratively, when the target user original content and the second user original content both include media objects, a target media object of the target user original content and a second media object of the second user original content are respectively obtained.

S18, calculating a media object matching score of the target media object and the second media object;

In an exemplary embodiment of the present disclosure, a target user behavior vector of the target media object and a second user behavior vector of the second media object are obtained; the similarity between the target media object and the second media object is then calculated according to the two vectors and taken as the media object matching score.

Specifically, the similarity between the target media object and the second media object may be calculated according to the following formula:

$$\text{similarity} = \cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i} A_i B_i}{\sqrt{\sum_{i} A_i^{2}} \, \sqrt{\sum_{i} B_i^{2}}}$$

where similarity is the similarity between the target media object and the second media object, A is the user behavior vector of the target media object, B is the user behavior vector of the second media object, A_i is each item of user behavior data of the target media object, and B_i is each item of user behavior data of the second media object.
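The symbols defined above (A, B, A_i, B_i) match the usual cosine similarity between two vectors. Assuming that this is the intended measure, a minimal sketch is given below; the function name, the zero-vector convention, and the sample behavior counts are hypothetical.

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two user behavior vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention when a media object has no behavior data
    return dot / (norm_a * norm_b)

# Hypothetical behavior counts of the same users for two media objects.
target_media = [1.0, 2.0, 3.0]
second_media = [2.0, 4.0, 6.0]
media_match_score = cosine_similarity(target_media, second_media)
```

Two media objects consumed by the same users in the same proportions score close to 1, while media objects with disjoint audiences score 0.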

S19, recalling the original content of the first user according to the media object matching score.

In an exemplary embodiment of the present disclosure, designated user original content whose media object matching score is greater than or equal to a third preset threshold is determined from the second user original content, and the designated user original content is recalled as the first user original content.

In step S14, a composite score between the first user original content and the target user original content is calculated.

In an exemplary embodiment of the present disclosure, if at least one of the first user original content and the target user original content does not include a media object, a keyword matching score and a semantic matching score between the first user original content and the target user original content are determined, and the comprehensive score between the first user original content and the target user original content is determined according to the keyword matching score and the semantic matching score.

In an exemplary embodiment of the present disclosure, if the first user original content and the target user original content both include a media object, a media object matching score between the first user original content and the target user original content is also determined. In this case, determining the comprehensive score according to the keyword matching score and the semantic matching score further includes: determining the comprehensive score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score and the media object matching score.

In an exemplary embodiment of the disclosure, the calculating a composite score between the first user original content and the target user original content further includes:

acquiring a consumption preference value of the target user in a preset time period and a sharing heat value of the first user original content. In this case, determining the composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, and the media object matching score further includes: determining the composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value and the sharing heat value.

How to obtain the consumption preference value of the target user in the preset time period and the sharing heat value of the original content of the first user is described below.

In an exemplary embodiment of the disclosure, a first number of times that an on-site user performs a first operation on the first user original content within the preset time period is acquired, where the first operation includes at least one of: play, like, and comment; the consumption preference value of the target user in the preset time period is determined according to the first number of times; a second number of times that a second operation is performed on the first user original content is acquired, where the second operation includes a forwarding operation and/or a sharing operation; and the sharing heat value of the first user original content is calculated according to the second number of times.

For example, the on-site user may be any user of the platform: the target user, or a user other than the target user; this embodiment is not limited in this respect. The preset time period may be the last three days, the last week, or another time period, and is not limited herein.

Illustratively, the consumption preference value is proportional to the first number of times: the larger the first number of times, the larger the consumption preference value, and the smaller the first number of times, the smaller the consumption preference value. Likewise, the sharing heat value is proportional to the second number of times: the larger the second number of times, the larger the sharing heat value, and the smaller the second number of times, the smaller the sharing heat value.
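The two values can be sketched as scaled operation counts. The event format `(timestamp, operation)`, the operation names, the 7-day window, and the proportionality constant `k` are all assumptions for illustration; the disclosure fixes only the direction of the relationship.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Set, Tuple

Event = Tuple[datetime, str]                 # (timestamp, operation name)
CONSUME_OPS = {"play", "like", "comment"}    # first operation
SHARE_OPS = {"forward", "share"}             # second operation

def count_ops(events: Iterable[Event], ops: Set[str],
              since: Optional[datetime] = None) -> int:
    """Count events whose operation is in `ops`, optionally only at or after `since`."""
    return sum(1 for ts, op in events if op in ops and (since is None or ts >= since))

def consumption_preference(events, now, window=timedelta(days=7), k=1.0) -> float:
    """Proportional to the first number of times within the preset time period."""
    return k * count_ops(events, CONSUME_OPS, since=now - window)

def sharing_heat(events, k=1.0) -> float:
    """Proportional to the second number of times (forwards and shares)."""
    return k * count_ops(events, SHARE_OPS)

now = datetime(2022, 10, 11)
events = [
    (now - timedelta(days=1), "play"),
    (now - timedelta(days=2), "like"),
    (now - timedelta(days=10), "comment"),   # outside the 7-day window
    (now - timedelta(days=1), "share"),
    (now - timedelta(days=3), "forward"),
]
preference_value = consumption_preference(events, now)
heat_value = sharing_heat(events)
```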

How to determine a composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing heat value is described below.

In an exemplary embodiment of the present disclosure, feature bucketing is performed on the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing heat value respectively, to obtain a discrete value corresponding to each of them. The five discrete values are then converted into a keyword matching vector, a semantic matching vector, a media object matching vector, a consumption preference value vector, and a sharing heat value vector, respectively. Feature extraction is performed on these five vectors with a deep model to obtain a first vector, and a feature cross operation is performed on the same five vectors with a cross model to obtain a second vector. The first vector and the second vector are spliced to obtain a splicing result; the splicing result is passed through a fully connected layer and then a sigmoid operation, and the result of the sigmoid operation is taken as the comprehensive score.

Illustratively, the deep model may be any neural network model based on deep learning, and the cross model may be a logistic regression (LR) model, a factorization machine (FM) model, or another cross model; this embodiment is not particularly limited in this respect.
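The scoring pipeline (bucketing, vector lookup, deep part, cross part, splicing, fully connected layer, sigmoid) can be sketched end to end in plain Python. This is an untrained toy under stated assumptions: the bucket boundaries, the embedding size, a single ReLU layer as the deep part, FM-style pairwise inner products as the cross part, and randomly initialized weights are all hypothetical choices, and the inputs are assumed to be scaled to roughly [0, 1].

```python
import math
import random
from bisect import bisect_right

FEATURES = ["keyword", "semantic", "media", "preference", "heat"]
BOUNDARIES = [0.2, 0.4, 0.6, 0.8]   # hypothetical bucket edges -> 5 buckets
EMB_DIM, HIDDEN = 4, 8

rng = random.Random(0)              # untrained random weights, illustration only

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# One lookup table per feature: bucket index -> vector.
EMBEDDINGS = {f: rand_matrix(len(BOUNDARIES) + 1, EMB_DIM) for f in FEATURES}
W_DEEP = rand_matrix(HIDDEN, len(FEATURES) * EMB_DIM)
N_PAIRS = len(FEATURES) * (len(FEATURES) - 1) // 2
W_OUT = rand_matrix(1, HIDDEN + N_PAIRS)[0]

def bucketize(value):
    """Feature bucketing: map a continuous value to a discrete bucket index."""
    return bisect_right(BOUNDARIES, value)

def deep_part(vectors):
    """First vector: one ReLU layer over the concatenated feature vectors."""
    flat = [x for v in vectors for x in v]
    return [max(0.0, sum(w * x for w, x in zip(row, flat))) for row in W_DEEP]

def cross_part(vectors):
    """Second vector: pairwise inner products between feature vectors (FM-style)."""
    return [sum(a * b for a, b in zip(vectors[i], vectors[j]))
            for i in range(len(vectors)) for j in range(i + 1, len(vectors))]

def composite_score(scores):
    vectors = [EMBEDDINGS[f][bucketize(scores[f])] for f in FEATURES]
    spliced = deep_part(vectors) + cross_part(vectors)      # splicing
    logit = sum(w * x for w, x in zip(W_OUT, spliced))      # fully connected layer
    return 1.0 / (1.0 + math.exp(-logit))                   # sigmoid

example = {"keyword": 0.7, "semantic": 0.9, "media": 0.3,
           "preference": 0.5, "heat": 0.1}
score = composite_score(example)
```

Because the weights are random, the value of `score` carries no meaning here; in practice the tables and weights would be learned from interaction data, and only the shape of the computation is illustrated.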

In step S15, according to the comprehensive score, determining original content of the user to be recommended from the original content of each first user, and recommending the original content of the user to be recommended to the target user.

In an exemplary embodiment of the disclosure, the first user original contents are screened according to the comprehensive score and preset conditions, and the first user original content that passes the screening is taken as the user original content to be recommended.

Illustratively, the preset condition includes at least one of: being among the N user original contents with the highest comprehensive scores; being user original content that has not been recommended to the target user within a historical preset duration; being among the M highest-scoring user original contents published by the same user; and being user original content corresponding to the membership information of the target user, where N is greater than M.

For example, the 10 first user original contents with the highest comprehensive scores may be determined from the first user original contents. From these 10, third user original content that has been pushed to the target user within the last 2 days is determined. When the target user is a non-member, fourth user original content is determined from these 10, where the user who published the fourth user original content is a member. Fifth user original content is also determined from these 10, where the fifth user original content is the 2 highest-scoring user original contents published by the same user among these 10 (that is, M = 2 is smaller than N = 10). Finally, the user original content other than the third user original content and the fourth user original content is determined from these 10, and this other user original content and the fifth user original content are taken as the user original content to be recommended.

It should be noted that, in this embodiment, the preset condition is only described by way of example, and the preset condition may also be adjusted according to an actual situation, and the preset condition is not specifically limited in this embodiment.
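Part of the screening can be sketched as follows, assuming three of the example conditions: keep the N highest-scoring items, drop items pushed to the target user recently, and keep at most M items per publishing user. The membership condition is omitted because the text leaves its exact rule open; the field names and the function name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set

def select_recommendations(items: List[Dict], recently_pushed: Set[int],
                           n: int = 10, m: int = 2) -> List[int]:
    """Return ids of the items to recommend, best score first."""
    top = sorted(items, key=lambda it: it["score"], reverse=True)[:n]
    per_author = defaultdict(int)
    result = []
    for it in top:
        if it["id"] in recently_pushed:
            continue                      # already pushed within the history window
        if per_author[it["author"]] >= m:
            continue                      # cap of M items per publishing user
        per_author[it["author"]] += 1
        result.append(it["id"])
    return result

items = [
    {"id": 1, "author": "a", "score": 0.9},
    {"id": 2, "author": "a", "score": 0.8},
    {"id": 3, "author": "a", "score": 0.7},   # third item by "a", exceeds M = 2
    {"id": 4, "author": "b", "score": 0.6},   # pushed recently
    {"id": 5, "author": "c", "score": 0.5},
]
to_recommend = select_recommendations(items, recently_pushed={4})
```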

The content recommendation method of the present disclosure is described below with reference to the embodiment of fig. 5. For example, as shown in fig. 5, in step S501, user original content published by a user is received. In step S502, a semantic vector of that user original content is generated based on word2vec and word weights. In step S503, a keyword extraction model based on BERT and Global Pointer extracts the keywords in that user original content. In step S504, semantic vector retrieval is performed according to the semantic vector, so as to recall user original content by text semantic matching. In step S505, keyword retrieval is performed according to the keywords, so as to recall user original content by keyword matching. In addition, if the user original content published by the user also includes a media object, step S506 is executed to obtain user behavior data of the media object, and step S507 is executed to perform media similarity retrieval according to the user behavior data, so as to recall user original content by media matching. Further, in step S508, the user original content recalled by text semantic matching, by keyword matching, and by media matching is input into a pre-trained deep model for summary ranking, and then step S509 is executed to adjust the summary ranking result (for example, by filtering according to preset conditions) and output it to the user. In this way, the user original content that has the highest score and expresses the user's sentiment can be recommended to the user, so as to improve the user's social ability.

In summary, the method obtains the published target user original content of the target user who sends a request, where the request is used to request user original content; acquires the target text content in the target user original content and generates a target keyword set and a target semantic vector according to the target text content; recalls each first user original content according to the target keyword set and the target semantic vector; calculates a composite score between the first user original content and the target user original content; and, according to the comprehensive score, determines the user original content to be recommended from the first user original contents and recommends it to the target user. The user original content that has the highest score and expresses the user's sentiment can thus be recommended to the user, which improves the user's social ability.

Having introduced the content recommendation method of the exemplary embodiment of the present invention, a content recommendation apparatus of the exemplary embodiment of the present invention is described next with reference to fig. 6.

Referring to fig. 6, a content recommendation device 60 according to an exemplary embodiment of the present invention may include: a user original content acquisition module 601, a keyword generation module 602, a user original content recall module 603, a comprehensive score calculation module 604 and a user original content recommendation module 605; wherein:

a user original content obtaining module 601, configured to obtain published target user original content of a target user who sends a request, where the request is used to request to obtain the user original content;

a keyword generation module 602, configured to obtain target text content in the original content of the target user and generate a target keyword set and a target semantic vector according to the target text content;

a user original content recalling module 603, configured to recall each first user original content according to the target keyword set and the target semantic vector;

a composite score calculating module 604 for calculating a composite score between the first user original content and the target user original content;

a user original content recommending module 605, configured to determine, according to the comprehensive score, user original content to be recommended from the first user original contents, and to recommend the user original content to be recommended to the target user.

the keyword set generating unit is used for acquiring second text content in second user original content and generating a second keyword set and a second semantic vector according to the second text content, wherein the second user original content is other user original content except the target user original content;

a keyword matching score calculation unit for calculating keyword matching scores of the target keyword set and the second keyword set;

In an exemplary embodiment of the present disclosure, the user-originated-content recall module further includes:

the specified user original content determining unit is used for determining the specified user original content of which the media object matching score is greater than or equal to a third preset threshold value from the second user original content;

a first user original content recall unit, configured to recall the designated user original content as the first user original content.

In an exemplary embodiment of the present disclosure, the generating a target keyword set and a target semantic vector according to the target text content includes:

an ID list obtaining unit, configured to obtain an ID list of the target text, and input the ID list into a Named Entity Recognition (NER) model to obtain a probability matrix of the target text;

a first composite score determining unit, configured to determine a composite score between the first user original content and the target user original content according to the keyword matching score and the semantic matching score.
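As a minimal illustrative sketch of how a keyword matching score and a semantic matching score might be combined into one composite score: the Jaccard overlap used for the keyword score, the mapping 1 / (1 + distance) used for the semantic score, the equal weights, and all function names are assumptions made for illustration and are not specified by this embodiment.

```python
import math


def keyword_match_score(target_keywords, second_keywords):
    """Jaccard overlap of two keyword sets (assumed metric, not from the source)."""
    a, b = set(target_keywords), set(second_keywords)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def semantic_match_score(target_vec, second_vec):
    """Map the Euclidean distance between two semantic vectors to (0, 1].

    The source only states that a distance is calculated; the
    1 / (1 + distance) mapping is a hypothetical choice.
    """
    distance = math.sqrt(sum((x - y) ** 2 for x, y in zip(target_vec, second_vec)))
    return 1.0 / (1.0 + distance)


def composite_score(keyword_score, semantic_score, w_keyword=0.5, w_semantic=0.5):
    """Weighted sum of the two scores; the weights are hypothetical."""
    return w_keyword * keyword_score + w_semantic * semantic_score


# Toy inputs: two of four distinct keywords are shared; the vectors are identical.
kw = keyword_match_score({"rain", "night", "lonely"}, {"rain", "lonely", "city"})
sem = semantic_match_score([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
score = composite_score(kw, sem)
```

A weighted sum keeps each score's contribution easy to inspect; any other monotonic combination would fit the description equally well.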

the first composite score determining unit further includes:

a second composite score determining unit, configured to determine the composite score between the first user original content and the target user original content according to the keyword matching score, the semantic matching score, and the media object matching score.
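The media object matching score used here is described as a similarity computed from the user behavior vectors of two media objects, and recall keeps the content whose score is greater than or equal to a third preset threshold. A minimal sketch follows, assuming cosine similarity as the similarity measure; the function names, dictionary layout, sample vectors, and threshold value of 0.5 are hypothetical.

```python
import math


def media_object_match_score(target_behavior, second_behavior):
    """Cosine similarity of two user behavior vectors (assumed similarity measure)."""
    dot = sum(t * s for t, s in zip(target_behavior, second_behavior))
    norm = math.sqrt(sum(t * t for t in target_behavior)) * math.sqrt(
        sum(s * s for s in second_behavior)
    )
    return dot / norm if norm else 0.0


def recall_first_user_content(target_behavior, second_contents, threshold):
    """Keep second user original content whose score is >= the preset threshold."""
    recalled = []
    for content in second_contents:
        score = media_object_match_score(target_behavior, content["behavior"])
        if score >= threshold:
            recalled.append({**content, "media_score": score})
    return recalled


# Hypothetical behavior vectors, e.g. one dimension per user who played the song.
target_behavior = [1.0, 0.0, 1.0, 1.0]
second_contents = [
    {"id": "ugc_a", "behavior": [1.0, 0.0, 1.0, 1.0]},
    {"id": "ugc_b", "behavior": [0.0, 1.0, 0.0, 0.0]},
    {"id": "ugc_c", "behavior": [1.0, 0.0, 0.0, 0.0]},
]
recalled = recall_first_user_content(target_behavior, second_contents, threshold=0.5)
recalled_ids = [c["id"] for c in recalled]
```

In this toy data, `ugc_b` shares no listeners with the target media object and is filtered out, while the other two items pass the threshold.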

the second composite score determining unit further includes:

a third composite score determining unit, configured to determine a composite score between the original content of the first user and the original content of the target user according to the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing popularity value.

the consumption preference value determining unit is used for determining the consumption preference value of the target user in the preset time period according to the first time;

a discrete value obtaining unit, configured to perform feature binning on the keyword matching score, the semantic matching score, the media object matching score, the consumption preference value, and the sharing popularity value respectively, to obtain a discrete value corresponding to the keyword matching score, a discrete value corresponding to the semantic matching score, a discrete value corresponding to the media object matching score, a discrete value corresponding to the consumption preference value, and a discrete value corresponding to the sharing popularity value;

a vector obtaining unit, configured to convert the discrete value corresponding to the keyword matching score, the discrete value corresponding to the semantic matching score, the discrete value corresponding to the media object matching score, the discrete value corresponding to the consumption preference value, and the discrete value corresponding to the sharing popularity value into a keyword matching vector, a semantic matching vector, a media object matching vector, a consumption preference value vector, and a sharing popularity value vector, respectively;
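The feature binning and vector conversion performed by these two units can be sketched as follows. This is only an illustration under stated assumptions: equal-width bins over the range [0, 1], five bins per feature, one-hot encoding as the vector form, and the sample feature values are all hypothetical, since the embodiment does not fix the binning scheme or the vector representation.

```python
def bin_feature(value, n_bins=5, low=0.0, high=1.0):
    """Equal-width binning: map a continuous value to a bin index in [0, n_bins - 1]."""
    clipped = min(max(value, low), high)
    index = int((clipped - low) / (high - low) * n_bins)
    return min(index, n_bins - 1)  # the upper bound falls into the last bin


def one_hot(index, n_bins=5):
    """Convert a bin index into a one-hot vector (assumed vector form)."""
    return [1.0 if i == index else 0.0 for i in range(n_bins)]


# Hypothetical feature values, each assumed to be normalized to [0, 1].
features = {
    "keyword_match": 0.5,
    "semantic_match": 0.92,
    "media_object_match": 0.3,
    "consumption_preference": 0.1,
    "sharing_popularity": 1.0,
}

# Discrete value obtaining unit: one bin index per feature.
discrete = {name: bin_feature(value) for name, value in features.items()}

# Vector obtaining unit: one vector per discrete value.
vectors = {name: one_hot(index) for name, index in discrete.items()}

# Concatenate the five vectors into a single input for a downstream scoring model.
model_input = [x for name in features for x in vectors[name]]
```

Binning makes the five inputs comparable in scale before they are combined; a learned embedding table indexed by the bin number would be an alternative to the one-hot vectors assumed here.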

Since each functional module of the content recommendation apparatus in this embodiment of the present invention corresponds to a step of the content recommendation method embodiment described above, the details are not repeated here.

Having described the content recommendation method and the content recommendation apparatus according to the exemplary embodiments of the present invention, a storage medium according to an exemplary embodiment of the present invention will be described with reference to fig. 7.

Referring to fig. 7, a program product 700 for implementing the above method according to an embodiment of the present invention is described. It may employ a portable compact disc read-only memory (CD-ROM), include program code, and be run on a device such as a personal computer. However, the program product of the present invention is not limited in this regard; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

A computer-readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

Having described the storage medium of an exemplary embodiment of the present invention, next, an electronic device of an exemplary embodiment of the present invention will be described with reference to fig. 8.

The electronic device 80 shown in fig. 8 is only an example, and should not impose any limitation on the functions or the scope of use of the embodiments of the present invention.

As shown in fig. 8, the electronic device 80 takes the form of a general-purpose computing device. The components of the electronic device 80 may include, but are not limited to: at least one processing unit 810, at least one storage unit 820, a bus 830 connecting different system components (including the storage unit 820 and the processing unit 810), and a display unit 840.

The storage unit 820 stores program code that is executable by the processing unit 810 to cause the processing unit 810 to perform steps according to various exemplary embodiments of the present invention as described in the "exemplary methods" section of the present specification. For example, the processing unit 810 may perform steps S11 to S15 as shown in fig. 1.

The storage unit 820 may include a volatile storage unit such as a random access memory unit (RAM) 8201 and/or a cache memory unit 8202, and may further include a read-only memory unit (ROM) 8203. The storage unit 820 may also include a program/utility 8204 having a set (at least one) of program modules 8205, such program modules 8205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which or some combination thereof may comprise an implementation of a network environment.

Bus 830 may include a data bus, an address bus, and a control bus.

The electronic device 80 may also communicate with one or more external devices 90 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.) via an input/output (I/O) interface 850. The electronic device 80 further comprises a display unit 840 connected to the input/output (I/O) interface 850 for displaying. The electronic device 80 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the internet) via a network adapter 860. As shown, the network adapter 860 communicates with the other modules of the electronic device 80 via the bus 830. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 80, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

It should be noted that although several modules or sub-modules of the content recommendation apparatus are mentioned in the above detailed description, such partitioning is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided so as to be embodied by a plurality of units/modules.

Further, while operations of the methods of the invention are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step for execution, and/or one step may be broken down into multiple steps for execution.

While the spirit and principles of the invention have been described with reference to several particular embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. The division into aspects is for convenience of presentation only and does not imply that features in those aspects cannot be combined to advantage. The invention is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims

1. A content recommendation method, comprising:

2. The method of claim 1, wherein the recalling of each first user original content according to the target keyword set and the target semantic vector comprises:

calculating a semantic matching score of the target semantic vector and the second semantic vector;

3. The method of claim 2, wherein the calculating of the keyword matching score of the target keyword set and the second keyword set comprises:

4. The method of claim 2, wherein the calculating the semantic matching score for the target semantic vector and the second semantic vector comprises:

calculating a distance between the target semantic vector and the second semantic vector;

5. The method of claim 2, further comprising:

acquiring a target media object of the original content of the target user;

acquiring a second media object of the second user original content;

6. The method of claim 5, wherein calculating a media object match score for the target media object and the second media object comprises:

acquiring a target user behavior vector of the target media object;

obtaining a second user behavior vector of the second media object;

and calculating the similarity between the target media object and the second media object according to the target user behavior vector and the second user behavior vector, and taking the similarity as the media object matching score.

7. The method of claim 5, wherein the recalling of the first user original content according to the media object matching score comprises:

determining designated user original content with the media object matching score being greater than or equal to a third preset threshold from the second user original content;

and recalling the designated user original content as the first user original content.

8. A content recommendation apparatus, comprising:

and a user original content recommending module, configured to determine, according to the composite score, the user original content to be recommended from among the first user original contents, and to recommend that content to the target user.

9. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.

10. An electronic device, comprising:

a processor; and

a memory for storing executable instructions of the processor;

wherein the processor is configured to perform the steps of the method of any one of claims 1 to 7 via execution of the executable instructions.