WO2020155877A1 - Information recommendation - Google Patents

Information recommendation

Info

Publication number
WO2020155877A1
WO2020155877A1 PCT/CN2019/125054 CN2019125054W WO2020155877A1 WO 2020155877 A1 WO2020155877 A1 WO 2020155877A1 CN 2019125054 W CN2019125054 W CN 2019125054W WO 2020155877 A1 WO2020155877 A1 WO 2020155877A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
recommended
user data
user
topic
Prior art date
Application number
PCT/CN2019/125054
Other languages
English (en)
French (fr)
Inventor
陈文石
王强
卢文羊
李春阳
Original Assignee
北京三快在线科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京三快在线科技有限公司
Publication of WO2020155877A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Definitions

  • the present invention relates to the field of Internet technology, in particular to an information recommendation method, device, electronic equipment and readable storage medium.
  • O2O (Online To Offline)
  • The essence of this model is to make it easier for users and services to find each other. Users can choose the offline services they need online at any time, and by mining user portraits and business information, information can also be recommended to users, thereby improving user experience and helping merchants find customers.
  • The core idea of existing information recommendation schemes is to mine similar users or similar information to be recommended, and then to recommend through similar users or to recommend similar information to users. Existing methods therefore mostly focus on optimizing the recommendation algorithm, that is, on matching users and merchants more accurately, without paying attention to the user's actual situation. As a result, the recommendation results they produce are deficient in accuracy and diversity, and their attractiveness to users is low.
  • the present invention provides an information recommendation method, device, electronic equipment, and readable storage medium to partially or completely solve the above-mentioned problems related to the information recommendation process in the prior art.
  • an information recommendation method including:
  • an information recommendation device including:
  • An association degree determination module configured to determine an association degree value between the target user and each scenario theme based on the first user data of the target user;
  • a matching degree determining module configured to determine the matching degree value between the to-be-recommended information and the scene theme according to the second user data corresponding to each to-be-recommended information
  • the recommendation information matching module is configured to determine the information to be recommended matching the target user based on the correlation degree value and the matching degree value, and send the information to be recommended to the target user.
  • an electronic device including:
  • a processor, a memory, and a computer program that is stored on the memory and can run on the processor, wherein the aforementioned information recommendation method is implemented when the program is executed by the processor.
  • a computer-readable storage medium having a computer program stored thereon, which when executed by a processor implements the steps of the aforementioned information recommendation method disclosed in the embodiments of the present application.
  • The association degree value between the target user and each scene theme can be determined based on the first user data of the target user; the matching degree value between each piece of information to be recommended and each scene theme is determined according to the second user data corresponding to that information; and, based on the association degree value and the matching degree value, the information to be recommended that matches the target user is determined and sent to the target user.
  • Fig. 1 shows a flowchart of the steps of an information recommendation method according to an embodiment of the present invention
  • FIG. 2 shows a flowchart of the steps of an information recommendation method according to an embodiment of the present invention
  • Fig. 3 shows a schematic structural diagram of an information recommendation device according to an embodiment of the present invention.
  • Fig. 4 shows a schematic structural diagram of an information recommendation device according to an embodiment of the present invention.
  • Fig. 5 schematically shows a block diagram of a computing processing device for executing the method according to the present application.
  • Fig. 6 schematically shows a storage unit for holding or carrying program codes for implementing the method according to the present application.
  • Referring to FIG. 1, a flowchart of the steps of an information recommendation method in an embodiment of the present invention is shown.
  • Step 110 Determine the degree of association between the target user and each context topic based on the first user data of the target user.
  • the first user data of the target user can be obtained, and based on the first user data, the correlation degree value of the corresponding target user with respect to each context topic is determined.
  • the first user data may include any available data related to the target user.
  • For example: user profile data, UGC (User Generated Content) data, PGC (Professionally-generated Content) data, OGC (Occupationally-generated Content) data, user location data, current POI (Point of Interest) data, and so on.
  • POI data may include, but is not limited to, POI tags, POI UGC content, POI related articles, and so on.
  • The specific content contained in the first user data can be preset according to requirements, and the first user data can be obtained by any available method, which is not limited in the embodiment of the present invention.
  • the scenario theme can also be pre-defined by any available method according to the needs.
  • the scenario theme can be set by the way defined by experts, or data mining can be carried out through a large amount of reference data to mine the scenario theme, and so on.
  • A scene theme may be expressed, for example, by at least one word.
  • For example, one scene theme may be a romantic date, another may be a dinner party, and so on.
  • any available method may be used to determine the degree of association of the corresponding target user to each scenario theme based on the first user data, which is not limited in the embodiment of the present invention.
  • the degree of matching between the first user data and each context topic may be used as the value of the degree of association between the corresponding target user and the corresponding context topic, and so on.
  • Step 120 Determine the matching degree value between the information to be recommended and the scene theme according to the second user data corresponding to each information to be recommended.
  • The second user data of different users for a certain piece of information to be recommended may characterize that piece of information to a certain extent. For example, if the second user data corresponding to a piece of information to be recommended shows that multiple users selected the location corresponding to that information as a dating location, it can be inferred that the information to be recommended matches the dating theme relatively closely.
  • Therefore, the matching degree value between each piece of information to be recommended and each scene theme can also be determined according to the second user data corresponding to that information. Specifically, the matching degree value between the information to be recommended and the scene theme may be determined based on the second user data corresponding to the information to be recommended in any available manner, which is not limited in the embodiment of the present invention.
  • The information to be recommended may be any information that can be recommended to users. For example, it may include, but is not limited to, recommendation information for at least one item, recommendation information for at least one place, recommendation information for at least one webpage, and so on.
  • the specific information to be recommended can be set according to requirements, which is not limited in the embodiment of the present invention.
  • the second user data corresponding to each recommended information may include second user data of related users who perform operations such as purchasing, browsing, and sharing with respect to the corresponding recommended information.
  • the second user data may also include, but is not limited to, the aforementioned user portrait data, UGC data, PGC data, OGC data, user positioning data, POI data, and so on.
  • the specific data type included in the second user data can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • Step 130 Based on the association degree value and the matching degree value, determine information to be recommended that matches the target user, and send the information to be recommended to the target user.
  • The aforementioned association degree value can represent the degree of correlation between the target user and each context topic, and the matching degree value can represent the degree of correlation between each piece of information to be recommended and each context topic.
  • the purpose of this program is to recommend corresponding information to the target user based on the context of the target user. Therefore, based on the association degree value and the matching degree value, the information to be recommended that matches the target user can be determined, and the corresponding information to be recommended can be sent to the corresponding target user.
  • the scene theme corresponding to the target user can be determined, and based on the matching degree value, the information to be recommended matching the corresponding scene theme can be determined, and then the information to be recommended matching the target user can be obtained; or in the implementation of the present invention
  • the determined information to be recommended can be sent to the target user by any available method, which is not limited in the embodiment of the present invention.
  • The association degree value between the target user and each scene theme can be determined based on the first user data of the target user; the matching degree value between each piece of information to be recommended and each scene theme is determined according to the second user data corresponding to that information; and, based on the association degree value and the matching degree value, the information to be recommended that matches the target user is determined and sent to the target user.
  • Referring to FIG. 2, a flowchart of the steps of an information recommendation method in an embodiment of the present invention is shown.
  • Step 210 Perform data mining based on the third user data of the referenceable user, and extract a specific scenario theme.
  • data mining can be performed based on the third user data of the referable user to extract specific contextual topics.
  • the third user data of the reference user can specifically include any user data that can be obtained for contextual topic mining. For example, user data of users on platforms such as group buying and food delivery, etc.
  • the third user data may also include, but is not limited to, the aforementioned user portrait data, UGC data, PGC data, OGC data, user positioning data, POI data, and so on.
  • the specific data type included in the third user data can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • any available method may be used to perform data mining to extract a specific scenario theme, which is not limited in the embodiment of the present invention.
  • the step 210 may further include:
  • Sub-step 211 Perform vectorization processing on the third user data of the referenceable user to obtain a multi-dimensional word vector corresponding to the third user data.
  • the third user data of the reference user may be vectorized to obtain the multi-dimensional word vector corresponding to the third user data .
  • any available vector processing method may be used to vectorize the third user data of the reference user, which is not limited in the embodiment of the present invention.
  • the doc2vec model can be used to vectorize the third user data of the reference user, or the word2vec model can be used to vectorize the third user data of the reference user, and so on.
  • the word2vec model can include a skip-gram model, a continuous bag-of-word (CBOW) model, and so on.
  • the sub-step 211 may further include:
  • Sub-step 2111 perform word segmentation processing on the third user data of the referenceable user
  • the third user data of the reference user can be segmented.
  • Any available word segmentation method can be used to segment the third user data of the reference user, which is not limited in this embodiment of the present invention.
  • Sub-step 2112 removing invalid words in the third user data after word segmentation processing, and extracting characteristic words in the third user data, where the invalid words include at least one of stop words and high-frequency words;
  • invalid words can also be ignored when generating multidimensional vectors.
  • invalid words can be further removed, and characteristic words can be extracted.
  • the invalid words can include but are not limited to Stop Words, high-frequency words, and so on.
  • stop words can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • stop words can be set by referring to existing stop words lists such as Harbin Institute of Technology stop words lists, Baidu stop words lists, etc.
  • words that are not helpful or meaningless to the business can be specially sorted out according to business needs.
  • Stop words can also include stop "sentences"; for example, for e-commerce companies, "This user did not comment." can also be set as a stop word.
  • After the invalid words are removed, the remaining segmented words can be used directly as the characteristic words; alternatively, the segmented words that remain can be filtered further to obtain the characteristic words. The specific filtering strategy can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • the word frequency range corresponding to high-frequency words can also be preset according to requirements, which is not limited by the embodiment of the present invention.
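  • The word segmentation and invalid-word removal of sub-steps 2111 and 2112 can be sketched as follows. This is a minimal illustration, not the patented implementation: whitespace splitting stands in for a real word segmenter, and the function name `extract_feature_words`, the `max_doc_ratio` threshold used to define "high-frequency", and the sample texts are all assumptions made for the example.

```python
from collections import Counter


def extract_feature_words(texts, stop_words, max_doc_ratio=0.8):
    """Segment each text, then drop stop words and high-frequency words."""
    # Assumption: whitespace tokenization stands in for a real word segmenter.
    tokenized = [text.lower().split() for text in texts]
    # Document frequency: the number of texts in which each word appears.
    doc_freq = Counter(word for words in tokenized for word in set(words))
    # Assumption: a word is "high-frequency" if it appears in more than
    # max_doc_ratio of all texts; the patent leaves this range configurable.
    limit = max_doc_ratio * len(texts)
    high_freq = {word for word, df in doc_freq.items() if df > limit}
    invalid = set(stop_words) | high_freq
    return [[w for w in words if w not in invalid] for words in tokenized]


reviews = [
    "the candlelight dinner was romantic",
    "the hot pot was great for a team dinner",
    "the flowers made the date romantic",
]
features = extract_feature_words(reviews, stop_words={"was", "a", "for"})
```

  • Here "the" is removed because it occurs in every text, while "was", "a" and "for" are removed because they are in the stop-word list; the words that remain are treated as characteristic words.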
  • a multi-dimensional word vector of the third user data is constructed based on the characteristic word.
  • a multi-dimensional word vector that can refer to the third user data of the user can be constructed based on the feature words.
  • the form of the obtained multi-dimensional word vector can be different.
  • The multi-dimensional word vector may be in the form of one-hot encoding, or in the form of TF-IDF (term frequency-inverse document frequency), and so on.
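  • As an illustration of the TF-IDF form mentioned above, the sketch below builds one vector per piece of user data from its characteristic words. The unsmoothed weighting tf × log(N/df) is one common variant chosen for the example; the patent does not specify a variant, and the function name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document.

    docs is a list of word lists (characteristic words per piece of user data).
    """
    vocab = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    # Assumption: plain idf = log(N / df), with no smoothing.
    idf = {word: math.log(n_docs / doc_freq[word]) for word in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty documents
        vectors.append([counts[word] / total * idf[word] for word in vocab])
    return vocab, vectors


vocab, vectors = tfidf_vectors([
    ["candlelight", "dinner", "romantic"],
    ["hot", "pot", "dinner"],
])
```

  • In this example "dinner" appears in both documents, so its weight is zero, while words unique to one document receive a positive weight.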
  • Sub-step 212 based on the multi-dimensional word vector, obtain the scene topic through a topic model.
  • the context topic can be obtained through topic model mining based on the multi-dimensional word vector.
  • The topic model may specifically be any available topic model, such as the LDA (Latent Dirichlet Allocation, document topic generation model) topic model, the Sentence LDA topic model, the Copula LDA topic model, and so on.
  • Step 220 Define a specific context theme according to the preset context judgment condition.
  • specific scene topics can also be defined according to preset scene judgment conditions.
  • the situation judgment conditions can be preset according to requirements, which are not limited in the embodiment of the present invention. For example, you can consider the user's actual intentions and scenes, set different situation judgment conditions, and define the romantic dating situation, define the dinner situation, and so on.
  • the scene theme can be obtained based on step 210 and/or step 220, which is not limited in the embodiment of the present invention.
  • the contextual topic is characterized by contextual topic words, and/or topic related words under the contextual topic word category.
  • In order to accurately characterize each context topic, each context topic can be set to be represented by at least one context topic word, and/or at least one topic related word under the corresponding context topic word category.
  • For example, the context topic word may be "dating", and the topic related words under that context topic word category may include "romance", "candlelight dinner", "flowers", and so on.
  • Step 230 Obtain the feature value of the first user data for each context topic word according to the probability of the context related words contained in each piece of first user data of the target user under each context topic word.
  • The above-mentioned multi-dimensional word vectors are input into the LDA model, and a word-topic matrix can then be obtained. Each word in the matrix can be understood as a topic related word in the embodiment of the present invention, and each topic can be understood as a context topic word in the present invention.
  • the probability of each context related word under the corresponding context topic word can also be obtained.
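  • To make the word-topic probabilities concrete, the sketch below estimates them with a small collapsed Gibbs sampler for LDA. The patent does not state which inference method is used, so collapsed Gibbs sampling, the hyperparameters `alpha` and `beta`, the function name `lda_word_topic_probs`, and the use of plain word lists instead of pre-built word vectors are all assumptions made for illustration.

```python
import random


def lda_word_topic_probs(docs, num_topics, iterations=200,
                         alpha=0.1, beta=0.01, seed=0):
    """Return, per topic, a dict mapping each word to its probability."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]      # topic counts per document
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics                    # words assigned per topic
    assignments = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        topics = []
        for word in doc:
            z = rng.randrange(num_topics)
            topics.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word] += 1
            topic_total[z] += 1
        assignments.append(topics)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for n, word in enumerate(doc):
                # Remove the current assignment from the counts.
                z = assignments[d][n]
                doc_topic[d][z] -= 1
                topic_word[z][word] -= 1
                topic_total[z] -= 1
                # Conditional distribution over topics for this occurrence.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][word] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][word] += 1
                topic_total[z] += 1
    # Smoothed probability of each word under each topic.
    return [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in vocab}
        for k in range(num_topics)
    ]


docs = [
    ["romance", "flowers", "candlelight"],
    ["hotpot", "friends", "beer"],
    ["romance", "candlelight", "flowers", "romance"],
    ["friends", "hotpot", "hotpot"],
]
topics = lda_word_topic_probs(docs, num_topics=2, iterations=50)
```

  • Each returned dictionary plays the role of one column of the word-topic matrix: it gives the probability of every word under one topic, and these probabilities sum to 1 per topic.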
  • Each piece of first user data of the target user may be taken as a unit, and the feature value of each piece of first user data for each context topic word is obtained according to the probabilities of the context related words contained in that piece of first user data under each context topic word.
  • The correspondence between the feature value of a piece of first user data for a context topic word and the probabilities of the context related words contained in that first user data under the context topic word can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • the step 230 may further include:
  • The first user data may contain a lot of content: at least one topic related word may appear in it, along with other irrelevant words. Therefore, in the embodiment of the present invention, in order to determine the feature value of each piece of first user data for each context topic word, the topic related words in each piece of first user data can be extracted first.
  • each piece of first user data may also be preprocessed.
  • the preprocessing can include word segmentation, invalid word removal, and so on.
  • the invalid words can also include the aforementioned high-frequency words and stop words, and so on.
  • Sub-step 232 for each context topic word, sum the probabilities of the topic related words under the context topic word to obtain the feature value of the first user data for the context topic word.
  • After the context related words contained in the first user data are obtained, in order to determine the feature value of the first user data for each context topic, the probabilities of the topic related words extracted from the first user data are summed, for each context topic word, under that context topic word, to obtain the feature value of the first user data for that context topic word.
  • From the extracted context related words, the context topic words to which they belong can be known, and hence the context topic words corresponding to the first user data. If a piece of first user data does not include any context related word under a certain context topic word, the feature value of that first user data for the context topic word is zero. Therefore, in the embodiment of the present invention, the summation may be performed only for the context topic words corresponding to the first user data: for each of them, the probabilities of the extracted topic related words under that context topic word are summed to obtain the feature value of the first user data for that context topic word.
  • The feature value can be calculated in the following way:
  • $$f_{uik} = \sum_{t=1}^{n} f_{uikt}$$
  • where n is the number of context-related words under the category of context topic word k extracted from the first user data a, and $f_{uikt}$ represents the probability of context-related word t under context topic word k; if context-related word t is not included under context topic word k, $f_{uikt}$ can be 0.
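  • The summation of sub-step 232 can be sketched as below. The function name `feature_value`, the dictionary layout of the word probabilities, and the example probabilities are assumptions for illustration; the words passed in are assumed to have already been extracted from one piece of first user data.

```python
def feature_value(extracted_words, topic_word_probs):
    """Sum the probabilities of the extracted words under one context topic word.

    topic_word_probs maps each topic related word to its probability under
    the context topic word; words not listed there contribute 0.
    """
    return sum(topic_word_probs.get(word, 0.0) for word in extracted_words)


# Hypothetical probabilities under the context topic word "dating".
dating = {"romance": 0.3, "candlelight": 0.2, "flowers": 0.1}
value = feature_value(["candlelight", "flowers", "hotpot"], dating)
```

  • In this example only "candlelight" and "flowers" belong to the "dating" category, so the feature value is the sum of their two probabilities; a piece of user data with no related words under the topic gets a feature value of zero.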
  • Step 240: Based on the feature value of the first user data under each context topic word, obtain the association degree value of the target user for the context topic represented by the context topic word.
  • After the feature value of each piece of first user data of the target user under each context topic word is obtained, the association degree value of the target user for each scene theme can be obtained based on the feature values of all the first user data of the target user under each context topic word.
  • The correspondence between the feature value and the association degree value can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • For example, the target user's association degree value for a context topic can be set to the average of the feature values of the pieces of first user data of the target user for the context topic word of that topic; or it can be set to a weighted average of those feature values, where the weights can be preset according to requirements, and so on.
  • the step 240 may further include:
  • Sub-step 241: For each contextual topic word, obtain the sum of the feature values of all first user data of the target user for the contextual topic word;
  • Each piece of first user data of the target user can characterize the user's current situation to a certain extent. Therefore, in this embodiment of the present invention, in order to determine the current degree of association of the target user with each context topic, the sum of the feature values of all the first user data of the target user for each contextual topic word is obtained.
  • For example, for context topic word k, suppose that the feature value of first user data 1 for context topic word k is n1, the feature value of first user data 2 for context topic word k is n2, and the feature value of first user data 3 for context topic word k is n3. The sum of the feature values of all the first user data of the target user for context topic word k is then n1+n2+n3.
  • the first user data acquisition condition can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • the first user data acquisition condition may be set to acquire first user data whose time difference between the publishing time and the current time is within a preset time difference, or to acquire first user data for a preset information type, and so on.
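  • A time-based acquisition condition of the kind just described might look like the following sketch. The record layout (a dictionary with `text` and `published_at` keys), the function name `select_recent`, and the one-day window are hypothetical choices for the example.

```python
from datetime import datetime, timedelta


def select_recent(records, now, max_age):
    """Keep records whose publishing time is within max_age before now."""
    return [
        record for record in records
        if timedelta(0) <= now - record["published_at"] <= max_age
    ]


now = datetime(2019, 12, 13, 12, 0)
records = [
    {"text": "candlelight dinner tonight",
     "published_at": datetime(2019, 12, 13, 9, 0)},
    {"text": "hot pot with friends",
     "published_at": datetime(2019, 12, 1, 20, 0)},
]
recent = select_recent(records, now, max_age=timedelta(days=1))
```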
  • Sub-step 242: Obtain the ratio of the sum of the feature values to the number of all the first user data, to obtain the association degree value of the target user for the context topic represented by the context topic word.
  • The average of the feature values of the pieces of first user data for the same context topic word can be used as the target user's association degree value for the corresponding context topic. The ratio of the sum of the feature values for a context topic word to the number of all the first user data then gives the target user's association degree value for the context topic represented by that context topic word.
  • For example, if the sum of the feature values of all the first user data of the target user for the contextual topic word k is n1+n2+n3, and all the first user data consists of three pieces of first user data, then the target user's association degree value for the context topic represented by context topic word k is (n1+n2+n3)/3.
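  • The averaging of sub-steps 241 and 242, together with the weighted-average alternative mentioned earlier, can be sketched as follows. The function name `association_degree` and the optional `weights` argument are assumptions made for the example.

```python
def association_degree(feature_values, weights=None):
    """Average the feature values of all first user data for one topic word.

    With weights=None this is the plain mean (n1 + n2 + ... ) / count;
    otherwise it is the weighted average of the feature values.
    """
    if not feature_values:
        return 0.0
    if weights is None:
        weights = [1.0] * len(feature_values)
    if len(weights) != len(feature_values):
        raise ValueError("one weight per feature value is required")
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(w * v for w, v in zip(weights, feature_values))
    return weighted_sum / total_weight


plain = association_degree([0.3, 0.6, 0.0])          # (n1 + n2 + n3) / 3
weighted = association_degree([1.0, 0.0], [3.0, 1.0])
```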
  • The association degree value includes a weighted summation of the short-term association degree value and/or the long-term association degree value, wherein the weight of the short-term association degree value is greater than the weight of the long-term association degree value.
  • The association degree value of the target user for each scene theme can thus also include a weighted summation of a short-term association degree value and/or a long-term association degree value.
  • The weights of the short-term and long-term association degree values can be preset according to requirements, which is not limited in this embodiment of the present invention. In practical applications, however, short-term interest generally characterizes the current interest of the target user better. Therefore, the weight of the short-term association degree value can be set to be greater than the weight of the long-term association degree value, and the sum of the two weights can also be set to 1.
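  • The association degree value of the target user u for the k-th scene theme can be expressed as:
  • $$I_{uk} = \alpha\, I_{uk}^{\text{short}} + \beta\, I_{uk}^{\text{long}}$$
  • where $I_{uk}^{\text{short}}$ and $I_{uk}^{\text{long}}$ are the short-term and long-term association degree values, and $\alpha$ and $\beta$ are the weights of the short-term association degree value and the long-term association degree value, respectively.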
  • The short-term association degree value may be the association degree value corresponding to the first user data of the target user within a preset short period of time, and usually reflects a relatively short-term, dynamic interest. The long-term interest may be the interest over at least one preset, relatively long historical period of time, and is a value determined mainly from user attributes and long-term preferences. For example, a user who is the mother of a newborn will have a potential interest in parent-child categories for a long time.
  • the specific types of user attributes corresponding to the long-term interests and the short-term association degree values of each preset shorter historical time period can be preset according to requirements, which are not limited in the embodiment of the present invention.
  • For example, the short-term association degree value can be set to the association degree value corresponding to the first user data of the target user within a preset time period before the current time, so that it reflects the user's interest in the current hour, the current day, or another relatively short period. Long-term interests can be interests derived from the user's basic attributes, such as young women's preference for beauty or young men's preference for sports and fitness; likewise, if the current user repeatedly clicks and browses hot pot merchants, it can be judged that the user is potentially interested in the hot pot category.
  • the long-term relevance degree value may be directly set as the sum of the short-term relevance degree values in at least one preset historical time period.
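  • The weighted combination of short-term and long-term values can be sketched as below, assuming the two weights sum to 1 and the short-term weight is the larger one, as the text suggests. The function name `combined_association` and the default weight of 0.7 are hypothetical.

```python
def combined_association(short_term, long_term, short_weight=0.7):
    """Weighted sum of the short-term and long-term association degree values.

    Assumption: the weights sum to 1, so the long-term weight is
    1 - short_weight, and the short-term weight must exceed 0.5 to be
    the larger of the two.
    """
    if not 0.5 < short_weight <= 1.0:
        raise ValueError("short_weight must be in (0.5, 1.0]")
    long_weight = 1.0 - short_weight
    return short_weight * short_term + long_weight * long_term


value = combined_association(short_term=0.8, long_term=0.2, short_weight=0.75)
```

  • Setting `short_weight=1.0` corresponds to using the short-term value alone, which the "and/or" wording above permits.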
  • Step 250 For each piece of information to be recommended, obtain each piece of second user data for each reference user for the information to be recommended.
  • After the association degree value of the target user for each context topic is determined, in order to recommend applicable information to the target user, it is also necessary to determine the matching degree value between each piece of information to be recommended and each context topic.
  • the referenceable user is the referenceable user corresponding to all the second user data mentioned above.
  • the second user data of the reference user u1 for the information to be recommended i is s1
  • the second user data of the reference user u2 for the information to be recommended i is s2
  • Step 260: For each contextual topic word, according to the feature values of the second user data for the contextual topic word and the number of pieces of second user data corresponding to the information to be recommended, obtain the matching degree value between the information to be recommended and the context topic represented by the contextual topic word.
  • The correspondence between the matching degree value, the feature values and the number of user data can be preset according to requirements, which is not limited in the embodiment of the present invention. For example, the matching degree value between a piece of information to be recommended and the context topic represented by a context topic word can be set to the ratio of the sum of the feature values, for that context topic word, of all the second user data corresponding to the information to be recommended to the total number of pieces of second user data corresponding to that information.
  • For example, suppose the second user data s1 for the information to be recommended i has the feature value $f_{uik}$ for the contextual topic word k. The matching degree value between the information to be recommended i and the contextual topic represented by the contextual topic word k is then:
  • $$T_{ik} = \frac{\sum_{u} f_{uik}}{N_i}$$
  • where $\sum_{u} f_{uik}$ denotes the sum of the feature values of all the second user data corresponding to the information to be recommended i for the context topic word k, and $N_i$ denotes the quantity of all the second user data corresponding to the information to be recommended i.
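  • The matching degree described above, the sum of the feature values of all second user data for one piece of information to be recommended divided by the number of those pieces of data, can be sketched as follows. The function name `matching_degrees`, the data layout, and the example probabilities are assumptions for illustration; feature values are computed here by summing word probabilities per context topic word.

```python
def matching_degrees(second_user_data, topics):
    """Matching degree of one piece of information for every context topic word.

    second_user_data: list of word lists, one per piece of second user data.
    topics: maps each context topic word to {topic related word: probability}.
    """
    count = len(second_user_data)
    if count == 0:
        return {topic: 0.0 for topic in topics}
    result = {}
    for topic, word_probs in topics.items():
        # Sum of feature values over all second user data for this topic word.
        total = sum(
            sum(word_probs.get(word, 0.0) for word in words)
            for words in second_user_data
        )
        result[topic] = total / count
    return result


topics = {
    "dating": {"romance": 0.4, "flowers": 0.2},
    "dinner party": {"hotpot": 0.5, "friends": 0.3},
}
reviews_for_item = [["romance", "flowers"], ["hotpot"], ["romance"]]
degrees = matching_degrees(reviews_for_item, topics)
```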
  • The matching degree between each piece of information to be recommended and each scene theme can also be determined in advance. Since the second user data is continuously updated, a time period can also be preset as an interval, and the matching degree of each piece of information to be recommended with each scene theme can be periodically re-determined from the second user data of the currently available referenceable users.
  • the M context topics with the highest relevance value can be selected at this time, and then only the degree of matching between each to-be-recommended information and the M context topics can be obtained. That is, it is not necessary to obtain the degree of matching between the information to be recommended and the topic of each scene.
  • Step 270: Perform normalization processing on the association degree value and the matching degree value.
  • To put the two kinds of values on a common scale, the association degree value and the matching degree value may be normalized.
  • the specific normalization processing method can be preset according to requirements, which is not limited in the embodiment of the present invention.
  • Step 280 Obtain a similarity between the target user and the information to be recommended based on the association degree value and the matching degree value.
  • the information to be recommended that the target user is interested in can be further filtered from the information to be recommended.
  • the similarity between the target user and each piece of information to be recommended may be obtained based on the association degree value and the matching degree value.
  • The similarity between the target user and each piece of to-be-recommended information can be obtained by any available similarity measure, which is not limited in the embodiment of the present invention; examples include Euclidean distance similarity, Manhattan distance similarity, Minkowski distance similarity, and cosine similarity.
  • the similarity includes cosine similarity.
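  • Taking cosine similarity as an example, the similarity between the target user u and the to-be-recommended information i is:
  • $\mathrm{sim}(u, i) = \cos(I_u, T_i) = \dfrac{\sum_{k=1}^{K} I_{uk} T_{ik}}{\sqrt{\sum_{k=1}^{K} I_{uk}^{2}}\,\sqrt{\sum_{k=1}^{K} T_{ik}^{2}}}$
  • where $I_u = [I_{u1}, I_{u2}, I_{u3}, \ldots, I_{uK}]$ and $T_i = [T_{i1}, T_{i2}, T_{i3}, \ldots, T_{iK}]$, $k \in [1, K]$; $I_{uk}$ is the association degree value of the target user u for the k-th context topic, $T_{ik}$ is the matching degree value of the to-be-recommended information i for the k-th context topic, and K is the total number of context topics.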
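As a minimal sketch of the cosine similarity option, the user's association degree values and the item's matching degree values can be treated as two vectors indexed by context topic. The function name and the choice to return 0.0 for an all-zero vector are assumptions made for this example, not details given in the source.

```python
import math


def cosine_similarity(user_vec, item_vec):
    """Cosine similarity between I_u (association degree values) and
    T_i (matching degree values), both indexed by context topic."""
    if len(user_vec) != len(item_vec):
        raise ValueError("vectors must cover the same K context topics")
    dot = sum(a * b for a, b in zip(user_vec, item_vec))
    norm_u = math.sqrt(sum(a * a for a in user_vec))
    norm_i = math.sqrt(sum(b * b for b in item_vec))
    if norm_u == 0 or norm_i == 0:
        # Assumption for this sketch: an all-zero vector carries no
        # context signal, so it is treated as not similar to anything.
        return 0.0
    return dot / (norm_u * norm_i)


# I_u and T_i over K = 3 context topics (made-up values).
print(cosine_similarity([0.5, 0.1, 0.0], [0.4, 0.0, 0.2]))
```

Cosine similarity depends only on the direction of the two vectors, which is one reason it is often paired with the normalization in step 270.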
  • Step 290 Select a preset quantity of to-be-recommended information with the highest similarity to the target user as the recommended information of the target user, and send the recommended information to the target user.
  • The preset number of pieces of to-be-recommended information with the highest similarity to the target user can be selected as the recommended information of the target user, and the recommended information is sent to the target user.
  • the preset number can also be preset according to requirements, which is not limited in the embodiment of the present invention.
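Selecting the preset number of most similar items in step 290 is a standard top-N selection. The sketch below uses the standard-library `heapq.nlargest`; the function name, item identifiers, and similarity values are hypothetical.

```python
import heapq


def top_n_recommendations(similarities, n):
    """Return the ids of the n items with the highest similarity,
    ordered from most to least similar.

    similarities: dict mapping item id -> similarity to the target user.
    """
    return heapq.nlargest(n, similarities, key=similarities.get)


scores = {"cafe": 0.2, "bistro": 0.9, "hotpot": 0.5}
print(top_n_recommendations(scores, 2))
```

The same pattern could be applied to pick the M context topics with the highest association degree values before matching degrees are computed.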
  • the user data includes at least one of user original content data and user portrait data.
  • Step 2110 Determine a target context theme that matches the target user according to the correlation degree value.
  • Step 2120 Display the information to be recommended according to the target scenario theme.
  • Step 2120 may further include: according to the target context topic, displaying the recommendation information in the to-be-recommended information that is related to the target context topic, where the recommendation information includes at least one of a recommendation reason, picture information, video information, and text information.
  • the corresponding information to be recommended can also be displayed in a personalized manner according to the context of the target user.
  • The target context topic that matches the target user can be determined according to the association degree values determined above, and the to-be-recommended information can then be displayed according to that target context topic.
  • For example, the part of the to-be-recommended information related to the target context topic can be displayed first. Alternatively, if the target user is currently in a wireless network environment, the selected to-be-recommended information can be displayed as animation or high-definition pictures, and if the target user is currently outdoors, the selected recommendation information can be presented by voice, and so on.
  • The recommendation information in the selected to-be-recommended information that is related to the target context topic can be displayed; it can include, but is not limited to, at least one of a recommendation reason matching the target context topic, picture information, video information, and text information.
  • The association degree values between the target user and each context topic can be determined based on the first user data of the target user; the matching degree values between the to-be-recommended information and the context topics are determined according to the second user data corresponding to each piece of to-be-recommended information;
  • and, based on the association degree values and the matching degree values, the to-be-recommended information that matches the target user is determined and sent to the target user.
  • data mining can be performed based on the third user data of the referenceable user to extract a specific context topic; and/or, according to preset context judgment conditions, a specific context topic can be defined.
  • vectorization processing is performed on the third user data of the referenceable user to obtain a multi-dimensional word vector corresponding to the third user data; based on the multi-dimensional word vector, the scene topic is obtained through a topic model.
  • The context topic is characterized by a context topic word and/or by topic-related words under the category of that context topic word.
  • For each piece of first user data, the topic-related words in the first user data are extracted; for each context topic word, the probabilities of those topic-related words under the context topic word are summed to obtain the feature value of the first user data for that context topic word.
  • For each context topic word, the sum of the feature values of all first user data of the target user for the context topic word is obtained; the ratio of that sum to the number of all first user data gives the association degree value of the target user for the context topic represented by the context topic word.
  • For each piece of to-be-recommended information, each piece of second user data of each referenceable user for that information is acquired; for each context topic word, the matching degree value of the to-be-recommended information for the context topic represented by the context topic word is obtained from the feature values of the second user data for the context topic word and the number of second user data corresponding to the to-be-recommended information.
  • a target context theme that matches the target user may also be determined according to the association degree value; and the information to be recommended is displayed according to the target context theme.
  • the recommendation information related to the target scenario theme in the information to be recommended is displayed; the recommendation information includes at least one of a recommendation reason, picture information, video information, and text information.
  • An information recommendation device provided by an embodiment of the present invention is described in detail.
  • Referring to FIG. 3, there is shown a schematic structural diagram of an information recommendation apparatus in an embodiment of the present invention.
  • the degree of association determination module 310 is configured to determine the degree of association value between the target user and each scene theme based on the first user data of the target user.
  • the matching degree determining module 320 is configured to determine the matching degree value between the to-be-recommended information and the context topic according to the second user data corresponding to each to-be-recommended information.
  • the recommendation information matching module 330 is configured to determine the information to be recommended that matches the target user based on the association degree value and the matching degree value, and send the information to be recommended to the target user.
  • An information recommendation device provided by an embodiment of the present invention is described in detail.
  • Referring to FIG. 4, there is shown a schematic structural diagram of an information recommendation apparatus in an embodiment of the present invention.
  • the context topic mining module 410 is used to perform data mining based on the third user data of the referenceable user to extract a specific context topic.
  • the contextual topic mining module 410 may further include:
  • the vectorization processing sub-module is configured to perform vectorization processing on the third user data of the referenceable user to obtain a multi-dimensional word vector corresponding to the third user data;
  • the scene topic mining sub-module is used to obtain the scene topic through the topic model based on the multi-dimensional word vector.
  • the vectorization processing sub-module includes:
  • the word segmentation processing unit is configured to perform word segmentation processing on the third user data of the referenceable user;
  • the feature word extraction unit is configured to remove invalid words from the third user data after word segmentation processing and to extract feature words from the third user data, where the invalid words include at least one of stop words and high-frequency words;
  • the multi-dimensional word vector constructing unit is configured to construct the multi-dimensional word vector of the third user data based on the characteristic word.
  • the scenario theme definition module 420 is used to define a specific scenario theme according to a preset scenario judgment condition.
  • The context topic is characterized by a context topic word and/or by topic-related words under the category of that context topic word.
  • the degree of association determination module 430 is configured to determine the degree of association value between the target user and each scene theme based on the first user data of the target user.
  • the correlation degree determining module 430 may further include:
  • the feature value obtaining sub-module 431 is configured to obtain the feature value of each piece of first user data of the target user for each context topic word according to the probabilities, under each context topic word, of the context-related words contained in that piece of first user data;
  • the correlation degree determination sub-module 432 is configured to obtain the correlation degree value of the target user to the context topic represented by the context topic word based on the feature value of the first user data under each context topic word.
  • the feature value obtaining submodule 431 may further include:
  • a topic-related word extraction unit configured to extract topic-related words in the first user data for each piece of first user data
  • the feature value obtaining unit is configured to sum the probabilities of the topic related words under the context topic word for each context topic word to obtain the feature value of the first user data for the context topic word.
  • the correlation degree determining submodule 432 may further include:
  • the feature value summation unit is configured to obtain, for each context topic word, the sum of the feature values of all the first user data of the target user for the context topic word;
  • the association degree determination unit is configured to obtain the ratio of the sum of the feature values to the number of all the first user data, giving the association degree value of the target user for the context topic represented by the context topic word.
  • the association degree value includes a weighted sum of a short-term association degree value and/or a long-term association degree value, where the weight of the short-term association degree value is greater than the weight of the long-term association degree value.
  • the matching degree determining module 440 is configured to determine the matching degree value between the to-be-recommended information and the scene theme according to the second user data corresponding to each to-be-recommended information.
  • the matching degree determining module 440 may further include:
  • the user data obtaining submodule 441 is configured to obtain, for each piece of information to be recommended, each piece of second user data of each reference user for the piece of information to be recommended;
  • the matching degree determination sub-module 442 is configured to obtain, for each context topic word, the matching degree value of the to-be-recommended information for the context topic represented by the context topic word according to the feature values of the second user data for the context topic word and the number of second user data corresponding to the to-be-recommended information.
  • the recommendation information matching module 450 is configured to determine the information to be recommended that matches the target user based on the association degree value and the matching degree value, and send the information to be recommended to the target user.
  • the recommendation information matching module 450 may further include:
  • the normalization processing sub-module 451 is configured to perform normalization processing on the correlation degree value and the matching degree value
  • the similarity determination sub-module 452 is configured to obtain the similarity between the target user and the information to be recommended based on the association degree value and the matching degree value;
  • the recommendation information matching submodule 453 is configured to select a preset number of to-be-recommended information with the highest similarity to the target user as the recommendation information of the target user, and send the recommendation information to the target user.
  • the similarity includes cosine similarity.
  • the user data includes at least one of user original content data and user portrait data.
  • the target context topic determining module 460 is configured to determine a target context topic matching the target user according to the correlation degree value.
  • the recommended information display module 470 is configured to display the information to be recommended according to the target context theme.
  • the recommended information display module may further include:
  • the recommendation information display sub-module is used to display, according to the target context topic, the recommendation information in the to-be-recommended information that is related to the target context topic; the recommendation information includes at least one of a recommendation reason, picture information, video information, and text information.
  • the target context theme that matches the target user may also be determined according to the correlation degree value; and the information to be recommended is displayed according to the target context theme.
  • the recommendation information related to the target scenario theme in the information to be recommended is displayed; the recommendation information includes at least one of a recommendation reason, picture information, video information, and text information. In this way, the personalized display of the recommended information is realized, and the user attractiveness of the recommended information is further improved.
  • An electronic device is also disclosed in the embodiment of the present invention, including:
  • a processor, a memory, and a computer program that is stored on the memory and can run on the processor, wherein the processor implements the aforementioned information recommendation method when executing the program.
  • the embodiment of the present invention also discloses a readable storage medium.
  • When the instructions in the storage medium are executed by the processor of the electronic device, the electronic device is enabled to execute the aforementioned information recommendation method.
  • the description is relatively simple, and for related parts, please refer to the part of the description of the method embodiment.
  • The modules, units, or components in the embodiments can be combined into one module, unit, or component, and can also be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
  • the various component embodiments of the present invention may be implemented by hardware, or by software modules running on one or more processors, or by a combination of them.
  • a microprocessor or a digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the information recommendation device according to the embodiments of the present invention.
  • the present invention can also be implemented as a device or device program (for example, a computer program and a computer program product) for executing part or all of the methods described herein.
  • Such a program for realizing the present invention may be stored on a computer-readable medium, or may have the form of one or more signals. Such signals can be downloaded from Internet websites, or provided on carrier signals, or provided in any other form.
  • FIG. 5 shows a computing processing device that can implement the method according to the present application.
  • the computing processing device traditionally includes a processor 1010 and a computer program product in the form of a memory 1020 or a computer readable medium.
  • the memory 1020 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the memory 1020 has a storage space 1030 for program codes 1031 for executing any of the method steps in the above methods.
  • the storage space 1030 for program codes may include various program codes 1031 for implementing various steps in the above method. These program codes can be read from or written into one or more computer program products.
  • These computer program products include program code carriers such as hard disks, compact disks (CDs), memory cards or floppy disks. Such computer program products are usually portable or fixed storage units as described with reference to FIG. 6.
  • the storage unit may have storage segments, storage spaces, etc. arranged similarly to the memory 1020 in the computing processing device of FIG. 5.
  • the program code can be compressed in a suitable form, for example.
  • the storage unit includes computer-readable codes 1031', that is, codes that can be read by a processor such as 1010. These codes, when run by a computing processing device, cause the computing processing device to execute the various steps of the method described above.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

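An information recommendation method and apparatus, an electronic device, and a readable storage medium. The method comprises: determining association degree values between a target user and each context topic on the basis of first user data of the target user (110); determining matching degree values between to-be-recommended information and the context topics according to second user data corresponding to each piece of to-be-recommended information (120); and, on the basis of the association degree values and the matching degree values, determining the to-be-recommended information that matches the target user and sending that information to the target user (130). This addresses the technical problem that existing information recommendation methods fall short in accuracy and diversity and are of limited appeal to users, and achieves the beneficial effect of improving the accuracy, diversity, and user appeal of recommended information.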

Description

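Information Recommendation
This application claims priority to Chinese patent application No. 201910081528.9, entitled "Information recommendation method, device, electronic equipment and readable storage medium", filed with the Chinese Patent Office on January 28, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of Internet technology, and in particular to an information recommendation method, apparatus, electronic device, and readable storage medium.
Background
With the rise of mobile Internet services, people can conveniently access the network through mobile terminals to obtain or customize the services they need, which gave rise to the O2O (Online To Offline) model. The essence of this model is to make it easier for users and services to discover each other: users can choose the offline services they need online at any time, and, by mining user portraits and merchant information, information can be recommended to users, improving the user experience and helping merchants find customers.
However, the core idea of existing information recommendation schemes is to mine similar users or similar to-be-recommended information, and then recommend through similar users or recommend similar information to the user. Existing methods therefore focus mostly on optimizing the recommendation algorithm, that is, on matching users and merchants more precisely, and pay no attention to the context the user is actually in. As a result, the recommendation results they produce fall short in accuracy and diversity and are of limited appeal to users.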
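Summary of the Invention
The present invention provides an information recommendation method, apparatus, electronic device, and readable storage medium to partly or wholly solve the above problems of information recommendation in the prior art.
According to a first aspect of the present invention, an information recommendation method is provided, comprising:
determining association degree values between a target user and each context topic based on first user data of the target user;
determining matching degree values between to-be-recommended information and the context topics according to second user data corresponding to each piece of to-be-recommended information;
determining, based on the association degree values and the matching degree values, the to-be-recommended information that matches the target user, and sending the to-be-recommended information to the target user.
According to a second aspect of the present invention, an information recommendation apparatus is provided, comprising:
an association degree determination module, configured to determine association degree values between a target user and each context topic based on first user data of the target user;
a matching degree determination module, configured to determine matching degree values between to-be-recommended information and the context topics according to second user data corresponding to each piece of to-be-recommended information;
a recommendation information matching module, configured to determine, based on the association degree values and the matching degree values, the to-be-recommended information that matches the target user, and to send the to-be-recommended information to the target user.
According to a third aspect of the present invention, an electronic device is provided, comprising:
a processor, a memory, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the aforementioned information recommendation method when executing the program.
According to a fourth aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the steps of the aforementioned information recommendation method disclosed in the embodiments of the present application.
According to the information recommendation method of the present invention, association degree values between a target user and each context topic can be determined based on first user data of the target user; matching degree values between to-be-recommended information and the context topics can be determined according to second user data corresponding to each piece of to-be-recommended information; and, based on the association degree values and the matching degree values, the to-be-recommended information that matches the target user can be determined and sent to the target user. This addresses the technical problem that existing information recommendation methods fall short in accuracy and diversity and are of limited appeal to users, and achieves the beneficial effect of improving the accuracy, diversity, and user appeal of recommended information.
The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features, and advantages of the invention are more readily apparent, specific embodiments of the invention are set out below.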
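Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and those of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 shows a flowchart of the steps of an information recommendation method according to an embodiment of the present invention;
FIG. 2 shows a flowchart of the steps of an information recommendation method according to an embodiment of the present invention;
FIG. 3 shows a schematic structural diagram of an information recommendation apparatus according to an embodiment of the present invention; and
FIG. 4 shows a schematic structural diagram of an information recommendation apparatus according to an embodiment of the present invention.
FIG. 5 schematically shows a block diagram of a computing processing device for executing the method according to the present application.
FIG. 6 schematically shows a storage unit for holding or carrying program code that implements the method according to the present application.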
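Detailed Description of the Embodiments
To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the scope of protection of the present application.
Embodiment 1
An information recommendation method provided by an embodiment of the present invention is described in detail.
Referring to FIG. 1, a flowchart of the steps of an information recommendation method in an embodiment of the present invention is shown.
Step 110: determine association degree values between the target user and each context topic based on the first user data of the target user.
In the embodiments of the present invention, when information is recommended to a user, the aim is to improve how well the recommended information matches the context the user is in, and thereby further improve the accuracy of the match between the recommended information and the user. For a target user who needs information recommendation, the first user data of the target user can be obtained, and the association degree value of the target user for each context topic can be determined based on that first user data.
The first user data can include any obtainable data related to the target user, for example user portrait data, UGC (User Generated Content) data, PGC (Professionally-generated Content) data, OGC (Occupationally-generated Content) data, user location data, current POI (Point of Interest) data, and so on. The POI data can in turn include, but is not limited to, POI tags, the UGC content of a POI, articles associated with a POI, and so on. In the embodiments of the present invention, the specific content of the first user data can be preset as required, and the first user data can be obtained by any available method; neither is limited in the embodiments of the present invention.
The context topics can likewise be predefined as required by any available method; for example, they can be set by expert definition, or mined from a large amount of reference data, and so on. A context topic can be represented, for example, by at least one word.
For example, one context topic can be set as a romantic date, another as a group dinner, and so on.
Moreover, in the embodiments of the present invention, any available method can be used to determine the degree of association of the target user with each context topic based on the first user data, which is not limited in the embodiments of the present invention. For example, the degree to which the first user data matches each context topic can be taken as the association degree value between the target user and that context topic, and so on.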
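Step 120: determine the matching degree values between the to-be-recommended information and the context topics according to the second user data corresponding to each piece of to-be-recommended information.
In practice, the second user data of different users for a piece of to-be-recommended information can characterize that information to some extent. For example, if the second user data corresponding to a piece of to-be-recommended information shows that multiple users chose the place it refers to as a dating venue, it can be inferred that the information matches the dating context topic well.
Therefore, in the embodiments of the present invention, in order to recommend to the target user information that closely matches the context the user is in, the matching degree value between each piece of to-be-recommended information and each context topic can likewise be determined according to the second user data corresponding to that information. The matching degree value can be determined from the second user data in any available way, which is not limited in the embodiments of the present invention.
Moreover, in the embodiments of the present invention, the to-be-recommended information can be any information that can be recommended to a user, including but not limited to recommendation information for at least one item, for at least one place, or for at least one web page. The specific to-be-recommended information can be set as required, which is not limited in the embodiments of the present invention.
In addition, the second user data corresponding to each piece of recommended information can include the second user data of the users who purchased, browsed, shared, or otherwise acted on that information. The second user data can also include, but is not limited to, the aforementioned user portrait data, UGC data, PGC data, OGC data, user location data, POI data, and so on. The data types included in the second user data can be preset as required, which is not limited in the embodiments of the present invention.
Step 130: based on the association degree values and the matching degree values, determine the to-be-recommended information that matches the target user, and send the to-be-recommended information to the target user.
As described above, the association degree value characterizes how relevant each context topic is to the target user, while the matching degree value characterizes how relevant each piece of to-be-recommended information is to each context topic. The purpose of this solution is to recommend information to the target user based on the context the target user is in. The to-be-recommended information that matches the target user can therefore be determined based on the association degree values and the matching degree values and then sent to the target user.
The context topic corresponding to the target user can be determined from the association degree values, and the to-be-recommended information matching that context topic can be determined from the matching degree values, which yields the to-be-recommended information matching the target user. Alternatively, in the embodiments of the present invention, the degree of association between each piece of to-be-recommended information and the target user can be obtained directly from the association degree values and the matching degree values, and the to-be-recommended information matching the target user selected on that basis, and so on.
Moreover, in the embodiments of the present invention, the determined to-be-recommended information can be sent to the target user by any available method, which is not limited in the embodiments of the present invention.
According to the information recommendation method of the present invention, association degree values between a target user and each context topic can be determined based on first user data of the target user; matching degree values between to-be-recommended information and the context topics can be determined according to second user data corresponding to each piece of to-be-recommended information; and, based on the association degree values and the matching degree values, the to-be-recommended information that matches the target user can be determined and sent to the target user. This addresses the technical problem that existing information recommendation methods fall short in accuracy and diversity and are of limited appeal to users, and achieves the beneficial effect of improving the accuracy, diversity, and user appeal of recommended information.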
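Embodiment 2
An information recommendation method provided by an embodiment of the present invention is described in detail.
Referring to FIG. 2, a flowchart of the steps of an information recommendation method in an embodiment of the present invention is shown.
Step 210: perform data mining based on third user data of referenceable users to extract specific context topics.
In practice, if users define the context topic types themselves, the topics defined will differ because users' needs differ, and different users may set different context topics for the same context. This easily leads to confusion among context topics and harms the accuracy of the recommended information. To avoid this, in the embodiments of the present invention the context topics can be defined uniformly in advance. Specifically, data mining can be performed on the third user data of referenceable users to extract specific context topics. The third user data of referenceable users can include any obtainable user data suitable for context topic mining, for example the user data of users on group-buying, food-delivery, and similar platforms.
The third user data can also include, but is not limited to, the aforementioned user portrait data, UGC data, PGC data, OGC data, user location data, POI data, and so on. The data types included in the third user data can be preset as required, which is not limited in the embodiments of the present invention.
Moreover, in the embodiments of the present invention, any available method can be used to mine the third user data of referenceable users and extract specific context topics, which is not limited in the embodiments of the present invention.
Optionally, in the embodiments of the present invention, step 210 may further include:
Sub-step 211: vectorize the third user data of the referenceable users to obtain multi-dimensional word vectors corresponding to the third user data.
In the embodiments of the present invention, to make it easier to mine context topics from the third user data of referenceable users, the third user data can be vectorized to obtain the corresponding multi-dimensional word vectors. Any available vectorization method can be used, which is not limited in the embodiments of the present invention.
For example, the third user data of referenceable users can be vectorized with a doc2vec model or with a word2vec model, and so on. The word2vec model in turn includes the skip-gram model, the continuous bag-of-words (CBOW) model, and so on.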
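Optionally, in the embodiments of the present invention, sub-step 211 may further include:
Sub-step 2111: perform word segmentation on the third user data of the referenceable users;
In practice, in order to vectorize the third user data, word segmentation can first be performed on it. Any available word segmentation method can be used, which is not limited in the embodiments of the present invention.
Sub-step 2112: remove invalid words from the segmented third user data and extract the feature words in the third user data, where the invalid words include at least one of stop words and high-frequency words;
In practice, some words in user data contribute nothing to determining context topics. Such words can be defined as invalid words and need not be considered when generating the multi-dimensional vectors. The invalid words can therefore be removed from the segmented third user data, and the feature words then extracted. The invalid words can include, but are not limited to, stop words, high-frequency words, and so on.
The specific stop words can be preset as required, which is not limited in the embodiments of the present invention. For example, they can be set with reference to existing stop-word lists such as the Harbin Institute of Technology stop-word list or the Baidu stop-word list. Moreover, for a specific business, words that are unhelpful or meaningless to that business can be compiled as needed. Stop words can even include stop "sentences"; for an e-commerce platform, for example, "This user did not leave a comment." can be set as a stop word.
For the tokens obtained by word segmentation, once the invalid words have been removed, the remaining tokens can be used directly as feature words; alternatively, the remaining tokens can be filtered further to obtain the feature words. The filtering strategy can be preset as required, which is not limited in the embodiments of the present invention.
Moreover, the word-frequency range that defines high-frequency words can also be preset as required, which is likewise not limited in the embodiments of the present invention.
Sub-step 2113: construct the multi-dimensional word vectors of the third user data based on the feature words.
After the feature words are extracted, the multi-dimensional word vectors of the third user data of the referenceable users can be constructed from them.
The form of the multi-dimensional word vectors can vary with the vectorization model used. For example, they can be in one-hot encoded form, or in TF-IDF (term frequency-inverse document frequency) form, and so on. The form can be preset as required, which is not limited in the embodiments of the present invention.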
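Sub-steps 2112 and 2113 can be illustrated with a small pure-Python sketch. It assumes the text has already been segmented into tokens, and the stop-word set, the document-frequency cutoff `max_doc_ratio`, the function names, and the sample reviews are all hypothetical choices made for this example; TF-IDF is used because the text names it as one possible vector form.

```python
import math
from collections import Counter


def extract_feature_words(docs, stop_words, max_doc_ratio):
    """Drop stop words and high-frequency words from tokenized documents.

    A word counts as high-frequency here if it occurs in more than
    max_doc_ratio of the documents (an assumed rule for this sketch).
    """
    doc_freq = Counter(w for doc in docs for w in set(doc))
    limit = max_doc_ratio * len(docs)
    return [
        [w for w in doc if w not in stop_words and doc_freq[w] <= limit]
        for doc in docs
    ]


def tf_idf_vectors(docs):
    """Build one TF-IDF vector per document over a shared vocabulary."""
    vocab = sorted({w for doc in docs for w in doc})
    doc_freq = Counter(w for doc in docs for w in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero for empty documents
        vectors.append([
            (counts[w] / total) * math.log(len(docs) / doc_freq[w])
            for w in vocab
        ])
    return vocab, vectors


# Already-segmented third user data (made-up reviews).
docs = [
    ["the", "candlelight", "dinner", "food"],
    ["a", "group", "dinner", "food"],
    ["food", "was", "romantic"],
]
features = extract_feature_words(docs, {"the", "a", "was"}, max_doc_ratio=0.7)
vocab, vectors = tf_idf_vectors(features)
print(features)
print(vocab)
```

The resulting vectors could then be passed to a topic model; for Chinese text, a word segmentation tool would have to run before this step.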
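Sub-step 212: obtain the context topics through a topic model based on the multi-dimensional word vectors.
After the multi-dimensional word vectors are obtained, the context topics can be mined from them with a topic model. The topic model can be any available topic model, for example the LDA (Latent Dirichlet Allocation) topic model, the Sentence LDA topic model, the Copula LDA topic model, and so on.
Step 220: define specific context topics according to preset context judgment conditions.
In addition, in the embodiments of the present invention, to guard against the context topic types obtained by data mining being incomplete or inaccurate, specific context topics can also be defined according to preset context judgment conditions. The context judgment conditions can be preset as required, which is not limited in the embodiments of the present invention. For example, different context judgment conditions can be set in view of users' actual intentions and scenarios, defining a romantic-date context, a group-dinner context, and so on.
It should be noted that, in the embodiments of the present invention, the context topics can be obtained through step 210 and/or step 220, which is not limited in the embodiments of the present invention.
Optionally, in the embodiments of the present invention, the context topic is characterized by a context topic word and/or by topic-related words under the category of that context topic word.
In the embodiments of the present invention, to characterize each context topic accurately, each context topic can be set to be characterized by at least one context topic word and/or by at least one topic-related word under the category of the corresponding context topic word.
For example, the context topic word of a context topic can be "dating", and the category of that context topic word contains topic-related words such as "romantic", "candlelight dinner", and "flowers".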
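Step 230: obtain the feature value of each piece of first user data of the target user for each context topic word according to the probabilities, under each context topic word, of the context-related words contained in that piece of first user data.
In the embodiments of the present invention, taking the LDA model as an example, feeding the above multi-dimensional word vectors into the LDA model yields a word-topic matrix, in which a word can be understood as a topic-related word in the embodiments of the present invention and a topic as a context topic word. The topic model also yields the probability of each context-related word under the corresponding context topic word.
To obtain the association degree value of the target user for a context topic, note that the target user may have one or more pieces of first user data, and that different pieces of first user data of the same user may differ in content. The embodiments of the present invention can therefore take each piece of first user data of the target user as a unit and obtain the feature value of each piece for each context topic word from the probabilities, under each context topic word, of the context-related words it contains.
The correspondence between the feature value of a piece of first user data for a context topic word and the probabilities, under that context topic word, of the context-related words the data contains can be preset as required, which is not limited in the embodiments of the present invention.
Optionally, in the embodiments of the present invention, step 230 may further include:
Sub-step 231: for each piece of first user data, extract the topic-related words in the first user data;
Likewise, a piece of first user data may contain a lot of content, including at least one topic-related word as well as other, unrelated words. In the embodiments of the present invention, to determine the feature value of each piece of first user data for each context topic word, the topic-related words can first be extracted from each piece.
Moreover, in the embodiments of the present invention, to improve the efficiency and accuracy of extracting topic-related words, each piece of first user data can also be preprocessed. The preprocessing can include word segmentation, removal of invalid words, and so on; the invalid words can again include the high-frequency words and stop words mentioned above.
Sub-step 232: for each context topic word, sum the probabilities of the topic-related words under the context topic word to obtain the feature value of the first user data for the context topic word.
After the context-related words contained in the first user data are obtained, the feature value of the first user data under each context topic can be determined as follows: for each context topic word, sum the probabilities, under that context topic word, of the topic-related words extracted from the first user data; the result is the feature value of the first user data for that context topic word.
Moreover, in the embodiments of the present invention, once the context-related words have been extracted from the first user data, the context topic words they correspond to are known, and hence the context topic words corresponding to that first user data. If a piece of first user data contains none of the context-related words under a context topic word, its feature value for that context topic word is zero. The summation therefore only needs to be performed for the context topic words corresponding to the first user data: for each of them, the probabilities, under that context topic word, of the extracted topic-related words belonging to it are summed to give the feature value of the first user data for that context topic word.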
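Suppose $F_{uik}$ denotes the feature value, for context topic word k, of first user data a of target user u about information i. The feature value can then be computed as follows:
$F_{uik} = \sum_{t=1}^{n} f_{uikt}$
where n is the number of context-related words extracted from first user data a that fall under context topic word k, and $f_{uikt}$ is the probability of context-related word t under context topic word k; if context topic word k does not contain context-related word t, $f_{uikt}$ can be 0.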
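The per-topic summation of word probabilities can be sketched as follows. The nested dictionary standing in for the word-topic probabilities, the function name `feature_values`, and all numbers are hypothetical; a real system would take these probabilities from its topic model.

```python
def feature_values(related_words, word_probs_by_topic):
    """Compute the feature value of one piece of first user data for
    every context topic word.

    related_words: topic-related words extracted from the user data.
    word_probs_by_topic: dict mapping context topic word k to a dict of
    word -> probability of that word under k (assumed layout).
    """
    return {
        topic_word: sum(probs.get(w, 0.0) for w in related_words)
        for topic_word, probs in word_probs_by_topic.items()
    }


# Made-up probabilities for two context topic words.
word_probs_by_topic = {
    "dating": {"romantic": 0.3, "candlelight dinner": 0.2, "flowers": 0.1},
    "dinner party": {"hotpot": 0.4, "group": 0.2},
}
values = feature_values(["romantic", "flowers", "hotpot"], word_probs_by_topic)
print(values)
```

A word that does not belong to a topic word contributes 0, which matches the convention that the probability is taken as 0 when the topic word does not contain the word.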
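Step 240: obtain the association degree value of the target user for the context topic represented by each context topic word based on the feature values of the first user data under each context topic word.
In the embodiments of the present invention, after the feature value of each piece of first user data of the target user under each context topic word is determined, the association degree value of the target user for each context topic can be obtained from the feature values of all of the target user's first user data under each context topic word.
The correspondence between the feature values and the association degree value can be preset as required, which is not limited in the embodiments of the present invention.
For example, the association degree value of the target user for a context topic can be set to the average of the feature values of the target user's pieces of first user data for the context topic word of that topic, or to a weighted average of those feature values, where the weights can be preset as required, and so on.
Optionally, in the embodiments of the present invention, step 240 may further include:
Sub-step 241: for each context topic word, obtain the sum of the feature values of all first user data of the target user for the context topic word;
As described above, each piece of first user data of the target user can characterize the user's current environment to some extent. In the embodiments of the present invention, to determine the target user's current degree of association with each context topic, the sum of the feature values of all first user data of the target user can therefore be obtained separately for each context topic word.
For example, for context topic word k, suppose the feature value of first user data 1 for k is n1, that of first user data 2 is n2, and that of first user data 3 is n3. The sum of the feature values of all first user data of the target user for context topic word k is then n1+n2+n3.
It should be noted that "all first user data" can be all first user data of the target user that can be obtained, or all first user data obtained under a preset acquisition condition, and so on. The acquisition condition can be preset as required, which is not limited in the embodiments of the present invention. For example, it can be set to obtain first user data whose publication time is within a preset time difference from the current time, or first user data for a preset information type, and so on.
Sub-step 242: obtain the ratio of the sum of the feature values to the number of all first user data to obtain the association degree value of the target user for the context topic represented by the context topic word.
To take into account how each piece of first user data characterizes the context of the target user, in the embodiments of the present invention the average of the feature values of the pieces of first user data for the same context topic word can be used as the association degree value for the corresponding context topic. In that case, the ratio of the sum of the feature values for a context topic word to the number of all first user data gives the association degree value of the target user for the context topic represented by that context topic word.
For example, for context topic word k above, the sum of the feature values of all first user data of the target user is n1+n2+n3 and there are three pieces of first user data in total, so the association degree value of the target user for the context topic represented by context topic word k is (n1+n2+n3)/3.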
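The averaging in sub-steps 241 and 242 amounts to the following sketch. The function name and the choice to return 0.0 when the user has no first user data are assumptions made for illustration.

```python
def association_degree(feature_values_for_topic):
    """Average the feature values of all pieces of first user data for
    one context topic word, as in (n1 + n2 + n3) / 3."""
    if not feature_values_for_topic:
        # Assumption for this sketch: no user data means no association.
        return 0.0
    return sum(feature_values_for_topic) / len(feature_values_for_topic)


# n1, n2, n3 for context topic word k (made-up values).
print(association_degree([0.3, 0.6, 0.0]))
```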
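Optionally, in the embodiments of the present invention, the association degree value includes a weighted sum of a short-term association degree value and/or a long-term association degree value, where the weight of the short-term association degree value is greater than the weight of the long-term association degree value.
In practice, a user may be interested in a context only for a period of time, or may be interested in some contexts over the long term. By timeliness, a user's interests can thus be divided into short-term and long-term interests, and the association degree value of the target user for each context topic can correspondingly include a weighted sum of a short-term association degree value and/or a long-term association degree value. Both weights can be preset as required, which is not limited in the embodiments of the present invention. In practice, however, short-term interest usually better reflects the target user's current interest, so the weight of the short-term association degree value can be set greater than that of the long-term association degree value, and the two weights can additionally be set to sum to 1.
The association degree value of the target user u for a context topic k can then be expressed as:
$I_{uk} = \alpha I_{uk}^{s} + \beta I_{uk}^{l}$
where $I_{uk}^{s}$ is the short-term association degree value of the target user for the context topic, $I_{uk}^{l}$ is the long-term association degree value of the target user for the context topic, and α and β are the weights of the short-term and long-term association degree values respectively.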
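The short-term association degree value may be the association degree value corresponding to the target user's first user data within a preset short period, and typically reflects relatively short-lived, dynamic interest. The long-term interest may be the interest over at least one preset longer historical period, a value determined mainly from user attributes and long-term preferences; for example, a user who is the mother of a newborn will have a latent long-term interest in the parent-child category. The specific types of user attributes corresponding to long-term interest, and the short-term association degree value of each preset shorter historical period, may be preset as required, and this embodiment places no limitation on them.
For example, the short-term association degree value may be set to the association degree value corresponding to the target user's first user data within a preset period before the current time, and may be based on the user's degree of interest in the current hour, the current day, or another short period. The long-term interest may be an interest assigned according to the user's basic attributes, such as a young woman's preference for beauty and cosmetics or a young man's preference for sports and fitness. Likewise, if the current user repeatedly clicks on and browses hotpot merchants, it can be inferred that the user has a latent interest in the hotpot category. Alternatively, the long-term association degree value may be set directly to the sum of the short-term association degree values over at least one preset historical period.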
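The weighted combination of short-term and long-term association can be illustrated as follows. This is only a sketch: the function name `combined_association` and the default weight of 0.7 are hypothetical, and the code assumes the optional constraint mentioned in the text that the two weights sum to 1, with the short-term weight being the larger one.

```python
def combined_association(short_term: float, long_term: float,
                         alpha: float = 0.7) -> float:
    """Weighted sum of short-term and long-term association degree values.

    `alpha` weights the short-term value; the long-term weight is 1 - alpha.
    """
    if not 0.5 < alpha <= 1.0:
        # Keeps the short-term weight strictly larger than the long-term one.
        raise ValueError("alpha must be in (0.5, 1.0]")
    beta = 1.0 - alpha
    return alpha * short_term + beta * long_term


if __name__ == "__main__":
    # Strong recent interest, weak long-term interest.
    print(combined_association(short_term=0.8, long_term=0.2))
```

Choosing `alpha` close to 1 makes the recommendation follow the user's most recent behaviour, while values near 0.5 let stable preferences count for almost as much.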
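Step 250: for each item of information to be recommended, obtain every piece of second user data of every reference user regarding that item.
In this embodiment of the present invention, after the association degree value of the target user for each scenario topic has been determined, the matching degree value between each item of information to be recommended and each scenario topic must also be determined in order to recommend suitable information to the target user. Specifically, the second user data corresponding to an item of information to be recommended can be consulted to determine how well the item matches each scenario topic. Therefore, for each item of information to be recommended, every piece of second user data of every reference user regarding that item is obtained first. The reference users are the users to whom all of the above second user data belongs.
For example, for information to be recommended i, suppose the second user data of reference user u1 regarding i is s1 and the second user data of reference user u2 regarding i is s2. Then, for information to be recommended i, the second user data s1 and s2 are obtained.
Step 260: for each scenario topic word, obtain the matching degree value of the information to be recommended for the scenario topic represented by that scenario topic word, according to the feature values of the second user data for the scenario topic word and the number of pieces of second user data corresponding to the information to be recommended.
The correspondence between the matching degree value on the one hand and the feature values and the number of pieces of user data on the other may be preset as required, and this embodiment places no limitation on it. For example, the matching degree value of an item of information to be recommended for the scenario topic represented by a scenario topic word may be set to the ratio of the sum of the feature values, for that scenario topic word, of all second user data corresponding to the item to the number of all second user data corresponding to the item.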
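For example, for scenario topic word $k$, suppose the feature value of a piece of second user data s1 regarding information to be recommended $i$ for that scenario topic word is $f_{uik}$. The matching degree value of information to be recommended $i$ for the scenario topic represented by scenario topic word $k$ is then:
$$T_{ik} = \frac{\sum_{u} f_{uik}}{|C_i|}$$
where $\sum_{u} f_{uik}$ denotes the sum of the feature values, for scenario topic word $k$, of all second user data corresponding to information to be recommended $i$, and $|C_i|$ denotes the number of all second user data corresponding to information to be recommended $i$.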
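The matching degree formula can be sketched in Python as below. The data layout is an assumption made for illustration: each piece of second user data is represented by a hypothetical dictionary mapping scenario topic words to the feature values already computed for that record, and `matching_degree` is a hypothetical name.

```python
from typing import Dict, List


def matching_degree(records: List[Dict[str, float]], topic: str) -> float:
    """Sum the records' feature values for `topic`, divided by the record count."""
    if not records:
        # Assumption: an item with no second user data matches no topic.
        return 0.0
    # A record with no feature value for the topic contributes 0 to the sum
    # but still counts toward the number of records.
    return sum(record.get(topic, 0.0) for record in records) / len(records)


if __name__ == "__main__":
    # Feature values of two reviews (s1, s2) of the same item.
    reviews = [{"hotpot": 0.4}, {"hotpot": 0.2, "family": 0.1}]
    print(matching_degree(reviews, "hotpot"))
    print(matching_degree(reviews, "family"))
```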
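It should also be noted that, in this embodiment of the present invention, the matching degree between each item of information to be recommended and each scenario topic may be determined in advance. Because second user data is continually updated, the matching degrees may also be re-determined periodically, at a preset time interval, from the reference users' current second user data.
Moreover, if the association degree values of the target user for the scenario topics have already been determined, the M scenario topics with the highest association degree values may be selected, so that only the matching degrees between each item of information to be recommended and those M scenario topics need to be obtained, rather than the matching degree with every scenario topic.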
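Restricting the computation to the M scenario topics with the highest association degree values could look like the following sketch. The function name `top_m_topics` and the sample values are hypothetical; `heapq.nlargest` is used only as a convenient way to take the largest entries.

```python
import heapq
from typing import Dict, List


def top_m_topics(association: Dict[str, float], m: int) -> List[str]:
    """Return the m topics with the highest association degree values."""
    # Iterating a dict yields its keys; the key function looks up each value.
    return heapq.nlargest(m, association, key=association.get)


if __name__ == "__main__":
    user_association = {"hotpot": 0.62, "family": 0.10, "fitness": 0.35}
    print(top_m_topics(user_association, 2))
```

Matching degrees then only need to be computed for the returned topics, which reduces the work per candidate item from K topics to M.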
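Step 270: normalize the association degree values and the matching degree values.
In addition, in this embodiment of the present invention, the association degree values and the matching degree values may be normalized so that the degrees of association with the scenario-related information of items are on a common scale. The specific normalization method may be preset as required, and this embodiment places no limitation on it.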
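The text leaves the normalization method open. As one possible choice, and purely as an assumption for illustration, the sketch below applies min-max scaling to a list of values; the function name `min_max_normalize` is hypothetical, and other schemes such as dividing by the vector norm would fit the description equally well.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values linearly to the range [0, 1]."""
    if not values:
        return []
    low, high = min(values), max(values)
    if high == low:
        # All values equal: no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


if __name__ == "__main__":
    print(min_max_normalize([2.0, 4.0, 6.0]))
```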
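Step 280: based on the association degree values and the matching degree values, obtain the similarity between the target user and the information to be recommended.
Once the association degree values of the target user for the scenario topics and the matching degree values of each item of information to be recommended for the scenario topics have been obtained, the items that interest the target user can be selected from the information to be recommended. To that end, the similarity between the target user and each item of information to be recommended is obtained based on the association degree values and the matching degree values. Any available similarity measure may be used, and this embodiment places no limitation on it; examples include Euclidean distance similarity, Manhattan distance similarity, Minkowski distance similarity, and cosine similarity.
Optionally, in this embodiment of the present invention, the similarity includes cosine similarity.
If the similarity is cosine similarity, the similarity between target user u and information to be recommended i is: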
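$$\cos(I_u, T_i) = \frac{|I_u \cdot T_i|}{\|I_u\|\,\|T_i\|}$$
where $I_u$ denotes the interest degree vector of the target user, $I_u = [I_{u1}, I_{u2}, I_{u3}, \ldots, I_{uK}]$; $T_i$ denotes the matching degree vector of item $i$, $T_i = [T_{i1}, T_{i2}, T_{i3}, \ldots, T_{iK}]$; $k \in [1, K]$; $I_{uk}$ denotes the association degree value of target user $u$ for the $k$-th scenario topic; $T_{ik}$ denotes the matching degree value of the information to be recommended for the $k$-th scenario topic; and $K$ denotes the total number of scenario topics.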
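The cosine similarity between the user's vector and an item's vector can be computed with the standard library alone, as in this sketch. The function name `cosine_similarity` is hypothetical, and returning 0.0 when either vector is all zeros is an assumption, since the formula is undefined in that case.

```python
import math
from typing import Sequence


def cosine_similarity(i_u: Sequence[float], t_i: Sequence[float]) -> float:
    """Cosine similarity between a user vector and an item vector."""
    if len(i_u) != len(t_i):
        raise ValueError("vectors must cover the same K scenario topics")
    dot = sum(a * b for a, b in zip(i_u, t_i))
    norm = math.sqrt(sum(a * a for a in i_u)) * math.sqrt(sum(b * b for b in t_i))
    if norm == 0.0:
        # Assumption: a zero vector carries no direction, so similarity is 0.
        return 0.0
    # The absolute value follows the formula as written in the text.
    return abs(dot) / norm


if __name__ == "__main__":
    user = [0.62, 0.10, 0.35]   # association degree values over K = 3 topics
    item = [0.30, 0.05, 0.00]   # matching degree values over the same topics
    print(cosine_similarity(user, item))
```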
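Step 290: select a preset number of items of information to be recommended that have the highest similarity to the target user as the target user's recommendation information, and send the recommendation information to the target user.
After the similarity between the target user and each item of information to be recommended has been determined, a preset number of items with the highest similarity to the target user can be selected as the target user's recommendation information and sent to the target user. The preset number may likewise be preset as required, and this embodiment places no limitation on it.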
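Selecting the preset number of most similar items is a plain ranking step, sketched below. The function name `select_recommendations` and the item identifiers are hypothetical; how the result is delivered to the user is outside the sketch.

```python
from typing import Dict, List


def select_recommendations(similarities: Dict[str, float], n: int) -> List[str]:
    """Return the n item ids with the highest similarity to the target user."""
    ranked = sorted(similarities.items(), key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in ranked[:n]]


if __name__ == "__main__":
    scores = {"item_a": 0.91, "item_b": 0.40, "item_c": 0.75}
    print(select_recommendations(scores, 2))
```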
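Optionally, in this embodiment of the present invention, the user data includes at least one of user-generated content data and user profile data.
Step 2110: determine, according to the association degree values, a target scenario topic that matches the target user.
Step 2120: display the information to be recommended according to the target scenario topic.
Optionally, in this embodiment of the present invention, step 2120 may further include: displaying, according to the target scenario topic, the recommendation information within the information to be recommended that is related to the target scenario topic, the recommendation information including at least one of a recommendation reason, picture information, video information, and text information.
In addition, in this embodiment of the present invention, the information to be recommended may be displayed in a personalized way according to the target user's scenario. Specifically, the target scenario topic matching the target user can be determined from the association degree values determined above, and the selected information to be recommended can then be displayed according to that target scenario topic. For instance, the parts of the information to be recommended that relate to the target scenario topic may be displayed first; or, if the target user is currently in a wireless network environment, the selected information may be displayed as animation or high-definition pictures, whereas if the target user is currently outdoors, the selected recommendation information may be presented by voice, and so on.
Preferably, the recommendation information related to the target scenario topic within the selected information to be recommended may be displayed; the recommendation information may include, but is not limited to, at least one of a recommendation reason, picture information, video information, and text information matching the target scenario topic.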
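According to the information recommendation method of the present invention, the association degree value between the target user and each scenario topic can be determined based on the target user's first user data; the matching degree value between each item of information to be recommended and the scenario topics can be determined according to the second user data corresponding to that item; and the information to be recommended that matches the target user can be determined based on the association degree values and the matching degree values and sent to the target user. This solves the technical problem that existing information recommendation methods are deficient in accuracy and diversity and have low appeal to users, and achieves the beneficial effect of improving the accuracy, diversity, and user appeal of the recommended information.
Moreover, in this embodiment of the present invention, data mining may be performed on third user data of reference users to extract specific scenario topics; and/or specific scenario topics may be defined according to preset scenario judgment conditions. In addition, the third user data of the reference users is vectorized to obtain the corresponding multi-dimensional word vectors, and the scenario topics are obtained from the multi-dimensional word vectors by means of a topic model. This improves the comprehensiveness and accuracy of the scenario topics, and in turn the accuracy and user appeal of the recommended information.
In addition, in this embodiment of the present invention, a scenario topic is represented by a scenario topic word and/or by the topic-related words under the category of that scenario topic word. The feature value of each piece of first user data for each scenario topic word can be obtained according to the probabilities, under each scenario topic word, of the scenario-related words contained in that piece of the target user's first user data; and the association degree value of the target user for the scenario topic represented by each scenario topic word can be obtained based on the feature values of the first user data under each scenario topic word. Further, for each piece of first user data, the topic-related words in it are extracted; for each scenario topic word, the probabilities of those topic-related words under the scenario topic word are summed to obtain the feature value of the first user data for that scenario topic word. For each scenario topic word, the sum of the feature values of all of the target user's first user data for that scenario topic word is obtained, and the ratio of that sum to the number of all first user data gives the association degree value of the target user for the scenario topic represented by the scenario topic word. Likewise, for each item of information to be recommended, every piece of second user data of every reference user regarding that item is obtained; for each scenario topic word, the matching degree value of the item for the scenario topic represented by the scenario topic word is obtained according to the feature values of the second user data for that scenario topic word and the number of pieces of second user data corresponding to the item. Based on the association degree values and the matching degree values, the similarity between the target user and each item of information to be recommended is obtained; a preset number of items with the highest similarity to the target user are selected as the target user's recommendation information and sent to the target user. This further improves the efficiency and accuracy of obtaining recommendation information.
Further, in this embodiment of the present invention, a target scenario topic matching the target user can be determined according to the association degree values, and the information to be recommended can be displayed according to the target scenario topic. According to the target scenario topic, the recommendation information related to the target scenario topic within the information to be recommended is displayed; the recommendation information includes at least one of a recommendation reason, picture information, video information, and text information. This enables personalized display of the information to be recommended and further improves the user appeal of the recommended information.
For the sake of simple description, the method embodiments are each expressed as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.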
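Embodiment 3
An information recommendation apparatus provided by an embodiment of the present invention is described in detail.
Referring to FIG. 3, a schematic structural diagram of an information recommendation apparatus in an embodiment of the present invention is shown.
An association degree determination module 310, configured to determine, based on first user data of a target user, the association degree value between the target user and each scenario topic.
A matching degree determination module 320, configured to determine, according to the second user data corresponding to each item of information to be recommended, the matching degree value between that item and the scenario topics.
A recommendation information matching module 330, configured to determine, based on the association degree values and the matching degree values, the information to be recommended that matches the target user, and to send that information to the target user.
According to the information recommendation method of the present invention, the association degree value between the target user and each scenario topic can be determined based on the target user's first user data; the matching degree value between each item of information to be recommended and the scenario topics can be determined according to the second user data corresponding to that item; and the information to be recommended that matches the target user can be determined based on the association degree values and the matching degree values and sent to the target user. This solves the technical problem that existing information recommendation methods are deficient in accuracy and diversity and have low appeal to users, and achieves the beneficial effect of improving the accuracy, diversity, and user appeal of the recommended information.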
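Embodiment 4
An information recommendation apparatus provided by an embodiment of the present invention is described in detail.
Referring to FIG. 4, a schematic structural diagram of an information recommendation apparatus in an embodiment of the present invention is shown.
A scenario topic mining module 410, configured to perform data mining based on third user data of reference users and extract specific scenario topics.
Optionally, in this embodiment of the present invention, the scenario topic mining module 410 may further include:
a vectorization sub-module, configured to vectorize the third user data of the reference users to obtain the multi-dimensional word vectors corresponding to the third user data;
a scenario topic mining sub-module, configured to obtain the scenario topics from the multi-dimensional word vectors by means of a topic model.
Optionally, in this embodiment of the present invention, the vectorization sub-module includes:
a word segmentation unit, configured to perform word segmentation on the third user data of the reference users;
a feature word extraction unit, configured to remove invalid words from the segmented third user data and extract the feature words in the third user data, the invalid words including at least one of stop words and high-frequency words;
a multi-dimensional word vector construction unit, configured to construct the multi-dimensional word vectors of the third user data based on the feature words.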
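The three units of the vectorization sub-module can be approximated with the standard library, as in the sketch below. Several simplifications are assumptions made for illustration only: whitespace splitting stands in for a real word segmenter (Chinese text would need a dedicated tokenizer), the stop-word list `STOP_WORDS` and the `max_doc_ratio` threshold for high-frequency words are hypothetical, and the "multi-dimensional word vector" is taken to be a simple bag-of-words count vector. The topic model that would consume these vectors is not shown.

```python
from collections import Counter
from typing import List, Tuple

STOP_WORDS = {"the", "a", "is"}  # hypothetical stop-word list


def build_word_vectors(docs: List[str],
                       max_doc_ratio: float = 0.9) -> Tuple[List[str], List[List[int]]]:
    """Segment, filter invalid words, and build count vectors for each document."""
    # Word segmentation unit: naive whitespace split.
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    limit = max_doc_ratio * len(docs)
    # Feature word extraction unit: drop stop words and high-frequency words.
    vocab = sorted(word for word, count in doc_freq.items()
                   if word not in STOP_WORDS and count <= limit)
    index = {word: position for position, word in enumerate(vocab)}
    # Word vector construction unit: one count vector per document.
    vectors = []
    for tokens in tokenized:
        vector = [0] * len(vocab)
        for word in tokens:
            if word in index:
                vector[index[word]] += 1
        vectors.append(vector)
    return vocab, vectors


if __name__ == "__main__":
    reviews = ["the hotpot is spicy", "kids love the park", "hotpot broth"]
    vocabulary, word_vectors = build_word_vectors(reviews)
    print(vocabulary)
    print(word_vectors)
```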
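A scenario topic definition module 420, configured to define specific scenario topics according to preset scenario judgment conditions.
Optionally, in this embodiment of the present invention, the scenario topic is represented by a scenario topic word and/or by the topic-related words under the category of that scenario topic word.
An association degree determination module 430, configured to determine, based on first user data of a target user, the association degree value between the target user and each scenario topic.
The association degree determination module 430 may further include:
a feature value acquisition sub-module 431, configured to obtain the feature value of each piece of first user data for each scenario topic word according to the probabilities, under each scenario topic word, of the scenario-related words contained in that piece of the target user's first user data;
an association degree determination sub-module 432, configured to obtain, based on the feature values of the first user data under each scenario topic word, the association degree value of the target user for the scenario topic represented by that scenario topic word.
Optionally, in this embodiment of the present invention, the feature value acquisition sub-module 431 may further include:
a topic-related word extraction unit, configured to extract, for each piece of first user data, the topic-related words in that first user data;
a feature value acquisition unit, configured to sum, for each scenario topic word, the probabilities of the topic-related words under that scenario topic word to obtain the feature value of the first user data for that scenario topic word.
Optionally, in this embodiment of the present invention, the association degree determination sub-module 432 may further include:
a feature value summation unit, configured to obtain, for each scenario topic word, the sum of the feature values of all of the target user's first user data for that scenario topic word;
an association degree determination unit, configured to obtain the ratio of the sum of feature values to the number of all first user data, which gives the association degree value of the target user for the scenario topic represented by the scenario topic word.
Optionally, in this embodiment of the present invention, the association degree value comprises a weighted sum of a short-term association degree value and/or a long-term association degree value, where the weight of the short-term association degree value is greater than the weight of the long-term association degree value.
A matching degree determination module 440, configured to determine, according to the second user data corresponding to each item of information to be recommended, the matching degree value between that item and the scenario topics.
In this embodiment of the present invention, the matching degree determination module 440 may further include:
a user data acquisition sub-module 441, configured to obtain, for each item of information to be recommended, every piece of second user data of every reference user regarding that item;
a matching degree determination sub-module 442, configured to obtain, for each scenario topic word, the matching degree value of the information to be recommended for the scenario topic represented by that scenario topic word, according to the feature values of the second user data for the scenario topic word and the number of pieces of second user data corresponding to the information to be recommended.
A recommendation information matching module 450, configured to determine, based on the association degree values and the matching degree values, the information to be recommended that matches the target user, and to send that information to the target user.
In this embodiment of the present invention, the recommendation information matching module 450 may further include:
a normalization sub-module 451, configured to normalize the association degree values and the matching degree values;
a similarity determination sub-module 452, configured to obtain, based on the association degree values and the matching degree values, the similarity between the target user and the information to be recommended;
a recommendation information matching sub-module 453, configured to select a preset number of items of information to be recommended that have the highest similarity to the target user as the target user's recommendation information, and to send the recommendation information to the target user.
Optionally, in this embodiment of the present invention, the similarity includes cosine similarity.
Optionally, in this embodiment of the present invention, the user data includes at least one of user-generated content data and user profile data.
A target scenario topic determination module 460, configured to determine, according to the association degree values, a target scenario topic that matches the target user.
A recommendation information display module 470, configured to display the information to be recommended according to the target scenario topic.
Optionally, in this embodiment of the present invention, the recommendation information display module may further include:
a recommendation information display sub-module, configured to display, according to the target scenario topic, the recommendation information within the information to be recommended that is related to the target scenario topic; the recommendation information includes at least one of a recommendation reason, picture information, video information, and text information.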
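According to the information recommendation method of the present invention, the association degree value between the target user and each scenario topic can be determined based on the target user's first user data; the matching degree value between each item of information to be recommended and the scenario topics can be determined according to the second user data corresponding to that item; and the information to be recommended that matches the target user can be determined based on the association degree values and the matching degree values and sent to the target user. This solves the technical problem that existing information recommendation methods are deficient in accuracy and diversity and have low appeal to users, and achieves the beneficial effect of improving the accuracy, diversity, and user appeal of the recommended information.
Moreover, in this embodiment of the present invention, data mining may be performed on third user data of reference users to extract specific scenario topics; and/or specific scenario topics may be defined according to preset scenario judgment conditions. In addition, the third user data of the reference users is vectorized to obtain the corresponding multi-dimensional word vectors, and the scenario topics are obtained from the multi-dimensional word vectors by means of a topic model. This improves the comprehensiveness and accuracy of the scenario topics, and in turn the accuracy and user appeal of the recommended information.
In addition, in this embodiment of the present invention, a scenario topic is represented by a scenario topic word and/or by the topic-related words under the category of that scenario topic word. The feature value of each piece of first user data for each scenario topic word can be obtained according to the probabilities, under each scenario topic word, of the scenario-related words contained in that piece of the target user's first user data; and the association degree value of the target user for the scenario topic represented by each scenario topic word can be obtained based on the feature values of the first user data under each scenario topic word. Further, for each piece of first user data, the topic-related words in it are extracted; for each scenario topic word, the probabilities of those topic-related words under the scenario topic word are summed to obtain the feature value of the first user data for that scenario topic word. For each scenario topic word, the sum of the feature values of all of the target user's first user data for that scenario topic word is obtained, and the ratio of that sum to the number of all first user data gives the association degree value of the target user for the scenario topic represented by the scenario topic word. Likewise, for each item of information to be recommended, every piece of second user data of every reference user regarding that item is obtained; for each scenario topic word, the matching degree value of the item for the scenario topic represented by the scenario topic word is obtained according to the feature values of the second user data for that scenario topic word and the number of pieces of second user data corresponding to the item. Based on the association degree values and the matching degree values, the similarity between the target user and each item of information to be recommended is obtained; a preset number of items with the highest similarity to the target user are selected as the target user's recommendation information and sent to the target user. This further improves the efficiency and accuracy of obtaining recommendation information.
Further, in this embodiment of the present invention, a target scenario topic matching the target user can be determined according to the association degree values, and the information to be recommended can be displayed according to the target scenario topic. According to the target scenario topic, the recommendation information related to the target scenario topic within the information to be recommended is displayed; the recommendation information includes at least one of a recommendation reason, picture information, video information, and text information. This enables personalized display of the information to be recommended and further improves the user appeal of the recommended information.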
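An embodiment of the present invention further discloses an electronic device, including:
a processor, a memory, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the foregoing information recommendation method when executing the program.
An embodiment of the present invention further discloses a readable storage medium; when instructions in the storage medium are executed by a processor of an electronic device, the electronic device is enabled to perform the foregoing information recommendation method.
Since the apparatus embodiments are basically similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the above description, the structure required to construct such a system is apparent. Furthermore, the present invention is not directed to any particular programming language. It should be understood that the content of the present invention described herein may be implemented in a variety of programming languages, and the above description of a specific language is given to disclose the best mode of the present invention.
Numerous specific details are set forth in the description provided herein. It will be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the present disclosure and aid in understanding one or more of the various inventive aspects, in the above description of exemplary embodiments of the present invention, various features of the present invention are sometimes grouped together into a single embodiment, figure, or description thereof. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the present invention.
Those skilled in the art will understand that the modules in the device of an embodiment may be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components in the embodiments may be combined into one module or unit or component, and they may furthermore be divided into a plurality of sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art should understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the information recommendation device according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
For example, FIG. 5 shows a computing processing device that can implement the method according to the present application. The computing processing device conventionally includes a processor 1010 and a computer program product or computer-readable medium in the form of a memory 1020. The memory 1020 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. The memory 1020 has a storage space 1030 for program code 1031 for performing any of the method steps of the above method. For example, the storage space 1030 for program code may include individual program codes 1031 for implementing the various steps of the above method respectively. These program codes may be read from or written into one or more computer program products. These computer program products include program code carriers such as a hard disk, a compact disc (CD), a memory card, or a floppy disk. Such a computer program product is typically a portable or fixed storage unit as described with reference to FIG. 6. The storage unit may have storage segments, storage space, and the like arranged similarly to the memory 1020 in the computing processing device of FIG. 5. The program code may, for example, be compressed in an appropriate form. Typically, the storage unit includes computer-readable code 1031', that is, code readable by a processor such as 1010, which, when run by the computing processing device, causes the computing processing device to perform the steps of the method described above.
It should be noted that the above embodiments illustrate rather than limit the present invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any order. These words may be interpreted as names.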

Claims (17)

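  1. An information recommendation method, characterized by comprising:
    determining, based on first user data of a target user, an association degree value between the target user and each scenario topic;
    determining, according to second user data corresponding to each item of information to be recommended, a matching degree value between the information to be recommended and the scenario topics;
    determining, based on the association degree values and the matching degree values, the information to be recommended that matches the target user.
  2. The method according to claim 1, characterized in that, before the step of determining, based on first user data of a target user, an association degree value between the target user and each scenario topic, the method further comprises:
    performing data mining based on third user data of reference users to extract specific scenario topics;
    and/or defining specific scenario topics according to preset scenario judgment conditions.
  3. The method according to claim 2, characterized in that the step of performing data mining based on third user data of reference users to extract specific scenario topics comprises:
    vectorizing the third user data of the reference users to obtain multi-dimensional word vectors corresponding to the third user data;
    obtaining the scenario topics from the multi-dimensional word vectors by means of a topic model.
  4. The method according to claim 3, characterized in that the step of vectorizing the third user data of the reference users to obtain multi-dimensional word vectors corresponding to the third user data comprises:
    performing word segmentation on the third user data of the reference users;
    removing invalid words from the segmented third user data and extracting feature words in the third user data, the invalid words including at least one of stop words and high-frequency words;
    constructing the multi-dimensional word vectors of the third user data based on the feature words.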
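  5. The method according to any one of claims 1 to 3, characterized in that the scenario topic is represented by a scenario topic word and/or by topic-related words under the category of the scenario topic word.
  6. The method according to claim 5, characterized in that the step of determining, based on first user data of a target user, an association degree value between the target user and each scenario topic comprises:
    obtaining a feature value of each piece of first user data for each scenario topic word according to the probabilities, under each scenario topic word, of the scenario-related words contained in that piece of the target user's first user data;
    obtaining, based on the feature values of the first user data under each scenario topic word, the association degree value of the target user for the scenario topic represented by the scenario topic word.
  7. The method according to claim 6, characterized in that the step of obtaining a feature value of each piece of first user data for each scenario topic word according to the probabilities, under each scenario topic word, of the scenario-related words contained in that piece of the target user's first user data comprises:
    for each piece of first user data, extracting the topic-related words in the first user data;
    for each scenario topic word, summing the probabilities of the topic-related words under the scenario topic word to obtain the feature value of the first user data for the scenario topic word.
  8. The method according to claim 6, characterized in that the step of obtaining, based on the feature values of the first user data under each scenario topic word, the association degree value of the target user for the scenario topic represented by the scenario topic word comprises:
    for each scenario topic word, obtaining the sum of the feature values of all of the target user's first user data for the scenario topic word;
    obtaining the ratio of the sum of feature values to the number of all first user data, to obtain the association degree value of the target user for the scenario topic represented by the scenario topic word.
  9. The method according to claim 6 or 8, characterized in that the association degree value comprises a weighted sum of a short-term association degree value and/or a long-term association degree value, wherein the weight of the short-term association degree value is greater than the weight of the long-term association degree value.
  10. The method according to claim 5, characterized in that the step of determining, according to second user data corresponding to each item of information to be recommended, a matching degree value between the information to be recommended and the scenario topics comprises:
    for each item of information to be recommended, obtaining every piece of second user data of every reference user regarding the information to be recommended;
    for each scenario topic word, obtaining the matching degree value of the information to be recommended for the scenario topic represented by the scenario topic word according to the feature values of the second user data for the scenario topic word and the number of pieces of second user data corresponding to the information to be recommended.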
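  11. The method according to claim 1, characterized in that the step of determining, based on the association degree values and the matching degree values, the information to be recommended that matches the target user, and sending the information to be recommended to the target user, comprises:
    obtaining, based on the association degree values and the matching degree values, the similarity between the target user and the information to be recommended;
    selecting a preset number of items of information to be recommended that have the highest similarity to the target user as the target user's recommendation information, and sending the recommendation information to the target user.
  12. The method according to claim 11, further comprising, before the step of obtaining, based on the association degree values and the matching degree values, the similarity between the target user and the information to be recommended:
    normalizing the association degree values and the matching degree values.
  13. The method according to claim 1, characterized in that, after the step of determining, based on the association degree values and the matching degree values, the information to be recommended that matches the target user, and sending the information to be recommended to the target user, the method further comprises:
    determining, according to the association degree values, a target scenario topic that matches the target user;
    displaying the information to be recommended according to the target scenario topic.
  14. The method according to claim 13, characterized in that the step of displaying the information to be recommended according to the target scenario topic comprises:
    displaying, according to the target scenario topic, the recommendation information within the information to be recommended that is related to the target scenario topic, the recommendation information including at least one of a recommendation reason, picture information, video information, and text information.
  15. An information recommendation apparatus, characterized by comprising:
    an association degree determination module, configured to determine, based on first user data of a target user, an association degree value between the target user and each scenario topic;
    a matching degree determination module, configured to determine, according to second user data corresponding to each item of information to be recommended, a matching degree value between the information to be recommended and the scenario topics;
    a recommendation information matching module, configured to determine, based on the association degree values and the matching degree values, the information to be recommended that matches the target user, and to send the information to be recommended to the target user.
  16. An electronic device, characterized by comprising:
    a processor, a memory, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the information recommendation method according to any one of claims 1 to 14.
  17. A computer-readable storage medium, characterized in that a computer program is stored thereon, and the program, when executed by a processor, implements the steps of the information recommendation method according to any one of claims 1 to 14.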
PCT/CN2019/125054 2019-01-28 2019-12-13 Information recommendation WO2020155877A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910081528.9A CN109829108B (zh) 2019-01-28 2019-01-28 Information recommendation method and apparatus, electronic device, and readable storage medium
CN201910081528.9 2019-01-28

Publications (1)

Publication Number Publication Date
WO2020155877A1 true WO2020155877A1 (zh) 2020-08-06

Family

ID=66862709

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/125054 WO2020155877A1 (zh) 2019-01-28 2019-12-13 Information recommendation

Country Status (2)

Country Link
CN (1) CN109829108B (zh)
WO (1) WO2020155877A1 (zh)

Also Published As

Publication number Publication date
CN109829108B (zh) 2020-12-04
CN109829108A (zh) 2019-05-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19913038

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19913038

Country of ref document: EP

Kind code of ref document: A1