CN112948678B - Article recall method and system and article recommendation method and system - Google Patents

Article recall method and system and article recommendation method and system

Info

Publication number
CN112948678B
Authority
CN
China
Prior art keywords
article
importance
determining
under
articles
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110220837.7A
Other languages
Chinese (zh)
Other versions
CN112948678A (en)
Inventor
翟丁丁
林苏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Fangjianghu Technology Co Ltd
Original Assignee
Beijing Fangjianghu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Fangjianghu Technology Co Ltd
Priority to CN202110220837.7A
Publication of CN112948678A
Application granted
Publication of CN112948678B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02W CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO WASTEWATER TREATMENT OR WASTE MANAGEMENT
    • Y02W90/00 Enabling technologies or technologies with a potential or indirect contribution to greenhouse gas [GHG] emissions mitigation

Abstract

The invention relates to the technical field of data mining and discloses an article recall method and system and an article recommendation method and system. Articles are pre-divided into a plurality of primary classifications and a plurality of secondary classifications under each primary classification. The article recall method comprises: determining the importance of each secondary classification according to the number of articles, the average number of article labels and the article click rate under that secondary classification; determining, according to the importance, a diversity ratio parameter and the target number of articles to be recommended, a first number of secondary classifications ranked highest by importance to be recalled and a second number of articles to be recalled under each of them; and recalling the second number of articles ranked highest by matching score under each of the first number of secondary classifications ranked highest by importance, according to preset weights of preset attributes corresponding to the plurality of primary classifications and a set of matching degrees between the articles and the user portrait. The recall strategy of the invention achieves both diversity and accuracy of the recalled articles.

Description

Article recall method and system and article recommendation method and system
Technical Field
The invention relates to the technical field of data mining, in particular to an article recall method, an article recall system, an article recommendation method and an article recommendation system.
Background
At present, there are many methods for improving the diversity of shopping guide articles in the field of shopping guide recommendation, and different business scenarios call for a wide variety of business rules and solutions. The main solutions include: the traditional expert rule method, in which the proportion of results returned from each category is set manually from experience and a suitable setting is found through online experiments; the machine learning model prediction method, in which a learning model is built, relevant features are constructed, and the model predicts which set is optimal; and the diversity measurement method, in which the diversity of different combinations is expressed by constructing embedding vectors. These methods are widely used in article recommendation systems. In a house shopping guide scenario, however, a house is a high-priced, low-frequency purchase, and users visit infrequently and are retained at a low rate. If coarse-grained recommendations are made using the experience-based method, it is difficult to attract and retain users. The machine learning model prediction method can achieve a better recommendation effect, but its development cost is higher.
Disclosure of Invention
The invention aims to provide an article recall method and system and an article recommendation method and system which, by combining weighted scattering over two-level classifications with recommendation based on user portraits, enable the recommendation strategy to take into account both the diversity and the accuracy of articles, thereby helping users to better identify their own needs, improving user experience and reducing cost.
To achieve the above object, a first aspect of the present invention provides an article recall method, in which articles are pre-divided into a plurality of primary classifications and a plurality of secondary classifications under each primary classification. The article recall method comprises: determining the importance of each secondary classification according to the number of articles, the average number of article labels and the article click rate under that secondary classification; determining, according to the importance of each secondary classification, a diversity ratio parameter and the target number of articles to be recommended, a first number of secondary classifications ranked highest by importance to be recalled and a second number of articles to be recalled under each of those secondary classifications; and, in response to a query action of a user, recalling the second number of articles ranked highest by matching score under each of the first number of secondary classifications ranked highest by importance, according to preset weights of preset attributes corresponding to the plurality of primary classifications and a set of matching degrees between the articles under each of those secondary classifications and the user portrait of the user.
Preferably, said determining the importance of each secondary classification includes: determining the article richness under each secondary classification according to the article quantity under each secondary classification and the average quantity of article labels; and determining the importance of each secondary classification according to the article richness and the article click rate under each secondary classification.
Preferably, determining the importance of each secondary classification according to the article richness and the article click rate under each secondary classification comprises: determining the importance of each secondary classification according to the product of the article richness and the article click rate under that secondary classification.
Preferably, the article click rate is determined from the article click amount, the article exposure amount and the exposure threshold under each secondary classification.
Preferably, determining the article click rate from the article click amount, the article exposure amount and the exposure threshold under each secondary classification comprises: when the article exposure amount is greater than or equal to the exposure threshold, the article click rate is the ratio of the article click amount to the article exposure amount; or, when the article exposure amount is smaller than the exposure threshold, the article click rate is the ratio of the article click amount to the exposure threshold.
Preferably, determining the first number of secondary classifications ranked highest by importance to be recalled and the second number of articles to be recalled under each of them comprises: determining the first number of secondary classifications ranked highest by importance according to the importance of each secondary classification, the diversity ratio parameter, the target number of articles to be recommended and the number of the plurality of secondary classifications; and determining the second number of articles to be recalled under each of those secondary classifications according to the target number of articles to be recommended and the first number.
Preferably, determining the first number of secondary classifications ranked highest by importance comprises: determining the first number $K$ according to the diversity ratio parameter $a$, the target number $C$ of articles to be recommended, the number $N$ of the secondary classifications, and the rule that $K = N$ when $N < a \times C$ or $K = a \times C$ (taken as an integer) when $N > a \times C$; and determining the $K$ secondary classifications ranked highest by importance according to the importance of each secondary classification.
Preferably, recalling the second number of articles ranked highest by matching score under each of the first number of secondary classifications ranked highest by importance comprises: in response to the query action of the user, determining a matching score between each article under each of those secondary classifications and the user portrait, according to the preset weights of the preset attributes corresponding to the plurality of primary classifications and the set of matching degrees between the articles under each of those secondary classifications and the user portrait of the user; and recalling, according to the matching scores, the second number of articles ranked highest by matching score under each of those secondary classifications.
Preferably, determining the matching score between each article and the user portrait under each of the first number of secondary classifications ranked highest by importance comprises: determining a matching score $score_{ij}$ between the preset attribute $j$ in article $i$ and the preset attribute $j$ in the user portrait, $score_{ij} = k + wt_j \times m_{ij}$, according to the preset weight $wt_j$ of the preset attribute $j$ corresponding to the primary classification $j$ and the matching degree $m_{ij}$ between the preset attribute $j$ in article $i$ under each of those secondary classifications and the preset attribute $j$ in the user portrait; and calculating, from the matching scores $score_{ij}$, a matching score $Score$ between article $i$ under each of those secondary classifications and the user portrait, wherein $k$ is a constant and $T$ is the total number of preset attributes in article $i$.
Preferably, the degree of matching between the preset attribute j in the article i and the preset attribute j in the user portrait is determined by: under the condition that the preset attribute j in the article i is matched with the preset attribute j in the user portrait, determining that the matching degree is 1; or if the preset attribute j in the article i is not matched with the preset attribute j in the user portrait, determining that the matching degree is 0.
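As an illustration of the binary matching degree and the per-attribute score $score_{ij} = k + wt_j \times m_{ij}$ described above, a minimal Python sketch might look as follows. The function names, the dictionary representation of articles and user portraits, the use of equality as the test for "matched", and the default $k = 1.0$ are all assumptions made for illustration. The sketch stops at the per-attribute scores and does not combine them into the overall matching score.

```python
def matching_degree(article, portrait, attribute):
    """Return 1 if the attribute matches between article and portrait, else 0.

    Assumption: "matched" means both sides carry the attribute with equal values.
    """
    if attribute not in article or attribute not in portrait:
        return 0
    return 1 if article[attribute] == portrait[attribute] else 0


def attribute_scores(article, portrait, weights, k=1.0):
    """Compute score_ij = k + wt_j * m_ij for every preset attribute j.

    `weights` maps each preset attribute to its preset weight wt_j.
    The value of the constant k is an assumption of this sketch.
    """
    return {
        attribute: k + wt * matching_degree(article, portrait, attribute)
        for attribute, wt in weights.items()
    }


# Hypothetical article, user portrait and attribute weights.
article = {"price": "3-4M", "district": "Chaoyang"}
portrait = {"price": "3-4M", "district": "Haidian"}
weights = {"price": 0.6, "district": 0.4}

scores = attribute_scores(article, portrait, weights)
print(scores)
```

In this example the price attribute matches and contributes its weight on top of $k$, while the district attribute does not match and contributes only $k$.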
Through the above technical solution, the invention creatively determines the importance of each secondary classification from the number of articles, the average number of article labels and the article click rate; then determines, according to the importance of each secondary classification, the diversity ratio parameter and the target number of articles to be recommended, the first number of secondary classifications ranked highest by importance and the second number of articles to be recalled under each of them; and finally recalls the second number of articles ranked highest by matching score from each of the first number of secondary classifications ranked highest by importance, based on the weights of the preset attributes corresponding to the plurality of primary classifications and the matching between the articles and the user portrait.
A second aspect of the present invention provides an article recommendation method, comprising: recalling, based on the above article recall method, the second number of articles ranked highest by matching score under each of the first number of secondary classifications ranked highest by importance; screening a target number of articles from the recalled articles by a weighted fusion method; and recommending the target number of articles to the user.
Through the above technical solution, the invention creatively recalls, based on the article recall method, the second number of articles ranked highest by matching score under each of the first number of secondary classifications ranked highest by importance, and then recommends articles selected from the recalled articles to the user.
A third aspect of the present invention provides an article recall system, in which articles are pre-divided into a plurality of primary classifications and a plurality of secondary classifications under each primary classification. The article recall system comprises: first determining means for determining the importance of each secondary classification according to the number of articles, the average number of article labels and the article click rate under that secondary classification; second determining means for determining, according to the importance of each secondary classification, the diversity ratio parameter and the target number of articles to be recommended, a first number of secondary classifications ranked highest by importance to be recalled and a second number of articles to be recalled under each of those secondary classifications; and recall means for recalling, in response to a query action of a user, the second number of articles ranked highest by matching score under each of the first number of secondary classifications ranked highest by importance, according to preset weights of preset attributes corresponding to the plurality of primary classifications and a set of matching degrees between the articles under each of those secondary classifications and the user portrait of the user.
Preferably, the first determining means comprises: a richness determining module for determining the article richness under each secondary classification according to the number of articles and the average number of article labels under that secondary classification; and an importance determining module for determining the importance of each secondary classification according to the article richness and the article click rate under that secondary classification.
Preferably, the importance determining module determining the importance of each secondary classification according to the article richness and the article click rate under each secondary classification comprises: determining the importance of each secondary classification according to the product of the article richness and the article click rate under that secondary classification.
Preferably, the article recall system further comprises: third determining means for determining the article click rate according to the article click amount, the article exposure amount and the exposure threshold under each secondary classification.
Preferably, the third determining means determining the article click rate comprises: when the article exposure amount is greater than or equal to the exposure threshold, the article click rate is the ratio of the article click amount to the article exposure amount; or, when the article exposure amount is smaller than the exposure threshold, the article click rate is the ratio of the article click amount to the exposure threshold.
Preferably, the second determining means comprises: a ranking determining module for determining the first number of secondary classifications ranked highest by importance according to the importance of each secondary classification, the diversity ratio parameter, the target number of articles to be recommended and the number of the plurality of secondary classifications; and a quantity determining module for determining the second number of articles to be recalled under each of those secondary classifications according to the target number of articles to be recommended and the first number.
Preferably, the ranking determining module comprises: a classification quantity determining unit for determining the first number $K$ according to the diversity ratio parameter $a$, the target number $C$ of articles to be recommended, the number $N$ of the secondary classifications, and the rule that $K = N$ when $N < a \times C$ or $K = a \times C$ (taken as an integer) when $N > a \times C$; and a ranking determining unit for determining the $K$ secondary classifications ranked highest by importance according to the importance of each secondary classification.
Preferably, the recall means comprises: a score determining module for determining, in response to the query action of the user, a matching score between each article under each of the first number of secondary classifications ranked highest by importance and the user portrait, according to the preset weights of the preset attributes corresponding to the plurality of primary classifications and the set of matching degrees between the articles under each of those secondary classifications and the user portrait; and a recall module for recalling, according to the matching scores, the second number of articles ranked highest by matching score under each of those secondary classifications.
Preferably, the score determining module comprises: a matching score determining unit for determining a matching score $score_{ij}$ between the preset attribute $j$ in article $i$ and the preset attribute $j$ in the user portrait, $score_{ij} = k + wt_j \times m_{ij}$, according to the preset weight $wt_j$ of the preset attribute $j$ corresponding to the primary classification $j$ and the matching degree $m_{ij}$ between the preset attribute $j$ in article $i$ under each of the first number of secondary classifications ranked highest by importance and the preset attribute $j$ in the user portrait; and a score determining unit for calculating, from the matching scores $score_{ij}$, a matching score $Score$ between article $i$ under each of those secondary classifications and the user portrait, wherein $k$ is a constant and $T$ is the total number of preset attributes in article $i$.
Preferably, the score determining module further comprises: a matching degree determining unit, configured to determine a matching degree between a preset attribute j in the article i and a preset attribute j in the user portrait by: under the condition that the preset attribute j in the article i is matched with the preset attribute j in the user portrait, determining that the matching degree is 1; or if the preset attribute j in the article i is not matched with the preset attribute j in the user portrait, determining that the matching degree is 0.
Specific details and benefits of the article recall system provided by the present invention can be found in the above description of the article recall method, and are not repeated here.
A fourth aspect of the present invention provides an article recommendation system, comprising: the above article recall system, for recalling the second number of articles ranked highest by matching score under each of the first number of secondary classifications ranked highest by importance; screening means for screening a target number of articles from the recalled articles by a weighted fusion method; and recommending means for recommending the target number of articles to the user.
Specific details and benefits of the article recommendation system provided by the present invention can be found in the above description of the article recommendation method, and are not repeated here.
A fifth aspect of the present invention provides a machine-readable storage medium having instructions stored thereon for causing a machine to perform the above-described article recall method and the above-described article recommendation method.
A sixth aspect of the present invention provides an electronic apparatus, comprising: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instruction from the memory, and execute the instruction to implement the above-mentioned article recall method and the above-mentioned article recommendation method.
Additional features and advantages of the invention will be set forth in the detailed description which follows.
Drawings
The accompanying drawings are included to provide a further understanding of embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain, without limitation, the embodiments of the invention. In the drawings:
FIG. 1 is a flow chart of an article recall method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of determining the importance of each secondary classification provided by an embodiment of the invention;
FIG. 3 is a flow chart of determining the top K secondary classifications by importance and the number M of articles to be recalled under each of them, provided by an embodiment of the present invention;
FIG. 4 is a flow chart of recalling the top M articles by matching score under each of the top K secondary classifications by importance, provided by an embodiment of the present invention;
FIG. 5 is a flow chart of determining a matching score between each article and the user portrait under each of the top K secondary classifications by importance, provided by an embodiment of the present invention; and
FIG. 6 is a schematic diagram of an article recommendation process provided by an embodiment of the present invention.
Detailed Description
The following describes specific embodiments of the present invention in detail with reference to the drawings. It should be understood that the detailed description and specific examples, while indicating and illustrating the invention, are not intended to limit the invention.
FIG. 1 is a flow chart of an article recall method according to an embodiment of the present invention. The articles are divided in advance, according to preset attributes, into a plurality of primary classifications and a plurality of secondary classifications under each primary classification. For example, articles within the same city dimension may be classified by means of exploratory data analysis (EDA), as shown in FIG. 6. Specifically, the articles can be divided by content into primary classifications such as business district, urban district, residential community, price, room layout and floor area; each primary classification can then be divided into finer classifications, e.g., the primary classification of price can be divided into secondary classifications such as different price intervals (or different price reduction intervals).
As shown in FIG. 1, the article recall method may include the following steps S101-S103.
Step S101, determining the importance of each secondary classification according to the number of articles, the average number of article labels and the article click rate under each secondary classification.
For step S101, the determining the importance of each secondary classification may include the following steps S201-S202, as shown in FIG. 2.
Step S201, determining the article richness under each secondary classification according to the number of articles and the average number of article labels under each secondary classification.
An article label may refer to a feature label of the article content, for example, business district, urban district, residential community, price, room layout, school district, proximity to a subway, and the like.
The article richness $richTag_j$ under each secondary classification $j$ can be calculated from the number of articles $article_{cnt_j}$ and the average number of article labels $avg(tag_{cnt_j})$ under the secondary classification $j$ by the following formula (1):
$richTag_j = article_{cnt_j} \times avg(tag_{cnt_j})$, (1)
where the average number $avg(tag_{cnt_j})$ is the mean of the number of article labels of each article under the secondary classification $j$.
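Formula (1) can be sketched in a few lines of Python. The function name and the input format (a list with one label count per article in the secondary classification) are assumptions made for illustration, as is the choice to return zero for an empty classification.

```python
from statistics import mean


def article_richness(tag_counts):
    """Formula (1): number of articles times the mean label count per article.

    `tag_counts` holds one entry per article under the secondary
    classification: the number of labels attached to that article.
    """
    if not tag_counts:
        # Assumption: a classification without articles has zero richness.
        return 0.0
    return len(tag_counts) * mean(tag_counts)


# Hypothetical secondary classification with three articles
# carrying 2, 4 and 3 labels respectively.
print(article_richness([2, 4, 3]))
```

Because the article count multiplies the per-article average, the richness equals the total number of labels attached to articles under the secondary classification.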
Step S202, determining importance of each secondary classification according to the article richness and the article click rate under each secondary classification.
For step S202, determining the importance of each secondary classification according to the article richness and the article click rate under each secondary classification may include: determining the importance of the secondary classification $j$ according to the product of the article richness and the article click rate under the secondary classification $j$.
Specifically, the importance (i.e. importance weight) $weight_j$ of the secondary classification $j$ can be calculated from the article richness $richTag_j$ and the article click rate $ctr_j$ under the secondary classification $j$ by the following formula (2):
$weight_j = richTag_j \times ctr_j$, (2)
The article click rate is determined from the article click amount, the article exposure amount and the exposure threshold under each secondary classification.
Specifically, determining the article click rate from the article click amount, the article exposure amount and the exposure threshold under each secondary classification may include: when the article exposure amount is greater than or equal to the exposure threshold, the article click rate is the ratio of the article click amount to the article exposure amount; or, when the article exposure amount is smaller than the exposure threshold, the article click rate is the ratio of the article click amount to the exposure threshold.
For example, the article click rate $ctr_j$ under each secondary classification $j$ can be calculated from the number of article clicks $click_{cnt_j}$ and the number of exposures $expo_{cnt_j}$ under the secondary classification $j$, the ten-digit threshold value $threshold$ of the exposure (the thresholds of the secondary classifications may be equal), and the following formula (3):
$ctr_j = \begin{cases} click_{cnt_j} / expo_{cnt_j}, & expo_{cnt_j} \ge threshold \\ click_{cnt_j} / threshold, & expo_{cnt_j} < threshold \end{cases}$, (3)
That is, a smoothing calculation is introduced for long-tail, low-frequency article classifications, so that articles in those classifications still have a chance of being exposed (the exposure probability of the different classifications can be interpreted and controlled), and the Matthew effect in recommendation is avoided.
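The thresholded click rate described above can be sketched as follows. The function and parameter names are hypothetical; the logic follows the two cases stated in the text, dividing the click amount by the exposure amount when the exposure amount reaches the threshold and by the threshold otherwise.

```python
def smoothed_click_rate(clicks, exposures, threshold):
    """Click rate of a secondary classification with an exposure floor.

    Below the threshold the denominator is the threshold itself, so a
    sparsely exposed classification cannot get an inflated click rate
    from a handful of impressions.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    denominator = exposures if exposures >= threshold else threshold
    return clicks / denominator


# Hypothetical figures: one well-exposed and one long-tail classification.
print(smoothed_click_rate(clicks=5, exposures=100, threshold=50))
print(smoothed_click_rate(clicks=5, exposures=10, threshold=50))
```

With a threshold of 50, the long-tail classification (5 clicks on 10 exposures) is scored as 5/50 rather than 5/10, which keeps its click rate comparable with that of classifications that have enough exposure.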
The step S101 integrates the indexes such as the article richness and the click rate, and designs different weights for the secondary classification of the articles, thereby avoiding the homogenization of the display content.
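To make step S101 concrete, the following Python sketch computes the importance of each secondary classification as the product of its article richness and article click rate, as stated for step S202, and ranks the classifications by that value. The classification names and the numbers are hypothetical.

```python
def importance(richness, click_rate):
    """Importance of a secondary classification: richness times click rate."""
    return richness * click_rate


# Hypothetical per-classification statistics: (article richness, click rate).
stats = {
    "price_low": (120.0, 0.02),
    "near_subway": (45.0, 0.10),
    "school_district": (30.0, 0.05),
}

weights = {name: importance(rich, ctr) for name, (rich, ctr) in stats.items()}

# Secondary classifications ordered from most to least important.
ranked = sorted(weights, key=weights.get, reverse=True)
print(ranked)
```

In this example a classification with fewer but better-clicked articles ("near_subway") outranks a larger classification with a low click rate ("price_low"), which is the balancing effect the product is meant to provide.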
Step S102, determining, according to the importance of each secondary classification, the diversity ratio parameter and the target number of articles to be recommended, a first number of secondary classifications ranked highest by importance to be recalled and a second number of articles to be recalled under each of those secondary classifications.
For step S102, determining the first number of secondary classifications ranked highest by importance to be recalled and the second number of articles to be recalled under each of them may include the following steps S301-S302, as shown in FIG. 3.
Step S301, determining the first number of secondary classifications ranked highest by importance according to the importance of each secondary classification, the diversity ratio parameter, the target number of articles to be recommended, and the number of the plurality of secondary classifications.
For step S301, determining the first number of top-ranked secondary classifications by importance may include: determining the first number K according to the diversity ratio parameter a, the target number C of articles to be recommended, and the number N of the plurality of secondary classifications, where a ∈ (0, 1); and determining the top K secondary classifications by importance according to the importance of each secondary classification.
Taking articles classified within the same city dimension (for example, city A) as an example, N is the number of secondary classifications in city A; C is the target number of articles to be recommended; and a is an adjustable threshold (the larger a is, the higher the requirement for diversity; for example, when C is 100, the number of classifications covered by the recalled articles can be controlled by adjusting a). Different K values can therefore be adopted according to the article distribution in different cities and the current display scenario: when N is less than a × C, K = N; when N is greater than a × C, K is determined according to a, C and N. For example, if N = 40, a = 0.5 and C = 100, then N < a × C = 50, so K = 40. After K is determined, the top K secondary classifications (i.e., the TopK secondary classifications) may be selected based on the importance of each secondary classification determined in step S101. That is, the greater the importance of a secondary classification, the greater the probability that it will be selected.
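The selection in step S301 can be sketched in Python as follows. Only the branch N < a × C is stated explicitly in this text, so the sketch does not guess the other branch: the caller passes that value in through the hypothetical parameter `k_otherwise`. All function and parameter names are assumptions made for illustration.

```python
def first_number(n, a, c, k_otherwise):
    """Return K, the number of secondary classifications to recall from.

    n: number of secondary classifications (N)
    a: diversity ratio parameter, expected in (0, 1)
    c: target number of articles to recommend (C)
    k_otherwise: value of K for the branch not spelled out in the text
                 (hypothetical parameter, supplied by the caller)
    """
    if not 0 < a < 1:
        raise ValueError("a must lie in (0, 1)")
    if n < a * c:
        return n
    return k_otherwise


def top_k_classifications(importance, k):
    """Return the k classification names with the highest importance."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:k]


# Worked example from the text: N = 40, a = 0.5, C = 100 gives K = 40.
k = first_number(n=40, a=0.5, c=100, k_otherwise=None)
```

For instance, `top_k_classifications({"x": 0.3, "y": 0.9, "z": 0.5}, 2)` keeps `"y"` and `"z"`, the two most important classifications.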
Step S302: determine the second number of articles to be recalled under each of the first number of top-ranked secondary classifications by importance, according to the target number of articles to be recommended and the first number.
The articles under each of the K selected secondary classifications are recalled. The second number M of articles to be recalled under each secondary classification may be calculated by the following formula (4):

$$M = \left\lceil \frac{C}{K} \right\rceil \quad (4)$$

That is, C/K is rounded up (for example, if C = 100 and K = 40, then M = 3). In other words, the same number of articles (here, 3) is recalled under each secondary classification, to ensure uniform exposure of the secondary classifications.
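The rounding-up step can be checked with a few lines of Python. This is a minimal sketch; the function name `second_number` is hypothetical.

```python
import math


def second_number(c, k):
    """Return M, the number of articles recalled per secondary classification."""
    if k <= 0:
        raise ValueError("k must be positive")
    return math.ceil(c / k)


m = second_number(c=100, k=40)  # 100 / 40 = 2.5, rounded up
total_recalled = 40 * m         # recall pool across all K classifications
```

Because of the rounding up, the recall pool (K × M articles) is at least as large as the target number C, which leaves room for the later screening step.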
Step S103: in response to a query action by the user, recall the second number of top-ranked articles by matching score under each of the first number of top-ranked secondary classifications by importance, according to the preset weights occupied by the preset attributes corresponding to the plurality of primary classifications and the set of matching degrees between the articles under each of those secondary classifications and the user portrait of the user.
The user portrait consists of a set of attribute values, namely the values the user takes on each preset attribute of the articles (prior information that can be obtained in advance through other channels, such as the user's searches). For example, user u1 likes two-room homes in the Xi'erqi business district of Haidian District priced within 5 million (i.e., 500 ten-thousands), so the user portrait is {"urban area": Haidian, "business district": Xi'erqi, "price": (0,500), "living room": 2}; user u2 likes school-district housing in Xicheng District priced below 5 million, so the user portrait is {"urban area": Xicheng, "price": (0,500), "tag": school district}.
The mapping between each user's user portrait and user identification may be stored in a storage device. For example, for user u1 (u1 being the identification of that user), the mapping may be stored in the format {user u1: user portrait p1}. In response to a query action by a user, the user portrait of that user may be looked up in the storage device according to the user identification.
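As an illustration of the {user identification: user portrait} mapping, the following Python sketch uses an in-memory dictionary in place of the storage device. The key names, the romanized place names and the helper `get_portrait` are illustrative assumptions; the text does not prescribe a storage format beyond the mapping itself.

```python
# In-memory stand-in for the storage device: user id -> user portrait.
portrait_store = {
    "u1": {
        "urban area": "Haidian",
        "business district": "Xi'erqi",
        "price": (0, 500),  # price range, in units of ten thousand
        "living room": 2,
    },
    "u2": {
        "urban area": "Xicheng",
        "price": (0, 500),
        "tag": "school district",
    },
}


def get_portrait(store, user_id):
    """Look up a user's portrait; unknown users get an empty portrait."""
    return store.get(user_id, {})


p1 = get_portrait(portrait_store, "u1")
```

Returning an empty portrait for an unknown user is a design choice of this sketch: such a user simply matches no attribute in the later scoring step.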
For step S103, recalling the second number of top-ranked articles by matching score under each of the first number of top-ranked secondary classifications by importance may include the following steps S401-S402, as shown in FIG. 4.
Step S401: determine the matching score between each article under each of the first number of top-ranked secondary classifications by importance and the user portrait, according to the preset weights occupied by the preset attributes corresponding to the plurality of primary classifications and the set of matching degrees between each such article and the user portrait.
The set of matching degrees may be the set of matching degrees between each preset attribute of an article under each of the first number of top-ranked secondary classifications by importance and the corresponding preset attribute in the user portrait. For example, suppose that, for an article of one of the K selected secondary classifications, the preset attributes {"urban area", "business district", "price", "living room"} take the values {"urban area": Haidian, "business district": Xi'erqi, "price": (0,500], "living room": 2}. If the matching degree between the article's preset attribute "urban area" (with its value) and "urban area": Haidian is $m_1$, that between the preset attribute "business district" and "business district": Xi'erqi is $m_2$, that between the preset attribute "price" and "price": (0,500] is $m_3$, and that between the preset attribute "living room" and "living room": 2 is $m_4$, then the set of matching degrees is $\{m_1, m_2, m_3, m_4\}$.
For step S401, determining the matching score between each article under each of the first number of top-ranked secondary classifications by importance and the user portrait includes the following steps S501-S502, as shown in FIG. 5.
Step S501: determine the matching score $score_{ij}$ between preset attribute j in article i and preset attribute j in the user portrait, according to the preset weight $wt_j$ occupied by the preset attribute j corresponding to primary classification j and the matching degree $m_{ij}$ between preset attribute j in article i (under each of the first number of top-ranked secondary classifications by importance) and preset attribute j in the user portrait:

$$score_{ij} = k + wt_j \cdot m_{ij}$$

where k is a constant and may be, for example, 1 (as in formula (5) below). The matching degree between preset attribute j in article i and preset attribute j in the user portrait may be determined as follows: if preset attribute j in article i matches preset attribute j in the user portrait, the matching degree is 1; if it does not match, the matching degree is 0.
Then, according to the matching degree $m_{ij}$, the preset weight $wt_j$ occupied by preset attribute j, and the following formula (5), the matching score $score_{ij}$ between article i under each secondary classification and the user portrait is determined for preset attribute j:

$$score_{ij} = 1 + wt_j \cdot m_{ij} \quad (5)$$
According to business experience, weights can be defined for the different dimensions (i.e., the different preset attributes) of the articles, for example, business district: 0.15; urban area: 0.25; community: 0.15; price: 0.25; living room: 0.1; area: 0.1. If an attribute of the article matches the preference in the user portrait, the weight of that attribute is added to the base score of 1; otherwise, the score defaults to 1.
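The per-attribute scoring described above can be sketched in Python. The weights are the example values from the text; the key names, the helper `attribute_scores`, and the use of plain equality as the match test are assumptions of this sketch (the text only says that an attribute either matches, giving degree 1, or does not, giving degree 0).

```python
# Example weights per preset attribute (values taken from the text).
WEIGHTS = {
    "business district": 0.15,
    "urban area": 0.25,
    "community": 0.15,
    "price": 0.25,
    "living room": 0.1,
    "area": 0.1,
}


def attribute_scores(article, portrait, weights=WEIGHTS, k=1.0):
    """Compute score_ij = k + wt_j * m_ij for each attribute of one article.

    Assumption: an attribute matches (m_ij = 1) when the portrait holds
    the same value for it; a missing or different value gives m_ij = 0.
    Attributes the article does not carry are skipped.
    """
    scores = {}
    for attr, weight in weights.items():
        if attr not in article:
            continue
        matched = attr in portrait and article[attr] == portrait[attr]
        scores[attr] = k + weight * (1 if matched else 0)
    return scores


article = {"urban area": "Haidian", "business district": "Xi'erqi",
           "price": (0, 500), "living room": 3}
portrait = {"urban area": "Haidian", "business district": "Xi'erqi",
            "price": (0, 500), "living room": 2}
scores = attribute_scores(article, portrait)
```

In this example the article matches the portrait on urban area, business district and price, so those attributes score above 1, while "living room" stays at the base score of 1.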
Step S502: according to the matching scores $score_{ij}$ and the following formula (6), calculate the matching score $Score_i$ between article i, under each of the first number of top-ranked secondary classifications by importance, and the user portrait.
Here, T is the total number of preset attributes in article i; if the preset attributes in article i include business district, urban area, community, price, living room and area, then T = 6.
That is, the final matching score is calculated by comprehensively considering the matching degree of the article and the user portrait in each dimension in step S502. Therefore, under the first K secondary classifications, articles which are matched with the user preference best can be recalled under each secondary classification based on the user portrait, so that the diversity of shopping guide contents can be ensured on the premise of being consistent with the user preference.
Step S402: according to the matching scores, recall the second number of top-ranked articles by matching score under each of the first number of top-ranked secondary classifications by importance.
For each of the top K secondary classifications by importance, the articles may be sorted in descending order of the matching score $Score_i$ between each article i and the user portrait; the articles whose matching scores rank in the top M (TopM) are then recalled.
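The TopM recall within each selected classification can be sketched as below. The sketch takes the final matching scores $Score_i$ as given inputs and does not compute them; the function name and the data layout (classification name mapped to a dictionary of article id and score) are assumptions.

```python
def recall_top_m(scores_by_class, m):
    """For each classification, keep the m article ids with the highest score.

    scores_by_class: {classification: {article_id: Score_i}}
    """
    recalled = {}
    for cls, scores in scores_by_class.items():
        ranked = sorted(scores, key=scores.get, reverse=True)
        recalled[cls] = ranked[:m]
    return recalled


queues = recall_top_m(
    {
        "school district": {"a1": 1.2, "a2": 1.6, "a3": 1.4, "a4": 1.0},
        "new homes": {"b1": 1.1, "b2": 1.3},
    },
    m=3,
)
```

A classification holding fewer than M articles, such as "new homes" here, simply contributes all of its articles.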
Specifically, as shown in FIG. 6, in the above embodiments the importance (importance weight) of each secondary classification is first determined according to the number of articles, the average number of article labels (article tags), and the click rate. A weighted secondary-classification recall strategy is then provided: according to the importance of each secondary classification, the top first number K of secondary classifications by importance and the second number M of articles to be recalled under each of them are determined, and the corresponding number of articles is recalled. The idea of scattering is thereby introduced, so that articles under each secondary classification have a chance of being recalled, which ensures the fairness and diversity of article exposure. Meanwhile, the embodiments consider the dimensions of the articles under the secondary classifications, assign a weight to each dimension, and introduce these weights into the matching score calculation according to the interaction between the user portrait and the article classification, so that articles are recalled according to their matching scores. This improves the correlation between the recalled articles and the user's preferences and can further improve recommendation accuracy.
In summary, the present invention creatively determines the importance of each secondary classification from the number of articles, the average number of article labels, and the article click rate; then determines the first number of top-ranked secondary classifications by importance, and the second number of articles to be recalled under each of them, according to the importance of each secondary classification, the diversity ratio parameter, and the target number of articles to be recommended; and finally recalls, from each of those secondary classifications, the second number of top-ranked articles by matching score, according to the preset weights of the preset attributes corresponding to the plurality of primary classifications and the matching between the articles and the user portrait. By weighting and scattering the secondary classifications and combining the user portrait, the recall strategy can take into account both the diversity and the accuracy of the recalled articles.
The embodiment of the invention also provides an article recommendation method, which may comprise the following steps: recalling, based on the article recall method, the second number of top-ranked articles by matching score under each of the first number of top-ranked secondary classifications by importance; screening a target number of articles from the recalled articles by a weighted fusion method; and recommending the target number of articles to the user.
That is, after recalling K (e.g., 40) queues of articles (e.g., each queue including 3 articles), the queues are weighted and fused to filter out a target number (e.g., 100) of articles for recommendation to the user. The weighted fusion method in this embodiment may refer to the existing weighted fusion method, and will not be described herein.
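Since the weighted fusion step is left to existing methods, the following Python sketch shows only one possible variant and should not be read as the method used in this embodiment. It assumes that each queue has a weight (for example, the importance of its secondary classification), that an article's fused score is the queue weight multiplied by its matching score, and that the highest fused scores are kept. All names are hypothetical.

```python
import heapq


def weighted_fusion(queues, queue_weight, target):
    """Merge per-classification queues and keep `target` articles.

    queues: {classification: [(article_id, match_score), ...]}
    queue_weight: {classification: weight of that queue}
    Assumption: fused score = queue weight * matching score.
    """
    candidates = [
        (queue_weight[cls] * score, article_id)
        for cls, queue in queues.items()
        for article_id, score in queue
    ]
    best = heapq.nlargest(target, candidates)
    return [article_id for _, article_id in best]


picked = weighted_fusion(
    queues={"a": [("a1", 1.5), ("a2", 1.2)], "b": [("b1", 1.4), ("b2", 1.1)]},
    queue_weight={"a": 1.0, "b": 2.0},
    target=3,
)
```

With K = 40 queues of 3 articles each, the same call with `target=100` would trim the 120 recalled articles down to the 100 to be recommended.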
In summary, the present invention creatively recalls, based on the article recall method, the second number of top-ranked articles by matching score under each of the first number of top-ranked secondary classifications by importance; then screens a target number of articles from the recalled articles by a weighted fusion method; and finally recommends the target number of articles to the user, so that the diversity of the recommended articles can be improved on the premise of matching the user's preferences.
The embodiment of the invention also provides an article recall system, wherein the articles are divided into the following groups in advance according to preset attributes: a plurality of first-level classifications; and a plurality of secondary classifications under each of the primary classifications, the article recall system comprising: the first determining device is used for determining the importance of each secondary classification according to the number of articles, the average number of article labels and the article click rate under each secondary classification; the second determining device is used for determining a first number of secondary classifications before importance ranking to be recalled and a second number of articles to be recalled under each of the first number of secondary classifications before importance ranking according to the importance of each secondary classification, the diversity proportion parameter and the target number of articles to be recommended; and recall means for recalling, in response to a query action by a user, a second number of articles before ranking of the matching scores under each of the first number of secondary classifications before ranking of the importance, based on preset weights occupied by preset attributes corresponding to the plurality of primary classifications and a set of matches between articles under each of the first number of secondary classifications before ranking of the importance and user portraits of the user.
Specific details and benefits of the article recall system provided by the present invention can be found in the above description of the article recall method, and are not repeated here.
The embodiment of the invention also provides an article recommendation system, which comprises: the article recall system, for recalling the second number of top-ranked articles by matching score under each of the first number of top-ranked secondary classifications by importance; screening means for screening a target number of articles from the recalled articles by a weighted fusion method; and recommending means for recommending the target number of articles to the user.
The specific details and benefits of the article recommendation system provided in the present invention can be referred to the description of the article recommendation method, and are not repeated herein.
An embodiment of the present invention further provides a machine-readable storage medium having instructions stored thereon for causing a machine to perform the above-described article recall method and the above-described article recommendation method.
An embodiment of the present invention further provides an electronic device, including: a processor; a memory for storing the processor-executable instructions; the processor is configured to read the executable instruction from the memory, and execute the instruction to implement the above-mentioned article recall method and the above-mentioned article recommendation method.
The foregoing details of the optional implementation of the embodiment of the present invention have been described in detail with reference to the accompanying drawings, but the embodiment of the present invention is not limited to the specific details of the foregoing implementation, and various simple modifications may be made to the technical solution of the embodiment of the present invention within the scope of the technical concept of the embodiment of the present invention, and these simple modifications all fall within the protection scope of the embodiment of the present invention.
In addition, the specific features described in the above embodiments may be combined in any suitable manner without contradiction. In order to avoid unnecessary repetition, various possible combinations of embodiments of the present invention are not described in detail.
Those skilled in the art will appreciate that all or part of the steps of the methods in the embodiments described above may be implemented by a program stored in a storage medium, the program including instructions for causing a single-chip microcomputer, a chip or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In addition, any combination of various embodiments of the present invention may be performed, so long as the concept of the embodiments of the present invention is not violated, and the disclosure of the embodiments of the present invention should also be considered.

Claims (22)

1. An article recall method is characterized in that articles are divided into: a plurality of first-level classifications; and a plurality of secondary classifications under each of the primary classifications, the article recall method comprising:
determining importance of each secondary classification according to the number of articles under each secondary classification, the average number of article labels and the article click rate;
determining a first number of secondary classifications before importance ranking to be recalled and a second number of articles to be recalled under each of the first number of secondary classifications before importance ranking according to the importance of each secondary classification, the diversity ratio parameters and the target number of articles to be recommended; and
in response to a query action by a user, recalling a second number of articles before ranking of the matching scores under each of the first number of secondary classifications before ranking of the importance, according to the preset weights occupied by the preset attributes corresponding to the plurality of primary classifications and the set of matching degrees between the articles under each of the first number of secondary classifications before ranking of the importance and the user portrait of the user,
wherein the determining a first number of secondary classifications before importance ranking to recall and a second number of articles to recall under each of the first number of secondary classifications before importance ranking comprises:
determining a first number of secondary classifications before ranking of the importance according to the importance of each secondary classification, the diversity ratio parameter, the target number of articles to be recommended and the number of the plurality of secondary classifications; and
determining a second number of articles to be recalled under each of the first number of secondary classifications before the importance ranking according to the target number of articles to be recommended and the first number.
2. The article recall method of claim 1 wherein the determining the importance of each secondary category comprises:
determining the article richness under each secondary classification according to the article quantity under each secondary classification and the average quantity of article labels; and
determining the importance of each secondary classification according to the article richness and the article click rate under each secondary classification.
3. The article recall method of claim 2 wherein determining the importance of each secondary category based on the article richness and the article click rate under each secondary category comprises:
determining the importance of each secondary classification according to the product of the article richness and the article click rate under each secondary classification.
4. The article recall method of claim 1 wherein the article click rate is determined by the article click count, the article exposure, and an exposure threshold under each of the secondary classifications.
5. The article recall method of claim 4 wherein determining the article click rate by the article click count, the article exposure, and the exposure threshold under each of the secondary classifications comprises:
when the article exposure is greater than or equal to the exposure threshold, the article click rate is the ratio of the article click count to the article exposure; or
when the article exposure is smaller than the exposure threshold, the article click rate is the ratio of the article click count to the exposure threshold.
6. The article recall method of claim 1 wherein the determining the first number of secondary classifications before importance ranking comprises:
determining the first number K according to the diversity ratio parameter a, the target number C of the articles to be recommended, and the number N of the plurality of secondary classifications; and
determining a first number of secondary classifications before ranking the importance according to the importance of each secondary classification.
7. The article recall method of claim 1 wherein the recalling the second number of articles before ranking the matching score under each of the first number of two-level classifications before ranking the importance comprises:
determining a matching score between each article and the user portrait under each of the first number of secondary classifications before the importance ranking according to a preset weight occupied by a preset attribute corresponding to the plurality of primary classifications and a set of matching degrees between the article and the user portrait under each of the first number of secondary classifications before the importance ranking; and
based on the match scores, recall a second number of articles before the match scores under each of the first number of secondary classifications before the importance ranking.
8. The article recall method of claim 7 wherein the determining a match score between each article under each of the first number of secondary classifications before the importance ranking and the user representation comprises:
determining a matching score $score_{ij}$ between a preset attribute j in article i and the preset attribute j in the user portrait, according to the preset weight $wt_j$ occupied by the preset attribute j corresponding to the primary classification j and a degree of match $m_{ij}$ between the preset attribute j in article i, under each of the first number of secondary classifications before the importance ranking, and the preset attribute j in the user portrait:
$score_{ij} = k + wt_j \cdot m_{ij}$; and
calculating, according to the matching score $score_{ij}$, a matching score $Score_i$ between the article i under each of the first number of secondary classifications before the importance ranking and the user portrait,
wherein k is a constant; and T is the total number of preset attributes in the article i.
9. The article recall method of claim 8 wherein the degree of match between a preset attribute j in the article i and a preset attribute j in the user representation is determined by:
determining that the matching degree is 1 when the preset attribute j in the article i matches the preset attribute j in the user portrait; or
determining that the matching degree is 0 when the preset attribute j in the article i does not match the preset attribute j in the user portrait.
10. An article recommendation method, characterized in that the article recommendation method comprises:
recalling, based on the article recall method of any one of claims 1-9, a second number of articles before ranking of the matching scores under each of the first number of secondary classifications before ranking of the importance;
screening a target number of articles from the second number of articles before the ranking of the matching scores under each of the first number of secondary classifications before the ranking of the recalled importance by adopting a weighted fusion method; and
recommending the target number of articles to the user.
11. An article recall system, wherein the articles are pre-divided into: a plurality of first-level classifications; and a plurality of secondary classifications under each of the primary classifications, the article recall system comprising:
the first determining device is used for determining the importance of each secondary classification according to the number of articles, the average number of article labels and the article click rate under each secondary classification;
the second determining device is used for determining a first number of secondary classifications before importance ranking to be recalled and a second number of articles to be recalled under each of the first number of secondary classifications before importance ranking according to the importance of each secondary classification, the diversity proportion parameter and the target number of articles to be recommended; and
Recall means for recalling, in response to a query action by a user, a second number of articles before ranking of the matching scores under each of the first number of secondary classifications before ranking of importance, based on preset weights occupied by preset attributes corresponding to the plurality of primary classifications and a set of matches between the articles under each of the first number of secondary classifications before ranking of importance and a representation of the user,
wherein the second determining means includes:
the ranking determining module is used for determining a first number of secondary classifications before ranking the importance according to the importance of each secondary classification, the diversity ratio parameter, the target number of articles to be recommended and the number of the plurality of secondary classifications; and
a quantity determining module for determining a second number of articles to be recalled under each of the first number of secondary classifications before the importance ranking according to the target number of articles to be recommended and the first number.
12. The article recall system of claim 11 wherein the first determining means comprises:
the richness determining module is used for determining the richness of the articles under each secondary classification according to the number of the articles under each secondary classification and the average number of the article labels; and
an importance determining module for determining the importance of each secondary classification according to the article richness and the article click rate under each secondary classification.
13. The article recall system of claim 12 wherein the importance determination module is configured to determine the importance of each secondary category based on the article richness and the article click rate under each secondary category comprising:
determining the importance of each secondary classification according to the product of the article richness and the article click rate under each secondary classification.
14. The article recall system of claim 11 wherein the article recall system further comprises: and the third determining device is used for determining the article click rate according to the article click quantity, the article exposure quantity and the exposure quantity threshold value under each secondary classification.
15. The article recall system of claim 14 wherein the third means for determining the article click rate comprises:
when the article exposure is greater than or equal to the exposure threshold, the article click rate is the ratio of the article click count to the article exposure; or
when the article exposure is smaller than the exposure threshold, the article click rate is the ratio of the article click count to the exposure threshold.
16. The article recall system of claim 11 wherein the ranking determination module comprises:
a classification quantity determining unit for determining the first number according to the diversity ratio parameter a, the target number C of the articles to be recommended, and the number N of the plurality of secondary classifications; and
a ranking determining unit for determining a first number of secondary classifications before ranking the importance according to the importance of each secondary classification.
17. The article recall system of claim 11 wherein the recall means comprises:
the score determining module is used for determining a matching score between each article and the user portrait under each of the first number of the secondary classifications before the importance ranking according to preset weights occupied by preset attributes corresponding to the plurality of the primary classifications and a matching degree set between each article and the user portrait under each of the first number of the secondary classifications before the importance ranking; and
And a recall module for recalling a second number of articles before ranking of the matching scores under each of the first number of secondary classifications before ranking of the importance according to the matching scores.
18. The article recall system of claim 17 wherein the score determination module comprises:
a matching score determining unit for determining a matching score $score_{ij}$ between a preset attribute j in article i and the preset attribute j in the user portrait, according to the preset weight $wt_j$ occupied by the preset attribute j corresponding to the primary classification j and a degree of match $m_{ij}$ between the preset attribute j in article i, under each of the first number of secondary classifications before the importance ranking, and the preset attribute j in the user portrait:
$score_{ij} = k + wt_j \cdot m_{ij}$; and
a score determining unit for calculating, according to the matching score $score_{ij}$, a matching score $Score_i$ between the article i under each of the first number of secondary classifications before the importance ranking and the user portrait,
wherein k is a constant; and T is the total number of preset attributes in the article i.
19. The article recall system of claim 18 wherein the score determination module further comprises: a matching degree determining unit, configured to determine a matching degree between a preset attribute j in the article i and a preset attribute j in the user portrait by:
determining that the matching degree is 1 when the preset attribute j in the article i matches the preset attribute j in the user portrait; or
determining that the matching degree is 0 when the preset attribute j in the article i does not match the preset attribute j in the user portrait.
20. An article recommendation system, comprising:
the article recall system of any one of claims 11-19, for recalling the top second number of articles by matching score under each of the top first number of secondary classifications by importance;
a screening device for screening, by using a weighted fusion method, a target number of articles from the recalled top second number of articles by matching score under each of the top first number of secondary classifications by importance; and
a recommending device for recommending the target number of articles to the user.
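The claim names a weighted fusion method without detailing it here, so the following is only one possible reading: each recalled article's matching score is multiplied by a weight for its secondary classification, and the target number of highest fused scores is kept. The function name `weighted_fusion`, the weights, and the data are assumptions.

```python
from heapq import nlargest


def weighted_fusion(recalled, category_weights, target_number):
    """Fuse per-classification candidates into one ranked list (assumed scheme).

    recalled: {secondary classification: {article id: matching score}}
    category_weights: {secondary classification: fusion weight}
    """
    fused = {}
    for category, articles in recalled.items():
        weight = category_weights.get(category, 0.0)
        for article_id, score in articles.items():
            candidate = weight * score
            # If an article appears under several classifications, keep its best fused score.
            if candidate > fused.get(article_id, float("-inf")):
                fused[article_id] = candidate
    return nlargest(target_number, fused, key=fused.get)


# Hypothetical recalled candidates and weights.
recalled = {
    "renovation": {"a3": 3.0, "a1": 2.0},
    "mortgage": {"b2": 5.0},
}
category_weights = {"renovation": 0.5, "mortgage": 0.25}
selected = weighted_fusion(recalled, category_weights, target_number=2)
```

Here the fused scores are 1.5 for `a3`, 1.25 for `b2`, and 1.0 for `a1`, so the first two are selected for recommendation.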
21. A machine-readable storage medium having instructions stored thereon for causing a machine to perform the article recall method of any one of the preceding claims 1-9 and the article recommendation method of claim 10.
22. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement the article recall method of any one of the above claims 1-9 and the article recommendation method of claim 10.
CN202110220837.7A 2021-02-26 2021-02-26 Article recall method and system and article recommendation method and system Active CN112948678B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110220837.7A CN112948678B (en) 2021-02-26 2021-02-26 Article recall method and system and article recommendation method and system

Publications (2)

Publication Number Publication Date
CN112948678A CN112948678A (en) 2021-06-11
CN112948678B true CN112948678B (en) 2023-07-21

Family

ID=76246669

Country Status (1)

Country Link
CN (1) CN112948678B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115033782B (en) * 2022-05-18 2023-03-28 百度在线网络技术(北京)有限公司 Object recommendation method, training method, device and equipment of machine learning model

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102346899A (en) * 2011-10-08 2012-02-08 亿赞普(北京)科技有限公司 Method and device for predicting advertisement click rate based on user behaviors
CN105868237A (en) * 2015-12-09 2016-08-17 乐视网信息技术(北京)股份有限公司 Multimedia data recommendation method and server
CN106339507A (en) * 2016-10-31 2017-01-18 腾讯科技(深圳)有限公司 Method and device for pushing streaming media message
CN110347781A (en) * 2019-07-18 2019-10-18 腾讯科技(深圳)有限公司 Article inverted indexing method, article recommendation method, device, equipment and storage medium
CN111310040A (en) * 2020-02-11 2020-06-19 腾讯科技(北京)有限公司 Artificial intelligence based recommendation method and device, electronic equipment and storage medium
CN111708888A (en) * 2020-06-16 2020-09-25 腾讯科技(深圳)有限公司 Artificial intelligence based classification method, device, terminal and storage medium
CN112231555A (en) * 2020-10-12 2021-01-15 中国平安人寿保险股份有限公司 Recall method, apparatus, device and storage medium based on user portrait label

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866496B (en) * 2014-02-22 2019-12-10 腾讯科技(深圳)有限公司 method and device for determining morpheme importance analysis model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
User sequential behavior classification for click through rate prediction;Jiangwei Zeng等;《International conference on database system for advanced application》;267-280 *
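Research and Implementation of an Academic Conference Recommendation System Based on a Hybrid Recommendation Strategy; Xu Aoxue; China Master's Theses Full-text Database, Basic Sciences; A001-10 *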

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant