CN112801760A - Sequencing optimization method and system of content personalized recommendation system - Google Patents
- Publication number
- CN112801760A (application CN202110338178.7A)
- Authority
- CN
- China
- Prior art keywords
- content
- user
- vector
- sorting
- recall
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/04—Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/067—Enterprise or organisation modelling
Abstract
The invention discloses a ranking optimization method and a ranking optimization system for a personalized content recommendation system. The method comprises the following steps: (I) acquiring a user click operation, performing recall, and generating a preliminarily screened list of contents to be ranked; (II) scoring the preliminarily screened list of contents to be ranked with a ranking model to generate an initial content-ranking score association vector; and (III) performing a secondary ranking of the initial content-ranking score association vector based on an adaptive strategy to obtain a final ranking result. The method addresses the accuracy of user-content stickiness, and adaptively samples and aggregates the content lists pushed by the various upstream recall strategies to generate a content push list that is more varied and more precisely personalized, thereby achieving both personalization and diversity in the recommended content. The invention improves the accuracy of the product's recommendation system.
Description
Technical Field
The invention relates to a ranking method and a ranking system for a recommendation system, and in particular to a ranking optimization method and a ranking optimization system for a personalized content recommendation system.
Background
At present, in internet content-community products, a personalized and accurate recommendation system is the technical core of the product. To improve the experience of existing users in the community, content that users are likely to respond to positively needs to be pushed to them. This goal is achieved through the cooperation of key stages in the recommendation system such as recall, coarse ranking, fine ranking, and re-ranking.
In the content ranking stage, a list of content that each user is genuinely interested in is pushed to that user even though the user has shown no explicit behavior. To achieve personalized and accurate pushing, three business requirements need to be met: first, the patterns in the user's historical click behavior need to be considered; second, pushing clickbait content or results from a single category needs to be avoided; third, the diversity of recommended content must be ensured, so that pushed content gives users a novel experience that is "familiar yet unfamiliar".
Most current ranking methods have the following problems: 1. only user behavior data (such as reading duration, reading completion rate, likes and appreciation, and the like) is considered; 2. a relatively coarse user-content association vector is obtained from a single deep model; 3. the upstream recall strategy is too uniform, so the ranking stage cannot adjust its strategy for content diversity.
Disclosure of Invention
Purpose of the invention: the invention provides a ranking optimization method for a recommendation system with highly accurate user-content stickiness. The invention also provides a ranking optimization system based on this ranking optimization method.
Technical solution: the ranking optimization method of the personalized content recommendation system comprises the following steps:
(I) acquiring a user click operation, performing recall, and generating a preliminarily screened list of contents to be ranked;
(II) scoring the preliminarily screened list of contents to be ranked with a ranking model to generate an initial content-ranking score association vector;
(III) performing a secondary ranking of the initial content-ranking score association vector based on an adaptive strategy to obtain a final ranking result.
Further, in step (I), the preliminarily screened list of contents to be ranked is a list of content ids related to the user's historical click data.
Further, in step (II), the ranking model includes a two-tower model.
Preferably, step (II) includes:
(21) extracting user feature information and content feature information from the preliminarily screened list of contents to be ranked;
(22) evaluating different ranking models on the metadata, either directly or after the user feature information and the content feature information have been combined, and selecting the highest-scoring ranking model as the actual ranking model;
(23) in the offline training stage of the recommendation system, inputting the user feature information and the content feature information separately into the actual ranking model to obtain a user embedding vector and a content embedding vector of the same dimension;
(24) computing the dot product of the user embedding vector and the content embedding vector, computing the cross-entropy loss between the dot product value and the sample label value of the user's click, and backpropagating to optimize the network parameters of the actual ranking model;
(25) inputting the user feature information and the content feature information to be ranked into the optimized actual ranking model, and taking the dot product of the model's output vectors as the ranking score to obtain the initial content-ranking score association vector.
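Steps (23) to (25) can be illustrated with a minimal NumPy sketch. It is not the patented implementation: each tower is reduced to a single linear layer instead of a deep network, and the names `train_two_tower` and `rank_scores`, the embedding size, the learning rate, and the epoch count are assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_loss(logits, labels):
    # Numerically stable binary cross-entropy computed on raw logits.
    return float(np.mean(np.maximum(logits, 0) - logits * labels
                         + np.log1p(np.exp(-np.abs(logits)))))

def train_two_tower(x_user, x_content, labels, dim=8, lr=0.1, epochs=200, seed=0):
    """Train two linear 'towers' so that dot(user_emb, content_emb) predicts clicks.

    Assumption: one linear layer per tower stands in for the deep networks.
    """
    rng = np.random.default_rng(seed)
    w_user = rng.normal(scale=0.1, size=(x_user.shape[1], dim))
    w_content = rng.normal(scale=0.1, size=(x_content.shape[1], dim))
    losses = []
    for _ in range(epochs):
        u = x_user @ w_user            # user embeddings, same dimension as v
        v = x_content @ w_content      # content embeddings
        logits = np.sum(u * v, axis=1) # per-sample dot product
        losses.append(bce_loss(logits, labels))
        # Gradient of the mean cross-entropy with respect to the logits.
        grad_logits = (sigmoid(logits) - labels) / len(labels)
        grad_u = grad_logits[:, None] * v
        grad_v = grad_logits[:, None] * u
        # Backpropagate into each tower's weights.
        w_user -= lr * (x_user.T @ grad_u)
        w_content -= lr * (x_content.T @ grad_v)
    return w_user, w_content, losses

def rank_scores(x_user, x_content, w_user, w_content):
    # Step (25): the dot product of the two output vectors is the ranking score.
    return np.sum((x_user @ w_user) * (x_content @ w_content), axis=1)
```

Because the two towers only interact through the final dot product, content embeddings can be computed ahead of time and reused for every user, which is the property the later online-prediction passage relies on.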
Further, the user feature information includes: the feature vector of the user click sequence, the feature vector of the user-profile indicators, and the feature vector of the user like sequence.
Further, the content embedding vector is computed by repeatedly calling the content-side deep network of the actual ranking model, which outputs the embedding layer; the actual ranking model is updated and saved so that new content sequences can be queried during online prediction.
Further, during online prediction, the user embedding vector is computed by calling the user-side deep network of the actual ranking model.
Further, step (III) includes:
(31) acquiring the initial content-ranking score association vectors, tallying the source of every vector, and classifying each vector into the corresponding recall group;
(32) calculating the adaptive sampling weight according to the following formula:
[formula image not reproduced in the source]
where the formula gives the sampling coefficient of recall group i, n is the number of recall groups, and the input is the click-through rate of the i-th recall group;
(33) generating the Top-K recommended content vector list according to the following formula, which gives the actual number of recommended content vectors for the i-th recall group:
[formula image not reproduced in the source]
where one quantity is the number of recalls configured for the i-th recall group and the other is the number of recalls of the i-th recall group after weighting, subject to the following condition:
[formula image not reproduced in the source]
where m is the total number of recalled content ids;
(34) executing a sample-balancing strategy that evens out the content samples according to the deviation between each recall group's computed recall count and its actual recall count;
(35) applying the product's business logic and performing a secondary ranking of the Top-K recommended content vector list to obtain the final ranking result.
Further, the sample-balancing strategy is: when the number of items in a recall group is insufficient, the recommendation system recalculates the sampling coefficients based on the number of missing recalls and, according to the recalculated coefficients, preferentially extracts a corresponding amount of content from the recall groups with larger values to make up for the content missing from the other recall groups.
The ranking optimization system of the personalized content recommendation system comprises the following components:
a coarse screening module, used for acquiring the user click operation, performing recall, and generating the preliminarily screened list of contents to be ranked;
a first ranking module, used for scoring the preliminarily screened list of contents to be ranked with the ranking model to generate an initial content-ranking score association vector;
and a second ranking module, used for performing a secondary ranking of the initial content-ranking score association vector output by the first ranking module based on an adaptive strategy to obtain a final ranking result.
Beneficial effects: the invention has the following advantages:
1. the stickiness between each item in the product's content list and the user is calculated for the initial ranking, so that real-time personalized recommendation is achieved accurately for each user;
2. adaptive sampling is performed based on the diversity of the recall groups, so that the items recommended to a single user come from different categories, which makes the recommendation system more intelligent;
3. a like sequence is added to the user behavior data to make the user behavior feature vectors more robust;
4. the introduction of user-profile information increases the richness and representativeness of the features.
Drawings
FIG. 1 is a flow chart of a ranking optimization method of the content personalized recommendation system of the present invention;
FIG. 2 is a first ranking module framework diagram of the ranking optimization system of the content personalized recommendation system of the present invention;
FIG. 3 is a framework diagram of the second ranking module of the ranking optimization system of the content personalized recommendation system of the present invention.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Referring to FIG. 1, which shows a flowchart of the ranking optimization method of the personalized content recommendation system according to the invention, the method includes:
(I) acquiring a user click operation, performing upstream recall, and generating a preliminarily screened list of contents to be ranked, specifically:
each time a single click operation by a user is captured, acquiring the m pieces of content information data provided by the recall system, and combining the user id with the content id of each acquired piece of content information data into a pair recorded as (user id, content id).
(II) Scoring the preliminarily screened list of contents to be ranked with a ranking model to generate an initial content-ranking score association vector.
User feature information and content feature information are extracted from the preliminarily screened list of contents to be ranked, wherein:
the user feature information comprises three types of data: 1. the preprocessed feature vectors of the user click sequence; 2. the basic attributes of the user, comprising multi-dimensional user-profile indicators such as gender, age, and consumption level; 3. the preprocessed feature vectors of the user like sequence, processed in the same way as the type 1 data. The preprocessing is an average or a weighted average.
The content feature information comprises information such as the content's first-level category, second-level category, and author.
To evaluate whether pushed content is accurate, the feature engineering of other current recommendation systems generally measures a user's preference for content comprehensively through data such as the user's clicks, likes, comments, forwards, saves, and browsing or playback time on recommended content. Alternatively, the user's satisfaction with the app is measured through the number of times the user opens the app, the interval before the user returns to the app, the duration of a single visit, and the like, which also reflects to some extent the user's satisfaction with the recommended content.
Therefore, building on the above industry experience, the user profile is also taken into account. The user features comprise the click sequence of the user's 50 most recent contents, the like sequence of the 50 most recent contents, and fixed attributes of the user such as gender, age, and consumption level. After the content feature vectors of the user's 50 most recent clicks are obtained, the resulting 50 x 32 matrix is reduced in dimension: it is averaged and compressed into a 32-dimensional first group of user vectors. The 50 most recent likes are reduced in the same way to generate a 32-dimensional second group of user vectors. The basic user features comprise multi-dimensional user-profile indicators such as gender, age, and consumption level; these discrete features are one-hot encoded to generate a third group of user vectors. The three groups of user vectors are concatenated and used as the input to the user-side deep neural network. Similarly, the content feature information comprises the content's first-level and second-level category information; after being combined into vectors, it is averaged in the same way as the user vectors to produce a one-dimensional concatenated vector that is fed to the content-side deep network.
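The construction of the user-side input can be sketched as follows. This is a minimal illustration under stated assumptions: the category vocabularies (`GENDERS`, `AGE_BUCKETS`, `SPEND_LEVELS`), the bucketing of age, and the function names are hypothetical and do not come from the patent; only the 50 x 32 averaging, the one-hot encoding, and the concatenation follow the text.

```python
import numpy as np

# Hypothetical vocabularies for the discrete user-profile indicators.
GENDERS = ["female", "male", "unknown"]
AGE_BUCKETS = ["<18", "18-30", "31-45", ">45"]
SPEND_LEVELS = ["low", "mid", "high"]

def mean_pool(sequence_vectors):
    # Average a (sequence_length, 32) matrix into a single 32-dim vector.
    return np.asarray(sequence_vectors, dtype=float).mean(axis=0)

def one_hot(value, vocabulary):
    # Encode one discrete attribute as a 0/1 vector over its vocabulary.
    vec = np.zeros(len(vocabulary))
    vec[vocabulary.index(value)] = 1.0
    return vec

def build_user_input(click_vectors, like_vectors, gender, age_bucket, spend_level):
    """Concatenate the three groups of user vectors into one input vector."""
    first_group = mean_pool(click_vectors)    # 50 x 32 clicks -> 32 dims
    second_group = mean_pool(like_vectors)    # 50 x 32 likes  -> 32 dims
    third_group = np.concatenate([
        one_hot(gender, GENDERS),
        one_hot(age_bucket, AGE_BUCKETS),
        one_hot(spend_level, SPEND_LEVELS),
    ])
    return np.concatenate([first_group, second_group, third_group])
```

With the vocabularies assumed here the input has 32 + 32 + 10 = 74 dimensions; a weighted average could replace `mean_pool` where recent behavior should count more.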
As shown in FIG. 2, the recommendation system's model is trained in the offline stage, with the combined user features and the combined content features fed separately into the selected deep neural network model. The method tested classical ranking models such as the two-tower model, Google's Wide & Deep model, and Alibaba's DIN and DIEN models. Based on metrics such as accuracy and F1 on the validation set, the model with the highest accuracy and F1, namely the two-tower model, was selected as the baseline ranking model. The combined user and content feature vectors are then passed through the model to obtain two outputs, a user embedding vector and a content embedding vector, which serve as low-dimensional semantic representations of the user and the content respectively.
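The model comparison described above can be sketched in plain Python. The helper names, the tie-breaking rule (F1 first, then accuracy), and the toy predictions are assumptions for illustration; the patent only states that accuracy and F1 on the validation set were used.

```python
def accuracy_and_f1(y_true, y_pred):
    # Confusion-matrix counts for binary click labels (1 = clicked).
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1

def select_model(y_true, predictions_by_model):
    """Return the best model name plus every model's (accuracy, F1) pair."""
    scores = {name: accuracy_and_f1(y_true, pred)
              for name, pred in predictions_by_model.items()}
    # Assumed rule: prefer higher F1, break ties on accuracy.
    best = max(scores, key=lambda name: (scores[name][1], scores[name][0]))
    return best, scores
```

For example, `select_model(labels, {"two_tower": preds_a, "wide_deep": preds_b})` returns the name of the candidate that scores best on the held-out labels.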
The dot product of the two embedding vectors is compared against the sample label value using a cross-entropy loss, and backpropagation optimizes the network parameters. In addition, the content embedding vectors are computed by calling the content-side deep tower network of the model, and the model is saved to the online environment so that new content feature information can be queried during online prediction.
Meanwhile, in the online prediction stage, the combined feature vector built from a new user's feature information and the content feature information is also passed through the user-side deep network of the model. Once the user embedding vector has been generated, its dot product is computed with the content embedding vector of each content item stored in the model, and the resulting logit is taken as the score of the content-ranking score association vector. The output format of this process is (user id, content id, ranking score).
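The online scoring step can be sketched as below, assuming the content embeddings have already been computed offline and are held in a dictionary keyed by content id. The function name `score_candidates` and the dictionary layout are hypothetical.

```python
import numpy as np

def score_candidates(user_id, user_embedding, content_embeddings):
    """Score each stored content embedding against one user embedding.

    content_embeddings: dict mapping content id -> embedding vector
    (assumed to be precomputed by the content-side tower).
    Returns (user id, content id, ranking score) triples, best first.
    """
    triples = [
        (user_id, content_id, float(np.dot(user_embedding, vector)))
        for content_id, vector in content_embeddings.items()
    ]
    triples.sort(key=lambda triple: triple[2], reverse=True)
    return triples
```

Only the user-side network has to run per request in this arrangement; the content side is a lookup.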
(III) As shown in FIG. 3, this module obtains the set of (user id, content id, ranking score) triples predicted online by the two-tower ranking model. The content source of every triple is tallied, that is, it is determined which recall strategy pushed the content; each triple is then classified according to the result and tagged with the identifier of the corresponding recall strategy group.
Here, there are five recall groups in total, as shown in Table 1 below.
TABLE 1
Recall group name | Grouping principle |
i2i | Recall based on similarity between contents |
u2i | Recall based on user-content preference |
up | Recall based on the user profile |
hot | Recall based on hot content |
u2u | Recall based on similarity between users |
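Classifying the scored triples into these five groups can be sketched as follows. The `source_of` mapping from content id to recall group is an assumption; the text does not say how the upstream recall system reports the source.

```python
# Group names taken from Table 1.
RECALL_GROUPS = ("i2i", "u2i", "up", "hot", "u2u")

def group_by_recall(triples, source_of):
    """Bucket (user id, content id, score) triples by recall group.

    source_of: hypothetical dict mapping content id -> recall group name.
    Each bucket is sorted by ranking score, highest first.
    """
    groups = {name: [] for name in RECALL_GROUPS}
    for user_id, content_id, score in triples:
        groups[source_of[content_id]].append((user_id, content_id, score))
    for items in groups.values():
        items.sort(key=lambda triple: triple[2], reverse=True)
    return groups
```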
(IV) Based on each group's click-through rate over the most recent 30 days, the adaptive sampling weight is calculated to obtain the sampling coefficient for each recall group:
[formula image not reproduced in the source]
where the formula gives the sampling coefficient of recall group i, n is the number of recall groups, and the input is the click-through rate of the i-th recall group.
The sampling weight calculated for each recall group and the total number of content items pushed by that recall group are substituted into the following formula to generate the Top-K recommended content vector list:
[formula image not reproduced in the source]
where one quantity is the number of recalls configured for the i-th recall group and the other is the number of recalls of the i-th recall group after weighting (per Table 1, there are 5 recall groups), subject to the following condition:
[formula image not reproduced in the source]
where m is the total number of recalled content ids.
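The formulas themselves are not available in this text, so the exact expressions cannot be recovered. As a sketch only, the following assumes that each sampling coefficient is the group's click-through rate divided by the sum over all groups, and that the Top-K slots are split in proportion to those coefficients with largest-remainder rounding. The function names, the rounding rule, and the example rates are all assumptions, not the patent's formulas.

```python
def sampling_weights(ctr_by_group):
    """Assumed form: weight_i = ctr_i / sum of ctr over all groups."""
    total = sum(ctr_by_group.values())
    if total <= 0:
        # No click signal at all: fall back to equal weights.
        return {group: 1.0 / len(ctr_by_group) for group in ctr_by_group}
    return {group: ctr / total for group, ctr in ctr_by_group.items()}

def allocate_quotas(weights, top_k):
    """Split top_k slots across groups in proportion to their weights."""
    raw = {group: weight * top_k for group, weight in weights.items()}
    quotas = {group: int(value) for group, value in raw.items()}
    leftover = top_k - sum(quotas.values())
    # Hand the remaining slots to the largest fractional parts.
    by_remainder = sorted(raw, key=lambda g: raw[g] - quotas[g], reverse=True)
    for group in by_remainder[:leftover]:
        quotas[group] += 1
    return quotas

# Illustrative 30-day click-through rates (made-up values).
example_ctr = {"i2i": 0.04, "u2i": 0.03, "up": 0.01, "hot": 0.01, "u2u": 0.01}
example_quotas = allocate_quotas(sampling_weights(example_ctr), top_k=10)
```

Under these assumptions the quotas always sum to `top_k`, so groups with higher recent click-through rates contribute more items without any group being forced to zero by rounding alone.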
Although in an ideal situation the Top-K content sequence pushed downstream can be extracted accurately with the above strategy, in practice the content list pushed by the upstream recall system is likely to be unevenly distributed across recall groups; for example, a certain recall group may not contain enough items. A recall balance evaluation is therefore needed, together with a corresponding processing strategy. When the number of items in a recall group is insufficient, the recommendation system calculates sampling coefficients from the number of missing recalls and, according to those coefficients, preferentially extracts a corresponding amount of content from the recall groups with larger values to make up for the content missing from the other recall groups. Finally, the product's business logic is applied, and the Top-K recommended content list is sent to the downstream stage after the secondary ranking.
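One possible reading of this balancing strategy is sketched below. It assumes the shortfall is simply the sum of unfilled slots and that spare items are drawn from groups in descending order of their sampling weight; the recalculated coefficient in the original formula is not available here, so the function name `rebalance` and this ordering rule are assumptions.

```python
def rebalance(quotas, available, weights):
    """Fill slots that under-supplied groups cannot provide.

    quotas:    planned number of items per recall group
    available: number of items each group actually recalled
    weights:   sampling weight per group (larger = preferred donor)
    """
    # Each group contributes at most what it actually has.
    final = {group: min(quotas[group], available[group]) for group in quotas}
    deficit = sum(quotas.values()) - sum(final.values())
    # Draw the missing items from the highest-weight groups with spare content.
    for group in sorted(weights, key=weights.get, reverse=True):
        if deficit == 0:
            break
        spare = available[group] - final[group]
        take = min(spare, deficit)
        final[group] += take
        deficit -= take
    return final
```

If every group is exhausted before the deficit reaches zero, the returned counts sum to less than the planned total, which a caller would need to handle according to the product's business logic.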
The ranking optimization system of the personalized content recommendation system comprises the following components:
a coarse screening module, used for acquiring the user click operation, performing recall, and generating the preliminarily screened list of contents to be ranked;
a first ranking module, used for scoring the preliminarily screened list of contents to be ranked with the ranking model to generate an initial content-ranking score association vector;
and a second ranking module, used for performing a secondary ranking of the initial content-ranking score association vector output by the first ranking module based on an adaptive strategy to obtain a final ranking result.
The preliminarily screened list of contents to be ranked is a list of content ids related to the user's historical click data.
The first ranking module further comprises:
a preprocessing subunit, used for extracting user feature information and content feature information from the preliminarily screened list of contents to be ranked; evaluating different ranking models on the metadata, either directly or after the user feature information and the content feature information have been combined; and selecting the highest-scoring ranking model as the actual ranking model;
a calculation subunit, used for inputting the user feature information and the content feature information separately into the actual ranking model in the offline training stage of the recommendation system to obtain a user embedding vector and a content embedding vector of the same dimension; computing the dot product of the user embedding vector and the content embedding vector, computing the cross-entropy loss between the dot product value and the sample label value of the user's click, and backpropagating to optimize the network parameters of the actual ranking model; and inputting the user feature information and the content feature information to be ranked into the optimized actual ranking model and taking the dot product of the model's output vectors as the ranking score to obtain the initial content-ranking score association vector.
Further, the user feature information includes: the feature vector of the user click sequence, the user-profile indicators, and the feature vector of the user like sequence.
Further, the content embedding vector is computed by repeatedly calling the content-side deep network of the actual ranking model, which outputs the embedding layer; the actual ranking model is updated and saved so that new content sequences can be queried during online prediction.
Further, during online prediction, the user embedding vector is computed by calling the user-side deep network of the actual ranking model.
The second ranking module is specifically configured to acquire the initial content-ranking score association vectors, tally the source of every vector, and classify each vector into the corresponding recall group; the adaptive sampling weight is calculated according to the following formula:
[formula image not reproduced in the source]
where the formula gives the sampling coefficient of recall group i, n is the number of recall groups, and the input is the click-through rate of the i-th recall group;
the Top-K recommended content vector list is generated according to the following formula, which gives the actual number of recommended content vectors for the i-th recall group:
[formula image not reproduced in the source]
where one quantity is the number of recalls configured for the i-th recall group and the other is the number of recalls of the i-th recall group after weighting, subject to the following condition:
[formula image not reproduced in the source]
where m is the total number of recalled content ids;
a sample-balancing strategy is executed that evens out the content samples according to the deviation between each recall group's computed recall count and its actual recall count; and the product's business logic is applied and a secondary ranking of the Top-K recommended content vector list is performed to obtain the final ranking result.
Further, the sample-balancing strategy is: when the number of items in a recall group is insufficient, the recommendation system recalculates the sampling coefficients based on the number of missing recalls and, according to the recalculated coefficients, preferentially extracts a corresponding amount of content from the recall groups with larger values to make up for the content missing from the other recall groups.
Claims (10)
1. A ranking optimization method of a content personalized recommendation system, characterized by comprising the following steps:
(I) acquiring a user click operation, performing recall, and generating a preliminarily screened list of contents to be ranked;
(II) scoring the preliminarily screened list of contents to be ranked with a ranking model to generate an initial content-ranking score association vector;
(III) performing a secondary ranking of the initial content-ranking score association vector based on an adaptive strategy to obtain a final ranking result.
2. The ranking optimization method of the content personalized recommendation system according to claim 1, wherein in step (I), the preliminarily screened list of contents to be ranked is a list of content ids related to the user's historical click data.
3. The ranking optimization method of the content personalized recommendation system according to claim 1, wherein in step (II), the ranking model comprises a two-tower model.
4. The ranking optimization method of the content personalized recommendation system according to claim 1, wherein the step (two) includes:
(21) extracting user characteristic information and content characteristic information according to the list of the contents to be sorted which are preliminarily screened;
(22) evaluating, for each of several different sorting models, either the metadata or the metadata obtained by combining the user characteristic information with the content characteristic information, and selecting the sorting model with the highest score as the actual sorting model;
(23) in an offline training stage of the recommendation system, inputting the user characteristic information and the content characteristic information separately into the actual sorting model to obtain a user embedded vector and a content embedded vector of the same dimension;
(24) performing a dot product calculation on the user embedded vector and the content embedded vector, computing a cross-entropy loss between the dot product value and the sample label value of the user's click, and performing backpropagation to optimize the network parameters of the actual sorting model;
(25) inputting the user characteristic information and the content characteristic information to be sorted into the optimized actual sorting model, and taking the dot product result of the model output vector as a sorting score to obtain an initial content-sorting score association vector.
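Steps (23) to (25) can be illustrated with a minimal sketch. It is not the claimed implementation: each tower is reduced to a single linear layer as a stand-in for a deep network, the update is plain stochastic gradient descent, and every name (`tower`, `train_step`, `rank`, the learning rate) is a hypothetical choice made for illustration.

```python
import math

def tower(weights, features):
    """One linear layer standing in for a deep tower (assumption)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]

def dot(u, v):
    """Dot product of a user embedding and a content embedding."""
    return sum(a * b for a, b in zip(u, v))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_loss(score, label):
    """Binary cross-entropy between sigmoid(score) and the click label."""
    p = min(max(sigmoid(score), 1e-12), 1.0 - 1e-12)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def train_step(user_w, item_w, user_x, item_x, label, lr=0.1):
    """One SGD step on both towers; returns the loss before the update."""
    u = tower(user_w, user_x)
    v = tower(item_w, item_x)
    score = dot(u, v)
    grad = sigmoid(score) - label  # derivative of the loss w.r.t. the score
    for k in range(len(user_w)):
        for j in range(len(user_x)):
            user_w[k][j] -= lr * grad * v[k] * user_x[j]
        for j in range(len(item_x)):
            item_w[k][j] -= lr * grad * u[k] * item_x[j]
    return bce_loss(score, label)

def rank(user_w, item_w, user_x, candidates):
    """Score each candidate by dot product and sort in descending order."""
    u = tower(user_w, user_x)
    scored = [(cid, dot(u, tower(item_w, x))) for cid, x in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The list returned by `rank` pairs each content id with its score, which plays the role of the initial content-sorting score association vector in this sketch.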
5. The ranking optimization method of the content personalized recommendation system according to claim 4, wherein in step (21), the user characteristic information includes a feature vector of the user click sequence, a feature vector of the user profile indicators, and a feature vector of the user favorites sequence.
6. The ranking optimization method of the content personalized recommendation system according to claim 4, wherein the content embedding vector is calculated by continuously calling the deep network on the content side of the actual ranking model, and the actual ranking model is updated and saved for use in querying new content sequences during online prediction.
7. The ranking optimization method of the content personalized recommendation system according to claim 4, wherein the user embedding vector is calculated by calling the deep network on the user side of the actual ranking model during online prediction.
8. The ranking optimization method of the content personalized recommendation system according to claim 1, wherein the step (three) includes:
(31) acquiring the initial content-sorting score association vector, identifying the source of every vector, and classifying each vector into its corresponding recall group;
(32) calculating the adaptive sampling weight according to the following formula:
wherein the terms denote, respectively, the sampling coefficient of recall group i, the number n of recall groups, and the click rate of the i-th recall group;
(33) generating a Top-K recommended content vector list according to the following formula, wherein the actual number of recommended content vectors of the i-th recall group is as follows:
wherein the terms denote, respectively, the number of recalls configured for the i-th recall group and the number of recalls of the i-th recall group after the weighted calculation, and the following condition is met:
wherein m is the total number of recalled content ids;
(34) executing a sample balancing processing strategy according to the number of recalls of each recall group, so as to even out the content samples according to the deviation from the actual number recalled;
(35) performing secondary sorting on the Top-K recommended content vector list in combination with business logic to obtain the final sorting result.
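The formulas for steps (32) and (33) are not reproduced in this text, so the following sketch rests on stated assumptions rather than on the claimed formulas: each group's sampling coefficient is taken to be its click rate normalized over all n groups, and each group's actual count is taken to be its weighted share of K capped by its configured recall number. The function names and the tuple layout `(content_id, score, group)` are hypothetical.

```python
def sampling_weights(click_rates):
    """Assumed form: coefficient_i = click_rate_i / sum of all click rates."""
    total = sum(click_rates.values())
    if total == 0:
        # No click signal at all: fall back to uniform coefficients.
        return {g: 1.0 / len(click_rates) for g in click_rates}
    return {g: ctr / total for g, ctr in click_rates.items()}

def group_quotas(weights, configured, k):
    """Assumed form: weighted share of K, capped by the configured recall number."""
    return {g: min(configured[g], round(weights[g] * k)) for g in weights}

def top_k_by_group(scored, quotas):
    """Keep the best-scoring items of each group up to its quota, then merge.

    `scored` is a list of (content_id, score, group) tuples.
    """
    picked = []
    for group, quota in quotas.items():
        members = [item for item in scored if item[2] == group]
        members.sort(key=lambda item: item[1], reverse=True)
        picked.extend(members[:quota])
    return sorted(picked, key=lambda item: item[1], reverse=True)
```

Under these assumptions, a group with a higher click rate receives a larger share of the Top-K list, while the configured recall number keeps any single group from dominating.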
9. The ranking optimization method of the content personalized recommendation system according to claim 8, wherein the sample balancing processing strategy is: when the number of items in a certain recall group is insufficient, the recommendation system recalculates the sampling coefficients based on the number of missing recalls and, according to those coefficients, preferentially extracts a corresponding amount of content from the recall groups with larger values to supplement the content missing from the other recall groups.
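The balancing strategy can be sketched as a redistribution of quota. The exact recalculation of the sampling coefficients is not reproduced in this text, so the sketch below substitutes a simpler rule, labeled as an assumption: the shortfall is handed to the groups with the largest coefficients first, limited by how many spare items each group still holds. `balance_quotas` and its parameters are hypothetical names.

```python
def balance_quotas(quotas, available, weights):
    """Move unmet quota from short groups to groups with spare content.

    quotas:    target number of items per recall group
    available: number of items each group actually recalled
    weights:   sampling coefficient per group (larger is served first)
    """
    # Each group can contribute at most what it actually recalled.
    final = {g: min(q, available[g]) for g, q in quotas.items()}
    shortfall = sum(quotas.values()) - sum(final.values())
    # Assumption: fill the gap from the highest-coefficient groups first.
    for group in sorted(weights, key=weights.get, reverse=True):
        if shortfall <= 0:
            break
        spare = available[group] - final[group]
        extra = min(spare, shortfall)
        final[group] += extra
        shortfall -= extra
    return final
```

If no group has spare content, the quotas simply stay at what was available, and the final list is shorter than requested.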
10. A ranking optimization system for a content personalized recommendation system, the system comprising:
a coarse screening module, configured to acquire a user click operation and perform recall to generate a preliminarily screened list of contents to be sorted;
a first sorting module, configured to score the preliminarily screened list of contents to be sorted according to a sorting model to generate an initial content-sorting score association vector;
and a second sorting module, configured to perform secondary sorting, based on an adaptive strategy, on the initial content-sorting score association vector output by the first sorting module to obtain a final sorting result.
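The three modules of the system claim can be pictured as a simple chain. The sketch below is only one possible arrangement: each module is injected as a callable, and the class name `RankingPipeline`, the method `recommend`, and the type alias `Scored` are hypothetical names that do not come from the patent.

```python
from typing import Callable, List, Tuple

# (content_id, score) pairs standing in for the content-sorting score association vector.
Scored = List[Tuple[str, float]]

class RankingPipeline:
    """Chains coarse screening, first sorting, and second sorting."""

    def __init__(
        self,
        coarse_screen: Callable[[str], List[str]],
        first_sort: Callable[[str, List[str]], Scored],
        second_sort: Callable[[Scored], List[str]],
    ) -> None:
        self.coarse_screen = coarse_screen
        self.first_sort = first_sort
        self.second_sort = second_sort

    def recommend(self, user_id: str) -> List[str]:
        # Recall a candidate list of content ids for the user.
        candidates = self.coarse_screen(user_id)
        # Score every candidate with the sorting model.
        scored = self.first_sort(user_id, candidates)
        # Re-sort the scored candidates with the adaptive strategy.
        return self.second_sort(scored)
```

Keeping the modules as separate callables lets the second sorting stage be replaced or tuned without touching recall or scoring.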
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110338178.7A CN112801760A (en) | 2021-03-30 | 2021-03-30 | Sequencing optimization method and system of content personalized recommendation system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112801760A (en) | 2021-05-14 |
Family
ID=75815855
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110338178.7A Withdrawn CN112801760A (en) | 2021-03-30 | 2021-03-30 | Sequencing optimization method and system of content personalized recommendation system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112801760A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020044098A2 (en) * | 2018-08-30 | 2020-03-05 | 优视科技新加坡有限公司 | Method and apparatus for sorting in information stream, and device/terminal/server |
CN110825974A (en) * | 2019-11-22 | 2020-02-21 | 厦门美柚股份有限公司 | Recommendation system content ordering method and device |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113722537A (en) * | 2021-08-11 | 2021-11-30 | 北京奇艺世纪科技有限公司 | Short video sequencing and model training method and device, electronic equipment and storage medium |
CN113722537B (en) * | 2021-08-11 | 2024-04-26 | 北京奇艺世纪科技有限公司 | Short video ordering and model training method and device, electronic equipment and storage medium |
CN113889209A (en) * | 2021-09-26 | 2022-01-04 | 浙江禾连网络科技有限公司 | Recommendation system and storage medium for health management service products |
CN114139046A (en) * | 2021-10-29 | 2022-03-04 | 北京达佳互联信息技术有限公司 | Object recommendation method and device, electronic equipment and storage medium |
CN113868466A (en) * | 2021-12-06 | 2021-12-31 | 北京搜狐新媒体信息技术有限公司 | Video recommendation method, device, equipment and storage medium |
CN114547417A (en) * | 2022-02-25 | 2022-05-27 | 北京百度网讯科技有限公司 | Media resource ordering method and electronic equipment |
CN114997532A (en) * | 2022-07-29 | 2022-09-02 | 江苏新视云科技股份有限公司 | Civil telephone delivery scheduling method under uncertain environment, terminal and storage medium |
CN114997532B (en) * | 2022-07-29 | 2023-02-03 | 江苏新视云科技股份有限公司 | Civil telephone delivery scheduling method under uncertain environment, terminal and storage medium |
CN117290608A (en) * | 2023-11-23 | 2023-12-26 | 深圳数拓科技有限公司 | Marketing scheme intelligent pushing method, system and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112801760A (en) | Sequencing optimization method and system of content personalized recommendation system | |
CN110162693B (en) | Information recommendation method and server | |
CN104281622B (en) | Information recommendation method and device in a kind of social media | |
TWI591556B (en) | Search engine results sorting method and system | |
CN111523055B (en) | Collaborative recommendation method and system based on agricultural product characteristic attribute comment tendency | |
CN110619540A (en) | Click stream estimation method of neural network | |
CN117522479B (en) | Accurate Internet advertisement delivery method and system | |
CN112749330A (en) | Information pushing method and device, computer equipment and storage medium | |
CN117874347A (en) | Content recommendation technology based on business characteristics | |
CN113326432A (en) | Model optimization method based on decision tree and recommendation method | |
CN112148994A (en) | Information push effect evaluation method and device, electronic equipment and storage medium | |
CN115829683A (en) | Power integration commodity recommendation method and system based on inverse reward learning optimization | |
CN115438787A (en) | Training method and device of behavior prediction system | |
CN117541322B (en) | Advertisement content intelligent generation method and system based on big data analysis | |
CN104572915A (en) | User event relevance calculation method based on content environment enhancement | |
CN112651790B (en) | OCPX self-adaptive learning method and system based on user touch in quick-elimination industry | |
CN111339428B (en) | Interactive personalized search method based on limited Boltzmann machine drive | |
CN117726412A (en) | AI recommendation system and method based on big data | |
CN112541010A (en) | User gender prediction method based on logistic regression | |
CN116362810A (en) | Advertisement putting effect evaluation method | |
CN114399352B (en) | Information recommendation method and device, electronic equipment and storage medium | |
CN114358813B (en) | Improved advertisement putting method and system based on field matrix factorization machine | |
CN112215629A (en) | Multi-target advertisement generation system and method based on construction countermeasure sample | |
CN114912031A (en) | Mixed recommendation method and system based on clustering and collaborative filtering | |
CN110147497B (en) | Individual content recommendation method for teenager group |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WW01 | Invention patent application withdrawn after publication | Application publication date: 20210514 |