WO2022126873A1 - Intelligent financial information recommendation system - Google Patents

Intelligent financial information recommendation system

Info

Publication number
WO2022126873A1
Authority
WO
WIPO (PCT)
Prior art keywords
news
pool
user
probability
feature vector
Prior art date
Application number
PCT/CN2021/080155
Other languages
French (fr)
Chinese (zh)
Inventor
尹扬
郭鹏华
朱峰
Original Assignee
上海朝阳永续信息技术股份有限公司
Application filed by 上海朝阳永续信息技术股份有限公司
Publication of WO2022126873A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Definitions

  • The invention relates to the technical field of information recommendation, and in particular to an intelligent recommendation system for financial information.
  • Existing recommendation algorithms mainly include content-relevance-based, collaborative-filtering-based, popularity-based, and model-based recommendation algorithms.
  • Content-relevance-based recommendation: by analyzing user behavior and the text content of news, users and news are each tagged with keywords (topic words) that represent their characteristics. These keywords are then combined, using tf-idf or another term-weighting algorithm, into feature vectors for users and for news, which represent the characteristics of each user and each news item. The similarity between a user vector and a news vector is then computed with a measure such as cosine similarity, and the news most similar to the user is recommended.
  • The biggest drawback of content-relevance-based recommendation is that it tends to produce homogeneous recommendations: it keeps recommending news of the same type and falls into a loop, losing the diversity and novelty of the recommended content.
  • Collaborative-filtering-based recommendation: analyze each user's evaluation of items (through browsing records, purchase records, etc.); compute the similarity between all users from these evaluations; select the N users most similar to the current user; and recommend to the current user the items that these N users rated highest and that the current user has not yet browsed.
  • Collaborative filtering also has several shortcomings: 1. the cold-start problem, i.e. recommendations cannot be made for new users and new items; 2. its accuracy depends on a large amount of accurate user data, and with little user data the recommendations become inaccurate or even impossible; 3. in systems where items have short life cycles (such as news and advertisements), items are updated so quickly that many of them receive no user ratings, which makes the rating matrix sparse and hinders the recommendation of such content.
  • Popularity-based recommendation: rank items by a popularity measure derived from data such as clicks, page views, unique visitors, and sharing rates, and recommend them to users.
  • The advantage of this algorithm is that it is simple and suits newly registered users.
  • The disadvantage is that it cannot provide personalized recommendations.
  • Model-based recommendation: build a model with a method such as machine learning, train and fit it on a large amount of existing user behavior data, purchase records, and user feature attributes, then feed the feature attributes of the user to be served into the trained model, which outputs the final recommendation result.
  • The disadvantage of this algorithm is that it requires a large amount of historical user behavior data and repeated manual intervention to combine and screen attributes (i.e. feature engineering).
  • The purpose of the present invention is to provide an intelligent financial information recommendation system that solves the problems of the prior art, in which the recommendation function is limited and cannot deliver information that is at once diverse, novel, accurate, personalized, and timely.
  • The present invention provides an intelligent financial information recommendation system, including:
  • a news feature vector calculation module configured to calculate the feature vector of each news item;
  • a user feature vector calculation module configured to calculate the feature vector of each user;
  • a multi-dimensional news pool creation module configured to create news pools of multiple dimensions and sort the news in each news pool;
  • a news recommendation module configured to calculate the sampling probability of each news pool, sample the news pools according to these probabilities, and recommend the news ranked first in the sampled news pool to the user.
  • The feature vector of each news item is calculated as follows:
  • the feature vector V is an N-dimensional vector;
  • N is the total number of topic words in the database;
  • each component of V corresponds to one topic word, and the component corresponding to any topic word in a news item equals the product of that topic word's weight and its dynamic inverse document frequency within a period;
  • a period is 20, 30 or 40 days;
  • the inverse document frequency is abbreviated idf.
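The news feature vector described above (one component per topic word, each equal to the topic word's weight times its dynamic idf over the current period) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the representation of a news item as a dict of topic-word weights, and the smoothed idf formula are all assumptions, since the text does not give the exact idf definition.

```python
import math

def dynamic_idf(topic_word, recent_docs):
    """Smoothed idf of a topic word over the news of the current period.

    recent_docs: list of sets of topic words, one set per news item published
    in the period (e.g. the last 30 days). The smoothing is an assumption.
    """
    containing = sum(1 for doc in recent_docs if topic_word in doc)
    return math.log((1 + len(recent_docs)) / (1 + containing)) + 1.0

def news_feature_vector(word_weights, vocabulary, recent_docs):
    """Build the N-dimensional vector: weight * dynamic idf per topic word.

    word_weights: dict topic word -> weight within this news item (hypothetical input).
    vocabulary: ordered list of all N topic words in the database.
    """
    return [
        word_weights.get(word, 0.0) * dynamic_idf(word, recent_docs)
        for word in vocabulary
    ]
```

Recomputing `dynamic_idf` over a sliding window is what makes the idf "dynamic": a topic word that suddenly appears in many recent news items is down-weighted within that period only.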
  • The feature vector of each user is calculated as follows:
  • $V_s = \mathrm{Normalize}(V_p) - B \cdot \mathrm{Normalize}(V_d) \cdot \varphi(\lVert V_d \rVert_2) \dots$
  • $V_s$ is the feature vector of the user;
  • $V_p$ is the feature vector of the news the user has read;
  • $V_d$ is the feature vector of the news the user has marked as disliked;
  • $V_t$ is the feature vector of the topic words the user has marked as disliked;
  • $\lVert V_d \rVert_2$ is the 2-norm of the feature vector $V_d$;
  • $\mathrm{Normalize}(V_p)$ and $\mathrm{Normalize}(V_d)$ are the normalized vectors of $V_p$ and $V_d$ respectively;
  • B and E are calculation parameters;
  • $\varphi(\lVert V_d \rVert_2)$ is the penalty function of the number of news items.
  • $V_p$, $V_d$ and $V_t$ are all N-dimensional feature vectors, where N is the total number of topic words in the database and each component corresponds to one topic word;
  • each component of $V_p$ equals the weight of the corresponding topic word in the news the user has read, multiplied by the dynamic inverse document frequency of that topic word within a period;
  • each component of $V_d$ equals the weight of the corresponding topic word in the news the user has marked as disliked, multiplied by the dynamic inverse document frequency of that topic word within a period;
  • each component of $V_t$ equals the weight of the topic word the user has marked as disliked, multiplied by the dynamic inverse document frequency of that topic word within a period.
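The user feature vector combines the normalized reading-history vector with a penalized, normalized dislike vector. The sketch below illustrates that structure under explicit assumptions: the formula is truncated in this text, so the final term (subtracting E times the disliked-topic-word vector), the shape of the penalty function, and the default values of B, E and `scale` are hypothetical choices, not the patent's definitions.

```python
import math

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """Scale v to unit 2-norm; a zero vector is returned unchanged."""
    n = l2_norm(v)
    return [x / n for x in v] if n > 0 else list(v)

def penalty(norm, scale=5.0):
    """Hypothetical penalty: small while few news items are disliked, capped at 1."""
    return min(1.0, norm / scale)

def user_feature_vector(v_p, v_d, v_t, B=0.5, E=0.5):
    """Assumed form: Normalize(V_p) - B*Normalize(V_d)*penalty(||V_d||) - E*V_t."""
    n_p = normalize(v_p)
    n_d = normalize(v_d)
    phi = penalty(l2_norm(v_d))
    return [p - B * d * phi - E * t for p, d, t in zip(n_p, n_d, v_t)]
```

With this shape, a user who has disliked only one or two items gets a weak suppression of the corresponding topic words, and the suppression strengthens as the dislike vector grows.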
  • The dimension types in the multi-dimensional news pool creation module include, but are not limited to: macro, pre-market, afternoon comment, capital inflow and outflow, investment hotspots, hot topic news, click list, self-selected stocks, international current affairs and finance, fund channels, Hong Kong stocks, and external markets.
  • Hot topic news pool: first calculate the hot probability of each hot topic in the pool, sample the hot topics according to these probabilities, and sort the latest news of the sampled hot topics in order;
  • the news in the click list news pool is sorted by click volume;
  • the news in the highly time-sensitive news pools is sorted in reverse chronological order of release time, i.e. newer news ranks higher; these pools include pre-market, afternoon comment, and capital inflow and outflow;
  • the news in the remaining news pools is sorted by the similarity between the news feature vector and the user feature vector, from most to least similar.
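The two most common orderings above (by similarity to the user and by release time) can be sketched as follows. The data layout, a list of `(news_id, value)` tuples, and the function names are assumptions made for illustration; cosine similarity is used because the description names it as the similarity measure.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

def sort_pool_by_similarity(pool, user_vector):
    """pool: list of (news_id, feature_vector); most similar news first."""
    return sorted(
        pool,
        key=lambda item: cosine_similarity(item[1], user_vector),
        reverse=True,
    )

def sort_pool_by_recency(pool):
    """pool: list of (news_id, release_time); newest news first."""
    return sorted(pool, key=lambda item: item[1], reverse=True)
```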
  • The hot probability of each hot topic in the hot topic news pool is calculated as follows:
  • the hotspot probability of each hot topic is calculated as:
  • the hotspot probability of the hot topics is then normalized as:
  • K is the number of hot topics to be sampled;
  • j denotes the hot topic ranked j-th by hotness value;
  • power is a calculation parameter;
  • q(j) is the normalized sampling probability of hot topic j;
  • $h_j$ is the hotness value of hot topic j;
  • $h_K$ is the hotness value of the K-th hot topic, i.e. the lowest hotness value among the K hot topics.
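The hot-topic formulas themselves are not reproduced in this text, only their symbols (K, j, power, q(j), h_j, h_K). The sketch below is therefore one plausible reading, stated as an assumption: each of the top K topics gets an unnormalized weight (h_j / h_K) ** power, and the weights are normalized so that q sums to 1. The function names and the dict input are hypothetical.

```python
import random

def hot_topic_probabilities(hotness, K, power=1.0):
    """Return q(j) for the K hottest topics.

    hotness: dict topic -> positive hotness value h_j.
    Assumed form: raw weight (h_j / h_K) ** power, then normalization.
    """
    top = sorted(hotness.items(), key=lambda kv: kv[1], reverse=True)[:K]
    h_K = top[-1][1]  # lowest hotness among the K topics; must be > 0
    raw = {topic: (h / h_K) ** power for topic, h in top}
    total = sum(raw.values())
    return {topic: weight / total for topic, weight in raw.items()}

def sample_hot_topic(probabilities, rng=random):
    """Draw one hot topic according to its normalized probability q(j)."""
    topics = list(probabilities)
    weights = [probabilities[t] for t in topics]
    return rng.choices(topics, weights=weights, k=1)[0]
```

Under this reading, a larger `power` concentrates sampling on the hottest topics, while `power = 0` samples the K topics uniformly.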
  • The sampling probability of each news pool is calculated as follows:
  • the news ranked first in the sampled news pool is recommended to the user.
  • The similarity probability is calculated as:
  • i denotes any news pool;
  • $\mathrm{Padjust}_i$ is the similarity probability between the user to be served and the news ranked first in news pool i;
  • $\mathrm{sim}_i$ is the similarity between the user to be served and the news ranked first in news pool i;
  • $\mathrm{Pinitial}_i$ is the preset initial probability of news pool i;
  • C and ⁇ are calculation parameters;
  • The additional probability is calculated as:
  • i denotes any news pool;
  • $\mathrm{Padditional}_i$ is the normalized additional probability of news pool i;
  • m is the total number of news pools that can currently be sampled.
  • The sampling probability is calculated as:
  • i denotes any news pool;
  • $P_i$ is the sampling probability of news pool i;
  • $\mathrm{Padjust}_i$ is the similarity probability between the user to be served and the news ranked first in news pool i;
  • $\mathrm{Padditional}_i$ is the normalized additional probability of news pool i.
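Once each pool has a sampling probability, the recommendation step samples one pool and takes the news ranked first in it. The sketch below shows only that step and takes the probabilities as given, because the formulas that produce them are not reproduced in this text. The function name, the dict layout, and the rule that empty pools are skipped are assumptions.

```python
import random

def recommend_next(pools, sampling_probs, rng=random):
    """Sample one pool by probability and pop its top-ranked news.

    pools: dict pool name -> list of news, already sorted best-first.
    sampling_probs: dict pool name -> sampling probability P_i.
    Returns (pool name, news), or None when no pool can be sampled.
    """
    candidates = [
        name for name, news in pools.items()
        if news and sampling_probs.get(name, 0.0) > 0
    ]
    if not candidates:
        return None
    weights = [sampling_probs[name] for name in candidates]
    # random.choices rescales the weights, so they need not sum to 1.
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    return chosen, pools[chosen].pop(0)
```

Calling `recommend_next` repeatedly yields a feed that interleaves pools in proportion to their probabilities, which is how sampling introduces diversity compared with always taking the single most similar news item.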
  • Through the statistical method of probability sampling, news can be dynamically sampled from multiple dimensions (multiple news pools).
  • The sampling probability is determined dynamically by factors such as user similarity (i.e. user preference), news popularity, and business logic, so that the system pushes accurate, timely, and useful investment information and opportunities to users while also pushing news that matches their preferences and interests.
  • FIG. 1 is a block diagram of an intelligent financial information recommendation system provided by an embodiment of the present invention.
  • FIG. 2 is a recommendation flowchart of the intelligent financial information recommendation system provided by an embodiment of the present invention.
  • As shown in FIG. 1 and FIG. 2, the intelligent financial information recommendation system includes:
  • a news feature vector calculation module configured to calculate the feature vector of each news item;
  • a user feature vector calculation module configured to calculate the feature vector of each user;
  • a multi-dimensional news pool creation module configured to create news pools of multiple dimensions and sort the news in each news pool;
  • a news recommendation module configured to calculate the sampling probability of each news pool, sample the news pools according to these probabilities, and recommend the news ranked first in the sampled news pool to the user.
  • The present invention solves the problems of the prior art, in which the recommendation function is limited and cannot deliver information that is at once diverse, novel, accurate, personalized, and timely. It can also provide users with a variety of content-rich and timely information so that they can capture fast-changing investment opportunities in time.
  • The feature vector V of each news item is calculated as follows: V is an N-dimensional vector, N is the total number of topic words in the database, and each component of V corresponds to one topic word; the component corresponding to any topic word in a news item equals the product of that topic word's weight and its dynamic inverse document frequency within a period;
  • a period is 20, 30 or 40 days, preferably 30 days;
  • the inverse document frequency is abbreviated idf.
  • The feature vector of each user is calculated as follows:
  • $V_s = \mathrm{Normalize}(V_p) - B \cdot \mathrm{Normalize}(V_d) \cdot \varphi(\lVert V_d \rVert_2) \dots$
  • $V_s$ is the feature vector of the user;
  • $V_p$ is the feature vector of the news the user has read;
  • $V_d$ is the feature vector of the news the user has marked as disliked;
  • $V_t$ is the feature vector of the topic words the user has marked as disliked;
  • $\lVert V_d \rVert_2$ is the 2-norm of the feature vector $V_d$;
  • $\mathrm{Normalize}(V_p)$ and $\mathrm{Normalize}(V_d)$ are the normalized vectors of $V_p$ and $V_d$ respectively;
  • B and E are calculation parameters;
  • $\varphi(\lVert V_d \rVert_2)$ is the penalty function of the number of news items.
  • $v_{p1}, v_{p2}, \dots, v_{pr}$ are the r components of $V_p$, and $v_{d1}, v_{d2}, \dots, v_{dr}$ are the r components of $V_d$;
  • $\lVert V_p \rVert_2$ is the 2-norm of the feature vector $V_p$, and $\lVert V_d \rVert_2$ is the 2-norm of the feature vector $V_d$.
  • ⁇ and ⁇ are calculation parameters.
  • The penalty function is designed for the following reason: when a user first starts using the dislike function, only a few news items have been marked. The norm of the dislike vector is then small, so after normalization each component is disproportionately large and the blocking effect is too strong at the start. The vector is therefore multiplied by a small penalty value to shrink each component. As the dislike function is used more, the norm of the dislike vector increases, so that the penalty function $\varphi(\lVert V_d \rVert_2)$ …
  • The user's read-news history, disliked news, and disliked topic words are decayed before the user feature vector is updated, or decay over time, so that newer user behavior receives greater weight and the vector adapts to changes in the user's reading interests. News that has already been pushed, news the user has marked as disliked, and news matching topic words the user has marked as disliked can also be removed from the recommendation process.
  • $V_p$, $V_d$ and $V_t$ are all N-dimensional feature vectors, where N is the total number of topic words in the database and each component corresponds to one topic word;
  • each component of $V_p$ equals the weight of the corresponding topic word in the news the user has read, multiplied by the dynamic inverse document frequency of that topic word within a period;
  • each component of $V_d$ equals the weight of the corresponding topic word in the news the user has marked as disliked, multiplied by the dynamic inverse document frequency of that topic word within a period;
  • each component of $V_t$ equals the weight of the topic word the user has marked as disliked, multiplied by the dynamic inverse document frequency of that topic word within a period.
  • The present invention automatically divides news into multiple categories through a classification algorithm or classifier. The dimension types in the multi-dimensional news pool creation module therefore include, but are not limited to: macro, pre-market, afternoon comment, capital inflow and outflow, investment hotspots, hot topic news, click list, self-selected stocks, international current affairs and finance, fund channels, Hong Kong stocks, and external markets.
  • For the hot topic news pool, the hot probability of each hot topic is first calculated, the hot topics are sampled according to these probabilities, and the latest news of the sampled hot topics is sorted in order.
  • The news in the click list news pool is sorted by click volume: the higher the click volume, the more popular or important the news and the higher its rank. The news in the highly time-sensitive pools is sorted in reverse chronological order of release time, i.e. newer news ranks first, so that major financial news reaches users as soon as possible.
  • Highly time-sensitive news pools include pre-market, afternoon comment, and capital inflow and outflow. The news in the other pools is sorted by the similarity between the news feature vector and the user feature vector: the higher the similarity, the better the news matches the user's reading or investment preferences and the higher its rank, so that the news recommendation module pushes this news first where possible.
  • The cosine similarity between each news item and each user is calculated from the news feature vector and the user feature vector.
  • The hot probability of each hot topic in the hot topic news pool is calculated as follows:
  • the hotspot probability of each hot topic is calculated as:
  • the hotspot probability of the hot topics is then normalized as:
  • K is the number of hot topics to be sampled;
  • j denotes the hot topic ranked j-th by hotness value;
  • power is a calculation parameter;
  • q(j) is the normalized sampling probability of hot topic j;
  • $h_j$ is the hotness value of hot topic j;
  • $h_K$ is the hotness value of the K-th hot topic, i.e. the lowest hotness value among the K hot topics.
  • The news recommendation module is the control center of the entire intelligent financial information recommendation system: it uses probability sampling combined with timing to decide which news is finally recommended to which users, and when.
  • The sampling probability of each news pool is calculated as follows:
  • the news ranked first in the sampled news pool is recommended to the user.
  • The similarity probability is calculated as:
  • i denotes any news pool;
  • $\mathrm{Padjust}_i$ is the similarity probability between the user to be served and the news ranked first in news pool i;
  • $\mathrm{sim}_i$ is the similarity between the user to be served and the news ranked first in news pool i;
  • $\mathrm{Pinitial}_i$ is the preset initial probability of news pool i; its specific value is determined by the business scenario and the importance of each news pool;
  • C and ⁇ are calculation parameters. The sampling probability of a news pool increases rapidly with the similarity of its first-ranked news: if the first-ranked news of one pool is significantly more similar to the user than the first-ranked news of the other pools, that pool is sampled preferentially, because high similarity indicates that the news better matches the user's reading and investment preferences.
  • The news in the hot topic news pool and the click list news pool may not be very similar to the user's feature vector. To keep the user informed of current hot spots and help the user discover new interests, this type of news also needs to be pushed actively, so the present invention defines an additional probability that raises its recommendation rate.
  • The additional probability is calculated as:
  • i denotes any news pool;
  • $\mathrm{Padditional}_i$ is the normalized additional probability of news pool i;
  • m is the total number of news pools that can currently be sampled. Because the additional probability is already normalized, it is not affected by the sampling probabilities of other news pools; such news is therefore pushed to users under all circumstances, with a push strength that depends mainly on this additional probability.
  • The sampling probability is calculated as:
  • i denotes any news pool;
  • $P_i$ is the sampling probability of news pool i;
  • $\mathrm{Padjust}_i$ is the similarity probability between the user to be served and the news ranked first in news pool i;
  • $\mathrm{Padditional}_i$ is the normalized additional probability of news pool i. Once the sampling probability of each news pool has been obtained, the pools are sampled accordingly and the news ranked first in the sampled pool is recommended to the user.
  • The news recommendation module recommends news according to the following logic and steps. For highly time-sensitive news pools (such as pre-market, afternoon comment, and next week's outlook), a specific push window is set: for example, the pre-market window runs from 0:00 on a stock market trading day until the market opens, and the afternoon comment window is 11:30-13:00 on a trading day. Within its window this news is pushed to users first, and it is not pushed at other times. The news pools are then sampled according to the sampling probabilities calculated above; if a pool is sampled, the first news item in that pool is pushed to the user.
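The push-window rule for time-sensitive pools can be sketched as a simple lookup that runs before pool sampling. The window table and function name below are illustrative; the 11:30-13:00 afternoon window and the 0:00 start of the pre-market window come from the text, while the 09:30 market open used to close the pre-market window is an assumption.

```python
from datetime import time

# Push windows per time-sensitive pool as (start, end), end exclusive.
# The 09:30 opening time is an assumption; the text only says "before the opening".
PUSH_WINDOWS = {
    "pre-market": (time(0, 0), time(9, 30)),
    "afternoon comment": (time(11, 30), time(13, 0)),
}

def pools_to_push_first(now, is_trading_day, windows=PUSH_WINDOWS):
    """Return the time-sensitive pools whose push window contains `now`.

    News from these pools is pushed first; outside its window a pool is
    skipped and the caller falls back to probability sampling.
    """
    if not is_trading_day:
        return []
    return [name for name, (start, end) in windows.items() if start <= now < end]
```

A caller would check `pools_to_push_first` first and only sample the remaining pools by probability when the returned list is empty or its news has already been pushed.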
  • FIG. 3 is a display diagram of financial information recommendation provided by an embodiment of the present invention.
  • The recommendation system provided by the present invention pushes news that matches the user's interests (house prices and real estate) and also pushes a variety of content-rich, timely news (such as current hot events, the latest important macroeconomic data, and unexpected financial events).
  • By establishing news pools of various dimensions that reflect the characteristics of the financial field and investment needs, the intelligent financial information recommendation system provided by the present invention can push multi-dimensional, multi-level, and diverse news to users;
  • the invention dynamically samples news from multiple dimensions (multiple news pools) through the statistical method of probability sampling, with the sampling probability determined dynamically by factors such as user similarity (i.e. user preference), news popularity, and business logic, so that it pushes accurate, timely, and useful investment information and opportunities to users while also pushing news that matches their preferences and interests.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to an intelligent financial information recommendation system, comprising a news feature vector calculation module configured to calculate a feature vector of each piece of news; a user feature vector calculation module configured to calculate a feature vector of each user; a multi-dimensional news pool creation module configured to create news pools in a plurality of dimensions, and to sort news in the news pools; and a news recommendation module configured to calculate a sampling probability of each news pool, to sample each news pool according to the sampling probability, and to recommend, to the users, the top news in the news pool obtained by means of sampling. According to the present invention, the problems in the prior art that the information recommendation function is single and that information that has diversity, novelty, a high accuracy rate, personalization, and high time efficiency at the same time cannot be recommended are solved. According to the present invention, news information that conforms to the investment preference of a user can be actively pushed to the user, all kinds of content-rich timely information can also be provided to the user and rapidly changing opportunities of investment can also be captured in a timely manner.

Description

An intelligent recommendation system for financial information
Technical field
The invention relates to the technical field of information recommendation, and in particular to an intelligent recommendation system for financial information.
Background
With the development of the economy, society, and related technologies such as the Internet, news and information in the financial field have grown explosively. Faced with the massive amount of information generated every day, investors find it difficult to locate the useful information they need and to make investment decisions quickly on that basis. Actively recommending news that matches an investor's preferences through a news recommendation system is a good way to help investors obtain useful financial information quickly. However, because financial information is highly time-sensitive (e.g. emergencies and sudden investment hotspots) and multi-dimensional (e.g. macro and micro), traditional content-relevance-based recommendation struggles to provide investors with timely and useful investment information and opportunities.
Existing recommendation algorithms mainly include content-relevance-based, collaborative-filtering-based, popularity-based, and model-based recommendation algorithms.
Content-relevance-based recommendation: by analyzing user behavior and the text content of news, users and news are each tagged with keywords (topic words) that represent their characteristics. These keywords are then combined, using tf-idf or another term-weighting algorithm, into feature vectors for users and for news, which represent the characteristics of each user and each news item. The similarity between a user vector and a news vector is then computed with a measure such as cosine similarity, and the news most similar to the user is recommended. The biggest drawback of content-relevance-based recommendation is that it tends to produce homogeneous recommendations: it keeps recommending news of the same type and falls into a loop, losing the diversity and novelty of the recommended content.
Collaborative-filtering-based recommendation: analyze each user's evaluation of items (through browsing records, purchase records, etc.); compute the similarity between all users from these evaluations; select the N users most similar to the current user; and recommend to the current user the items that these N users rated highest and that the current user has not yet browsed. Collaborative filtering also has several shortcomings: 1. the cold-start problem, i.e. recommendations cannot be made for new users and new items; 2. its accuracy depends on a large amount of accurate user data, and with little user data the recommendations become inaccurate or even impossible; 3. in systems where items have short life cycles (such as news and advertisements), items are updated so quickly that many of them receive no user ratings, which makes the rating matrix sparse and hinders the recommendation of such content.
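The four collaborative-filtering steps above can be sketched in a few lines. This is a generic user-based illustration of the prior-art technique, not code from the patent; the rating dictionary layout, function names, and the rule that only neighbors with positive similarity count are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend_user_based(ratings, target, n_neighbors=2, top_k=3):
    """Recommend unseen items rated highest by the N most similar users."""
    sims = {u: cosine(ratings[target], r) for u, r in ratings.items() if u != target}
    neighbors = sorted(
        (u for u in sims if sims[u] > 0), key=sims.get, reverse=True
    )[:n_neighbors]
    best = {}
    for u in neighbors:
        for item, rating in ratings[u].items():
            if item not in ratings[target]:
                best[item] = max(best.get(item, 0.0), rating)
    return sorted(best, key=best.get, reverse=True)[:top_k]
```

The cold-start weakness is visible directly: a user with no ratings has zero similarity to everyone, so the function returns an empty list.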
Popularity-based recommendation: rank items by a popularity measure derived from data such as clicks, page views, unique visitors, and sharing rates, and recommend them to users. The advantage of this algorithm is that it is simple and suits newly registered users. The disadvantage is that it cannot provide personalized recommendations.
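A popularity ranking of this kind reduces to a weighted score and a sort. The sketch below is illustrative only: the metric names mirror the data listed above, but the weights and the function name are hypothetical, since the text does not say how the signals are combined.

```python
# Hypothetical weights for combining popularity signals into one score.
DEFAULT_WEIGHTS = {
    "clicks": 1.0,
    "page_views": 0.5,
    "unique_visitors": 1.0,
    "share_rate": 100.0,
}

def rank_by_popularity(stats, weights=DEFAULT_WEIGHTS):
    """Return news ids sorted from most to least popular.

    stats: dict news id -> dict of metric name -> value; missing metrics count as 0.
    """
    def score(news_id):
        metrics = stats[news_id]
        return sum(w * metrics.get(name, 0) for name, w in weights.items())

    return sorted(stats, key=score, reverse=True)
```

Every user receives the same ordering, which is why the approach works for new users but cannot personalize.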
基于模型的推荐算法:通过诸如机器学习的方法构建模型,然后用大量已有的用户行为数据、购买记录和用户的各种特征属性等对所构建的模型进行训练和数据拟合,然后向训练好的模型输入待推荐用户的各特征属性,模型输出最终的推荐结果。该算法的缺点是:需要大量的用户历史行为数据,并且需要反复的人工干预进行属性的组合和筛选(即特征工程)。同时,由于新闻的时效性,模型也需要反复的训练更新,以适应变化。Model-based recommendation algorithm: build a model through methods such as machine learning, and then use a large amount of existing user behavior data, purchase records and various user characteristics to train and fit the built model, and then apply the training A good model inputs the feature attributes of the user to be recommended, and the model outputs the final recommendation result. The disadvantage of this algorithm is that it requires a large amount of user historical behavior data, and requires repeated manual intervention for attribute combination and screening (ie, feature engineering). At the same time, due to the timeliness of news, the model also needs to be repeatedly trained and updated to adapt to changes.
由于金融资讯具有很强的时效性和多维度性,并且不同的投资者有不同的投资偏好和投资逻辑,上述的任何一种推荐算法都很难向投资者提供及时有用的投资信息和投资机会。Since financial information is highly timely and multi-dimensional, and different investors have different investment preferences and investment logics, it is difficult for any of the above recommendation algorithms to provide investors with timely and useful investment information and investment opportunities. .
因此有必要提供一种金融资讯智能推荐系统,以解决现有技术中资讯推荐功能单一、无法推荐同时具有多样性、新颖性、准确率高、个性化以及时效性强的资讯的问题。Therefore, it is necessary to provide an intelligent financial information recommendation system to solve the problem that the information recommendation function in the prior art is single, and it is impossible to recommend information with diversity, novelty, high accuracy, personalization and strong timeliness.
Summary of the Invention
The object of the present invention is to provide an intelligent financial information recommendation system that solves the problems of the prior art, in which information recommendation serves a single function and cannot deliver information that is at once diverse, novel, accurate, personalized and timely.
To solve the problems in the prior art, the present invention provides an intelligent financial information recommendation system, comprising:
a news feature vector calculation module, configured to calculate the feature vector of each news item;
a user feature vector calculation module, configured to calculate the feature vector of each user;
a multi-dimensional news pool creation module, configured to create news pools of multiple dimensions and to sort the news in each news pool;
a news recommendation module, configured to calculate the sampling probability of each news pool, sample the news pools according to the sampling probabilities, and recommend to the user the news item ranked first in the sampled news pool.
Optionally, in the intelligent financial information recommendation system, the feature vector of each news item is calculated as follows:
extract all subject terms from each news item in the database;
calculate the weight of each subject term and its dynamic inverse document frequency within one period;
calculate the feature vector V of each news item, where V is an N-dimensional vector, N is the total number of subject terms in the database, and each component of V corresponds to one subject term; for any subject term in a news item, the corresponding component equals the product of the term's weight and its dynamic inverse document frequency within one period.
Optionally, in the intelligent financial information recommendation system,
one period is 20 days, 30 days or 40 days;
the inverse document frequency is abbreviated idf.
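The idea of a dynamic idf restricted to one period can be sketched as below. The text does not give the idf formula, so the smoothed form `log((1 + n) / (1 + df)) + 1` used here is an assumption, as are the function name `dynamic_idf` and the input layout.

```python
import math
from datetime import date, timedelta

def dynamic_idf(docs, today, period_days=30):
    """Compute idf over news published within the last `period_days` days.

    docs: iterable of (publish_date, set_of_subject_terms).
    The smoothed idf formula is an assumption, not taken from the text.
    """
    start = today - timedelta(days=period_days)
    # Only news inside the sliding window contributes to the counts.
    window = [terms for published, terms in docs if start < published <= today]
    n = len(window)
    df = {}
    for terms in window:
        for t in terms:
            df[t] = df.get(t, 0) + 1
    return {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}

docs = [
    (date(2021, 3, 1), {"rates", "housing"}),
    (date(2021, 3, 10), {"rates"}),
    (date(2021, 1, 1), {"gold"}),  # older than the window, ignored
]
idf = dynamic_idf(docs, today=date(2021, 3, 12))
```

Recomputing the table as the window slides lets a term that was common last quarter regain weight once it becomes rare again.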
Optionally, in the intelligent financial information recommendation system, the user feature vector calculation module calculates the feature vector of each user as:
$V_s = \mathrm{Normalize}(V_p) - B \cdot \mathrm{Normalize}(V_d) \cdot \eta(\|V_d\|_2) - E \cdot V_t$;
where $V_s$ is the feature vector of the user, $V_p$ is the feature vector of the news the user has read, $V_d$ is the feature vector of the news the user has marked as disliked, $V_t$ is the feature vector of the subject terms the user has marked as disliked, $\|V_d\|_2$ is the 2-norm of $V_d$, $\mathrm{Normalize}(V_p)$ and $\mathrm{Normalize}(V_d)$ are the normalized vectors of $V_p$ and $V_d$ respectively, B and E are calculation parameters, and $\eta$ is a penalty function on the number of news items.
Optionally, in the intelligent financial information recommendation system,
$V_p$, $V_d$ and $V_t$ are all N-dimensional feature vectors, where N is the total number of subject terms in the database and each component corresponds to one subject term;
each component of $V_p$ equals the weight of the corresponding subject term in the news the user has read multiplied by that term's dynamic inverse document frequency within one period;
each component of $V_d$ equals the weight of the corresponding subject term in the news the user has marked as disliked multiplied by that term's dynamic inverse document frequency within one period;
each component of $V_t$ equals the weight of the subject term the user has marked as disliked multiplied by that term's dynamic inverse document frequency within one period.
Optionally, in the intelligent financial information recommendation system, the dimension types in the multi-dimensional news pool creation module include, but are not limited to: macro, pre-market, midday review, capital inflows and outflows, investment hotspots, hot-topic news, most-clicked list, watchlist stocks, international politics and finance, fund channel, and Hong Kong stocks and overseas markets.
Optionally, in the intelligent financial information recommendation system,
for the hot-topic news pool, the hot probability of each hot topic in the pool is calculated first, the hot topics are sampled according to their hot probabilities, and the latest news items of the sampled hot topics are ranked in the order in which the topics were sampled;
news in the most-clicked news pool is sorted by user click count;
news in the time-sensitive news pools is sorted by publication time in descending order, so that the most recently published news comes first; the time-sensitive news pools include pre-market, midday review, and capital inflows and outflows;
news in the remaining news pools is sorted by the similarity between the news feature vector and the user feature vector, from highest to lowest.
Optionally, in the intelligent financial information recommendation system, the hot probability of each hot topic in the hot-topic news pool is obtained as follows:
calculate the hot probability of each hot topic;
normalize the calculated hot probabilities;
where the hot probability of each hot topic is calculated as:
[Formula: image PCTCN2021080155-appb-000001]
and the hot probabilities are normalized as:
[Formula: image PCTCN2021080155-appb-000002]
where K is the number of hot topics to be sampled, j denotes the hot topic ranked j-th by heat value, [Symbol: image PCTCN2021080155-appb-000003] is the unnormalized sampling probability of hot topic j, power is a calculation parameter, $q(j)$ is the normalized sampling probability of hot topic j, $h_j$ is the heat value of hot topic j, and $h_K$ is the heat value of the K-th ranked hot topic, that is, the lowest heat value among the hot topics.
Optionally, in the intelligent financial information recommendation system, the sampling probability of each news pool is calculated and used as follows:
calculate the similarity probability between the user to receive recommendations and the news item ranked first in each news pool;
define an additional probability;
calculate the sampling probability of each news pool from the similarity probability and the additional probability;
sample the news pools according to the sampling probabilities;
recommend to the user the news item ranked first in the sampled news pool.
Optionally, in the intelligent financial information recommendation system,
the similarity probability is calculated as:
[Formula: image PCTCN2021080155-appb-000004]
where i denotes any news pool, $Padjust_i$ is the similarity probability between the user to receive recommendations and the news item ranked first in that pool, $sim_i$ is the similarity between that user and that news item, $Pinitial_i$ is the preset initial probability of the pool, and C and ε are calculation parameters;
the additional probability is calculated as:
[Formula: image PCTCN2021080155-appb-000005]
where i denotes any news pool, $Padditional_i$ is the normalized additional probability of that pool, and m is the total number of news pools currently available for sampling;
the sampling probability is calculated as:
[Formula: image PCTCN2021080155-appb-000006]
where i denotes any news pool, $P_i$ is the sampling probability of that pool, $Padjust_i$ is the similarity probability between the user to receive recommendations and the news item ranked first in that pool, and $Padditional_i$ is the normalized additional probability of that pool.
In the intelligent financial information recommendation system provided by the present invention, news pools of various dimensions are built to match the characteristics of the financial field and the needs of investors, so that multi-dimensional, multi-level and diverse news is pushed to users. The present invention samples news probabilistically from multiple dimensions (multiple news pools), and the sampling probability is determined dynamically by factors such as user similarity (that is, user preference), news popularity and business logic. The system can therefore push news that matches the user's preferences and interests while also pushing accurate, timely and useful investment information and investment opportunities.
Brief Description of the Drawings
Figure 1 is a block diagram of the intelligent financial information recommendation system provided by an embodiment of the present invention;
Figure 2 is a flow chart of recommendation in the intelligent financial information recommendation system provided by an embodiment of the present invention.
Detailed Description
Specific embodiments of the present invention are described in more detail below with reference to the schematic drawings. The advantages and features of the present invention will become clearer from the following description. Note that the drawings are highly simplified and not drawn to precise scale; they serve only to illustrate the embodiments of the present invention conveniently and clearly.
In the following, where a method described herein comprises a series of steps, the order in which the steps are presented is not necessarily the only order in which they can be performed; some of the described steps may be omitted, and other steps not described herein may be added to the method.
Because financial information is highly time-sensitive and multi-dimensional, and because different investors have different investment preferences and investment logic, existing recommendation algorithms cannot readily provide investors with timely and useful investment information and investment opportunities.
It is therefore necessary to provide an intelligent financial information recommendation system, as shown in Figures 1 and 2. Figure 1 is a block diagram of the intelligent financial information recommendation system provided by an embodiment of the present invention, and Figure 2 is a flow chart of recommendation in that system. The intelligent financial information recommendation system comprises:
a news feature vector calculation module, configured to calculate the feature vector of each news item;
a user feature vector calculation module, configured to calculate the feature vector of each user;
a multi-dimensional news pool creation module, configured to create news pools of multiple dimensions and to sort the news in each news pool;
a news recommendation module, configured to calculate the sampling probability of each news pool, sample the news pools according to the sampling probabilities, and recommend to the user the news item ranked first in the sampled news pool.
The present invention solves the problems of the prior art, in which information recommendation serves a single function and cannot deliver information that is at once diverse, novel, accurate, personalized and timely. The present invention can actively push news that matches the user's investment preferences and can also provide the user with a wide range of rich and timely information, so that fast-changing investment opportunities are captured in time.
Specifically, the feature vector of each news item is calculated as follows:
extract all subject terms from each news item in the database with an algorithm such as TextRank;
calculate the weight of each subject term and its dynamic inverse document frequency within one period, where the weight of each subject term may be computed or set empirically, one period is 20 days, 30 days or 40 days, preferably 30 days, and the inverse document frequency is abbreviated idf;
finally, calculate the feature vector V of each news item, where V is an N-dimensional vector, N is the total number of subject terms in the database, and each component of V corresponds to one subject term; for any subject term in a news item, the corresponding component equals the product of the term's weight and its dynamic inverse document frequency within one period. Expressed as a formula: $V[w] = \mathrm{theme\_weight}(w) \times \mathrm{idf}(w)$, where $V[w]$ is the component corresponding to subject term w in the news item, $\mathrm{theme\_weight}(w)$ is the weight of that term, and $\mathrm{idf}(w)$ is the dynamic inverse document frequency of that term within one period.
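The formula V[w] = theme_weight(w) × idf(w) can be sketched as below. The function name `news_feature_vector`, the sample weights and the choice of 0 for subject terms that do not occur in the news item are illustrative assumptions; the text only defines the components for terms that do occur.

```python
def news_feature_vector(theme_weights, idf, vocabulary):
    """Build V with V[w] = theme_weight(w) * idf(w).

    Components for terms absent from the news item are set to 0.0,
    which is an assumption made for this sketch.
    """
    return [theme_weights.get(w, 0.0) * idf.get(w, 0.0) for w in vocabulary]

# All subject terms in the database, so N = 3 in this toy example.
vocabulary = ["housing", "rates", "gold"]
# Hypothetical per-item term weights, e.g. scores from a TextRank-style extractor.
theme_weights = {"housing": 0.6, "rates": 0.4}
# Hypothetical dynamic idf values for the current period.
idf = {"housing": 2.0, "rates": 1.0, "gold": 3.0}

V = news_feature_vector(theme_weights, idf, vocabulary)
```

Keeping a fixed vocabulary order means every news vector and every user vector share the same component layout, which is what makes them directly comparable.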
Further, the user feature vector calculation module calculates the feature vector of each user as:
$V_s = \mathrm{Normalize}(V_p) - B \cdot \mathrm{Normalize}(V_d) \cdot \eta(\|V_d\|_2) - E \cdot V_t$;
where $V_s$ is the feature vector of the user, $V_p$ is the feature vector of the news the user has read, $V_d$ is the feature vector of the news the user has marked as disliked, $V_t$ is the feature vector of the subject terms the user has marked as disliked, $\|V_d\|_2$ is the 2-norm of $V_d$, $\mathrm{Normalize}(V_p)$ and $\mathrm{Normalize}(V_d)$ are the normalized vectors of $V_p$ and $V_d$ respectively, B and E are calculation parameters, and $\eta$ is a penalty function on the number of news items.
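A minimal sketch of the user vector formula follows. The penalty function η is passed in as a callable because its expression is not available in this text; the constant `eta` used in the example, the parameter values and the helper names are illustrative assumptions only.

```python
import math

def l2_norm(v):
    """2-norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def normalize(v):
    """Divide each component by the 2-norm; a zero vector is returned unchanged."""
    n = l2_norm(v)
    return [x / n for x in v] if n else list(v)

def user_feature_vector(v_p, v_d, v_t, eta, B=1.0, E=1.0):
    """V_s = Normalize(V_p) - B * Normalize(V_d) * eta(||V_d||_2) - E * V_t."""
    read, disliked = normalize(v_p), normalize(v_d)
    penalty = eta(l2_norm(v_d))
    return [p - B * d * penalty - E * t for p, d, t in zip(read, disliked, v_t)]

# Toy vectors over a 3-term vocabulary; eta is a stand-in constant, not the patent's function.
v_s = user_feature_vector(
    v_p=[3.0, 4.0, 0.0],
    v_d=[0.0, 0.0, 2.0],
    v_t=[0.0, 1.0, 0.0],
    eta=lambda norm: 0.5,
    B=1.0,
    E=0.1,
)
```

Components pushed below zero mark subject terms the user wants to avoid, so news items heavy in those terms score a lower similarity.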
Further, $\|V_d\|_2$ is calculated as:
$\|V_d\|_2 = \sqrt{v_{d1}^2 + v_{d2}^2 + \cdots + v_{dr}^2}$
where $v_{d1}, v_{d2}, \ldots, v_{dr}$ are the r components of $V_d$;
$\mathrm{Normalize}(V_p)$ and $\mathrm{Normalize}(V_d)$ are calculated as:
$\mathrm{Normalize}(V_p) = [v_{p1}/\|V_p\|_2,\ v_{p2}/\|V_p\|_2,\ \ldots,\ v_{pr}/\|V_p\|_2]$;
$\mathrm{Normalize}(V_d) = [v_{d1}/\|V_d\|_2,\ v_{d2}/\|V_d\|_2,\ \ldots,\ v_{dr}/\|V_d\|_2]$;
where $v_{p1}, v_{p2}, \ldots, v_{pr}$ are the r components of $V_p$, $v_{d1}, v_{d2}, \ldots, v_{dr}$ are the r components of $V_d$, $\|V_p\|_2$ is the 2-norm of $V_p$, and $\|V_d\|_2$ is the 2-norm of $V_d$.
$\eta(\|V_d\|_2)$ is calculated as:
[Formula: image PCTCN2021080155-appb-000009]
where θ and α are calculation parameters. The penalty function is designed for the following reason: when a user first starts using the dislike-news function, only a few news items have been marked. The vector norm is then small, so each normalized component is relatively large and the blocking effect is too strong at the outset. The vector is therefore multiplied by a small penalty value to shrink each component. As the dislike-news function is used more, the norm of the disliked-news vector grows, $\eta(\|V_d\|_2)$ approaches 1, and the penalty quickly weakens.
Preferably, each user's reading history, disliked news and disliked subject terms are decayed before the user's feature vector is updated, or are decayed over time, so that newer user behavior carries more weight and the vector follows changes in the user's reading interests. In addition, news that has already been pushed, news the user has marked as disliked, and news corresponding to subject terms the user has marked as disliked can be removed during recommendation.
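One simple way to realize decay before each update is a multiplicative factor, sketched below. The text does not specify the decay rule, so the factor `decay`, the function name `decay_and_update` and the additive update are assumptions.

```python
def decay_and_update(profile, new_vector, decay=0.9):
    """Shrink the accumulated vector by `decay`, then add the newest behavior.

    A multiplicative decay is an assumption; the text only says that older
    behavior should lose weight relative to newer behavior.
    """
    return [decay * old + new for old, new in zip(profile, new_vector)]

# The user first read news about term 0, then twice about term 1.
profile = [1.0, 0.0]
profile = decay_and_update(profile, [0.0, 1.0], decay=0.5)
profile = decay_and_update(profile, [0.0, 1.0], decay=0.5)
```

After two updates the older interest has dropped to a quarter of its original weight, while the recent interest dominates the profile.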
Still further, in the intelligent financial information recommendation system,
$V_p$, $V_d$ and $V_t$ are all N-dimensional feature vectors, where N is the total number of subject terms in the database and each component corresponds to one subject term;
each component of $V_p$ equals the weight of the corresponding subject term in the news the user has read multiplied by that term's dynamic inverse document frequency within one period;
each component of $V_d$ equals the weight of the corresponding subject term in the news the user has marked as disliked multiplied by that term's dynamic inverse document frequency within one period;
each component of $V_t$ equals the weight of the subject term the user has marked as disliked multiplied by that term's dynamic inverse document frequency within one period.
In the intelligent financial information recommendation system, in order to push news that is rich in content and spans different dimensions, the present invention automatically divides news into several broad categories with a classification algorithm or classifier. The dimension types in the multi-dimensional news pool creation module therefore include, but are not limited to: macro, pre-market, midday review, capital inflows and outflows, investment hotspots, hot-topic news, most-clicked list, watchlist stocks, international politics and finance, fund channel, and Hong Kong stocks and overseas markets.
Preferably, for the hot-topic news pool, the hot probability of each hot topic in the pool is calculated first, the hot topics are sampled according to their hot probabilities, and the latest news items of the sampled hot topics are ranked in the order in which the topics were sampled; the higher the heat value of a hot topic, the greater its sampling probability and the more likely its news is recommended first. News in the most-clicked news pool is sorted by user click count; a higher click count indicates that the news is more popular or more important, so it ranks higher. News in the time-sensitive news pools is sorted by publication time in descending order, so that the most recently published news comes first and major financial news reaches users as soon as possible; the time-sensitive news pools include pre-market, midday review, and capital inflows and outflows. News in the remaining news pools is sorted by the similarity between the news feature vector and the user feature vector, from highest to lowest; the more similar a news item is, the better it matches the user's reading or investment preferences and the higher it ranks, so that the news recommendation module pushes such news first wherever possible.
Preferably, for every news item in every news pool, the cosine similarity between that news item and each user is calculated from the news feature vector and the user feature vector.
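The per-pool ordering rules above can be sketched as a small dispatch function. The pool identifiers, the dictionary fields `clicks`, `published` and `vector`, and the function names are hypothetical; the hot-topic pool is left out because it is ordered by sampling rather than by a sort key.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Pools the text names as time-sensitive (identifiers are illustrative).
TIME_SENSITIVE = {"pre_market", "midday_review", "fund_flows"}

def sort_pool(pool_name, news_items, user_vector):
    """Order one pool by the rule the text assigns to it."""
    if pool_name == "most_clicked":
        key = lambda n: n["clicks"]          # most clicks first
    elif pool_name in TIME_SENSITIVE:
        key = lambda n: n["published"]       # newest first
    else:
        key = lambda n: cosine(n["vector"], user_vector)  # most similar first
    return sorted(news_items, key=key, reverse=True)

items = [
    {"id": "a", "clicks": 5, "published": 2, "vector": [1.0, 0.0]},
    {"id": "b", "clicks": 9, "published": 1, "vector": [0.0, 1.0]},
]
by_similarity = [n["id"] for n in sort_pool("macro", items, [1.0, 0.0])]
```

The same two items come out in a different order depending on the pool they sit in, which is how one news database can feed several differently ranked pools.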
Further, the hot probability of each hot topic in the hot-topic news pool is obtained as follows:
calculate the hot probability of each hot topic;
normalize the calculated hot probabilities;
where the hot probability of each hot topic is calculated as:
[Formula: image PCTCN2021080155-appb-000010]
and the hot probabilities are normalized as:
[Formula: image PCTCN2021080155-appb-000011]
where K is the number of hot topics to be sampled, j denotes the hot topic ranked j-th by heat value, [Symbol: image PCTCN2021080155-appb-000012] is the unnormalized sampling probability of hot topic j, power is a calculation parameter, $q(j)$ is the normalized sampling probability of hot topic j, $h_j$ is the heat value of hot topic j, and $h_K$ is the heat value of the K-th ranked hot topic, that is, the lowest heat value among the hot topics.
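The sample-then-order step for hot topics can be sketched as follows. The unnormalized probabilities are taken as given inputs because their formula is not available in this text. Normalizing by dividing by the sum and drawing topics without replacement are assumptions, and the function names are hypothetical.

```python
import random

def normalize_probs(raw):
    """Scale positive scores so that they sum to 1 (assumed normalization)."""
    total = sum(raw)
    return [p / total for p in raw]

def order_topics_by_sampling(topics, raw_probs, rng=random):
    """Draw topics one at a time without replacement, weighted by probability.

    All raw probabilities must be positive. The drawing order becomes the
    display order of each topic's latest news.
    """
    remaining = list(zip(topics, raw_probs))
    ordered = []
    while remaining:
        weights = normalize_probs([p for _, p in remaining])
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        ordered.append(remaining.pop(pick)[0])
    return ordered

topics = ["chips", "new energy", "banks"]
order = order_topics_by_sampling(topics, [4.0, 2.0, 1.0], rng=random.Random(7))
```

Hotter topics tend to come first, but a cooler topic can still lead on some draws, which keeps the pool from showing the same order every time.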
Optionally, in the intelligent financial information recommendation system, the news recommendation module acts as the central brain of the whole system: it combines probability sampling with timing to decide when, to which user, and which news is finally recommended. Further, with continued reference to Figure 2, the sampling probability of each news pool is calculated and used as follows:
calculate the similarity probability between the user to receive recommendations and the news item ranked first in each news pool;
define an additional probability;
calculate the sampling probability of each news pool from the similarity probability and the additional probability;
sample the news pools according to the sampling probabilities;
recommend to the user the news item ranked first in the sampled news pool.
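The last two steps, sampling a pool and recommending its first-ranked item, can be sketched as below. The sampling probabilities are taken as given inputs. Skipping empty pools and removing the recommended item from its pool are assumptions, and `recommend_one` is a hypothetical name.

```python
import random

def recommend_one(pools, sampling_probs, rng=random):
    """Sample one pool by probability and return (pool_name, top_item).

    pools: pool name -> list of news already sorted, best first.
    sampling_probs: pool name -> sampling probability P_i.
    Returns None when every pool is empty.
    """
    candidates = [name for name in pools if pools[name]]
    if not candidates:
        return None
    weights = [sampling_probs[name] for name in candidates]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    # Remove the item so that it is not recommended twice (an assumption).
    return chosen, pools[chosen].pop(0)

pools = {"macro": ["m1", "m2"], "hot_topics": []}
probs = {"macro": 0.3, "hot_topics": 0.7}
result = recommend_one(pools, probs)
```

Calling the function repeatedly yields a feed that interleaves pools in proportion to their probabilities while always taking the best remaining item from the chosen pool.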
Optionally, in the intelligent financial information recommendation system,
the similarity probability is calculated as:
[Formula: image PCTCN2021080155-appb-000013]
where i denotes any news pool, $Padjust_i$ is the similarity probability between the user to receive recommendations and the news item ranked first in that pool, $sim_i$ is the similarity between that user and that news item, $Pinitial_i$ is the preset initial probability of the pool, whose value is set according to the business scenario and the importance of each pool, and C and ε are calculation parameters. The sampling probability of each news pool rises rapidly as the similarity probability of its first-ranked news item increases. In other words, if the similarity probability of the first-ranked item of one pool is clearly greater than that of the first-ranked items of the other pools, that item is sampled preferentially, because a high similarity indicates that the item better matches the user's reading and investment preferences.
Further, news in pools such as the hot-topic news pool and the most-clicked news pool may not be very similar to the user feature vector, yet it still needs to be pushed actively so that the user stays informed of current hot spots or discovers new interests. The present invention therefore defines an additional probability to raise the recommendation rate of such news.
The additional probability is calculated as:
[Formula: image PCTCN2021080155-appb-000014]
where i denotes any news pool, $Padditional_i$ is the normalized additional probability of that pool, and m is the total number of news pools currently available for sampling. Because the additional probability is already normalized, it is not affected by the sampling probabilities of the other news pools, which ensures that such news is pushed actively in all circumstances, with a push strength that depends mainly on this additional probability.
The sampling probability is calculated as:
[Formula: image PCTCN2021080155-appb-000015]
where i denotes any news pool, $P_i$ is the sampling probability of that pool, $Padjust_i$ is the similarity probability between the user to receive recommendations and the news item ranked first in that pool, and $Padditional_i$ is the normalized additional probability of that pool. Once the sampling probability of each news pool has been obtained, the news pools are sampled according to these probabilities, and the news item ranked first in the sampled pool is recommended to the user.
Finally, the news recommendation module recommends news to the user according to the following logic and steps. For time-sensitive news pools (such as pre-market, midday review and next-week outlook), a specific push window is set: for example, the pre-market window runs from 0:00 on a trading day until the market opens, and the midday-review window runs from 11:30 to 13:00 on a trading day. Within these windows such news is pushed to the user with priority; outside them it is not pushed. Then the news pools are sampled according to the sampling probabilities calculated above, and if a pool is sampled, the news item ranked first in that pool is pushed to the user.
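The push-window check for time-sensitive pools can be sketched as below. The 11:30 to 13:00 window and the 0:00 start come from the text; the 09:30 market open, the pool identifiers and the function name `pool_is_active` are assumptions for illustration.

```python
from datetime import time

# Push windows per pool. The 09:30 market open is an assumption; the text
# only says the pre-market window ends when the market opens.
PUSH_WINDOWS = {
    "pre_market": (time(0, 0), time(9, 30)),
    "midday_review": (time(11, 30), time(13, 0)),
}

def pool_is_active(pool_name, now, is_trading_day=True):
    """Return True if the pool may be pushed at local time `now`.

    Pools with a window are eligible only inside it on a trading day;
    pools without a window are always eligible.
    """
    window = PUSH_WINDOWS.get(pool_name)
    if window is None:
        return True
    start, end = window
    return is_trading_day and start <= now < end

active_now = pool_is_active("midday_review", time(12, 0))
```

A recommender loop would call this check first, push from any active time-sensitive pool with priority, and only then fall back to probability sampling over the remaining pools.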
Please refer to FIG. 3, which shows a financial information recommendation display provided by an embodiment of the present invention. As the figure shows, the recommendation system provided by the present invention pushes news that matches the user's interests and preferences (housing prices and real estate) and also pushes a variety of content-rich and timely news (such as current hot events, the latest important macroeconomic data, and breaking financial events).
In summary, the intelligent financial information recommendation system provided by the present invention establishes news pools of various dimensions that match the characteristics of the financial field and the needs of investors, and thereby pushes multi-dimensional, multi-level, and diverse news to users. Using a statistical sampling-probability method, the present invention dynamically performs probability sampling of news across multiple dimensions (multiple news pools). The sampling probability can be determined dynamically by factors such as user similarity (that is, user preference), news popularity, and business logic, so that the system pushes news matching the user's preferences and interests while also pushing accurate, timely, and useful investment information and investment opportunities.
The above are merely preferred embodiments of the present invention and do not limit the present invention in any way. Any equivalent replacement, modification, or other change made by a person skilled in the art to the technical solutions and technical content disclosed herein, without departing from the scope of the technical solutions of the present invention, does not depart from the content of those technical solutions and still falls within the protection scope of the present invention.

Claims (8)

  1. An intelligent financial information recommendation system, characterized by comprising:
    a news feature vector calculation module, configured to calculate a feature vector for each news item;
    a user feature vector calculation module, configured to calculate a feature vector for each user;
    a multi-dimensional news pool creation module, configured to create news pools of multiple dimensions and to rank the news in each news pool;
    a news recommendation module, configured to calculate a sampling probability for each news pool, wherein the sampling probability of each news pool is calculated by: calculating the similarity probability between the user to whom financial information is to be recommended and the first-ranked news item in each news pool, defining an additional probability, and calculating the sampling probability of each news pool from the similarity probability and the additional probability; the news pools are sampled according to the sampling probabilities, and the first-ranked news item in the sampled news pool is recommended to the user;
    wherein the similarity probability is calculated as:
    Figure PCTCN2021080155-appb-100001
    where i denotes any news pool, Padjust_i is the similarity probability between the user to whom financial information is to be recommended and the first-ranked news item in that news pool, sim_i is the similarity between that user and the first-ranked news item in that news pool, Pinitial_i is the preset initial probability of that news pool, and C and ε are calculation parameters;
    the additional probability is calculated as:
    Figure PCTCN2021080155-appb-100002
    where i denotes any news pool, Padditional_i is the normalized additional probability of that news pool, and m is the total number of news pools currently available for sampling;
    the sampling probability is calculated as:
    Figure PCTCN2021080155-appb-100003
    where i denotes any news pool, P_i is the sampling probability of that news pool, Padjust_i is the similarity probability between the user to whom financial information is to be recommended and the first-ranked news item in that news pool, and Padditional_i is the normalized additional probability of that news pool.
  2. The intelligent financial information recommendation system according to claim 1, characterized in that the feature vector of each news item is calculated by:
    extracting all topic words in each news item in the database;
    calculating the weight of each topic word and its dynamic inverse document frequency within one period;
    calculating the feature vector V of each news item, where V is an N-dimensional vector, N is the total number of topic words in the database, and each element of V corresponds to one topic word, wherein the element of the vector corresponding to any topic word in a news item equals the product of that topic word's weight and its dynamic inverse document frequency within one period.
  3. The intelligent financial information recommendation system according to claim 2, characterized in that:
    one period comprises 20 days, 30 days, or 40 days;
    the inverse document frequency is the Inverse Document Frequency, that is, idf.
  4. The intelligent financial information recommendation system according to claim 3, characterized in that, in the user feature vector calculation module, the feature vector of each user is calculated as:
    V_s = Normalize(V_p) - B·Normalize(V_d)·η(||V_d||_2) - E·V_t;
    where V_s is the feature vector of the user, V_p is the feature vector of the news the user has historically read, V_d is the feature vector of the news the user has marked as disliked, V_t is the feature vector of the topic words the user has marked as disliked, ||V_d||_2 is the 2-norm of the feature vector V_d, Normalize(V_p) and Normalize(V_d) are the normalized vectors of V_p and V_d respectively, B and E are calculation parameters, and η is a penalty function on the number of news items.
  5. The intelligent financial information recommendation system according to claim 4, characterized in that:
    V_p, V_d, and V_t are all N-dimensional feature vectors, N is the total number of topic words in the database, and each element of a feature vector corresponds to one topic word;
    any element of V_p equals the weight of the corresponding topic word in the news the user has historically read multiplied by that topic word's dynamic inverse document frequency within one period;
    any element of V_d equals the weight of the corresponding topic word in the news the user has marked as disliked multiplied by that topic word's dynamic inverse document frequency within one period;
    any element of V_t equals the weight of the topic word the user has marked as disliked multiplied by that topic word's dynamic inverse document frequency within one period.
  6. The intelligent financial information recommendation system according to claim 1, characterized in that the dimension types in the multi-dimensional news pool creation module include, but are not limited to: macro, pre-market, midday review, capital inflows and outflows, investment hotspots, hot-topic news, click ranking, watchlist stocks, international politics and finance, fund channel, and Hong Kong stocks and peripheral markets.
  7. The intelligent financial information recommendation system according to claim 6, characterized in that:
    for the hot-topic news pool, the hotspot probability of each hot topic in the pool is calculated first, the hot topics in the pool are sampled according to their hotspot probabilities, and the latest news items corresponding to the sampled hot topics are ranked in order;
    news in the click-ranking news pool is ranked by the number of user clicks;
    news in the highly time-sensitive news pools is ranked in reverse chronological order of publication time, that is, the more recently published a news item is, the higher it ranks; the highly time-sensitive news pools include pre-market, midday review, and capital inflows and outflows;
    news in the remaining news pools is ranked by the similarity between the feature vector of the news item and the feature vector of the user, from highest similarity to lowest.
  8. The intelligent financial information recommendation system according to claim 7, characterized in that the hotspot probability of each hot topic in the hot-topic news pool is calculated by:
    calculating the hotspot probability of each hot topic;
    normalizing the calculated hotspot probabilities of the hot topics;
    wherein the hotspot probability of each hot topic is calculated as:
    Figure PCTCN2021080155-appb-100004
    the hotspot probabilities of the hot topics are normalized as:
    Figure PCTCN2021080155-appb-100005
    where K is the number of hot topics to be sampled, j denotes the hot topic ranked j-th by heat value, the symbol shown in Figure PCTCN2021080155-appb-100006 is the unnormalized sampling probability of hot topic j, power is a calculation parameter, q(j) is the normalized sampling probability of hot topic j, h_j is the heat value of hot topic j, and h_K is the heat value of the K-th ranked hot topic, that is, the heat value of the hot topic with the lowest heat value among the hot topics.
PCT/CN2021/080155 2020-12-15 2021-03-11 Intelligent financial information recommendation system WO2022126873A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011469913.XA CN112231593B (en) 2020-12-15 2020-12-15 Financial information intelligent recommendation system
CN202011469913.X 2020-12-15

Publications (1)

Publication Number Publication Date
WO2022126873A1 true WO2022126873A1 (en) 2022-06-23

Family

ID=74123585

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/080155 WO2022126873A1 (en) 2020-12-15 2021-03-11 Intelligent financial information recommendation system

Country Status (2)

Country Link
CN (1) CN112231593B (en)
WO (1) WO2022126873A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116304128A (en) * 2023-03-01 2023-06-23 广西泛华于成信息科技有限公司 Multimedia information recommendation system based on big data
CN116932920A (en) * 2023-09-18 2023-10-24 青岛理工大学 Accurate healthy science popularization data recommendation method based on big data

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112231593B (en) * 2020-12-15 2021-03-12 上海朝阳永续信息技术股份有限公司 Financial information intelligent recommendation system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831234A (en) * 2012-08-31 2012-12-19 北京邮电大学 Personalized news recommendation device and method based on news content and theme feature
CN104166668A (en) * 2014-06-09 2014-11-26 南京邮电大学 News recommendation system and method based on FOLFM model
US20160055541A1 (en) * 2014-08-21 2016-02-25 Everyday Health Inc. Personalized recommendation system and methods using automatic identification of user preferences
CN107025310A (en) * 2017-05-17 2017-08-08 长春嘉诚信息技术股份有限公司 A kind of automatic news in real time recommends method
CN112231593A (en) * 2020-12-15 2021-01-15 上海朝阳永续信息技术股份有限公司 Financial information intelligent recommendation system

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102929928B (en) * 2012-09-21 2015-04-22 北京格致璞科技有限公司 Multidimensional-similarity-based personalized news recommendation method
CN103744918A (en) * 2013-12-27 2014-04-23 东软集团股份有限公司 Vertical domain based micro blog searching ranking method and system
CN105224699B (en) * 2015-11-17 2020-01-03 Tcl集团股份有限公司 News recommendation method and device
AU2017272253B2 (en) * 2016-12-07 2019-08-01 Tata Consultancy Services Limited System and method for context and sequence aware recommendation
CN107885886A (en) * 2017-12-07 2018-04-06 百度在线网络技术(北京)有限公司 To the method, apparatus and server of information recommendation sort result
CN108334575B (en) * 2018-01-23 2022-04-26 北京三快在线科技有限公司 Recommendation result sorting correction method and device and electronic equipment
CN111382349B (en) * 2018-12-29 2023-08-15 广州市百果园网络科技有限公司 Information recommendation method, device, computer equipment and storage medium
CN110377828B (en) * 2019-07-22 2023-05-26 腾讯科技(深圳)有限公司 Information recommendation method, device, server and storage medium
CN111368203A (en) * 2020-03-09 2020-07-03 电子科技大学 News recommendation method and system based on graph neural network
CN111428133A (en) * 2020-03-19 2020-07-17 腾讯科技(北京)有限公司 Artificial intelligence based recommendation method and device, electronic equipment and storage medium
CN111858915A (en) * 2020-08-07 2020-10-30 成都理工大学 Information recommendation method and system based on label similarity

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116304128A (en) * 2023-03-01 2023-06-23 广西泛华于成信息科技有限公司 Multimedia information recommendation system based on big data
CN116304128B (en) * 2023-03-01 2023-12-15 微众梦想科技(北京)有限公司 Multimedia information recommendation system based on big data
CN116932920A (en) * 2023-09-18 2023-10-24 青岛理工大学 Accurate healthy science popularization data recommendation method based on big data
CN116932920B (en) * 2023-09-18 2023-12-12 青岛理工大学 Accurate healthy science popularization data recommendation method based on big data

Also Published As

Publication number Publication date
CN112231593A (en) 2021-01-15
CN112231593B (en) 2021-03-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21904813

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21904813

Country of ref document: EP

Kind code of ref document: A1