CN109325175A - News push method, apparatus and device incorporating microblog interest mining - Google Patents

News push method, apparatus and device incorporating microblog interest mining

Info

Publication number
CN109325175A
Authority
CN
China
Prior art keywords
lexical item
text
news
item set
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810966477.3A
Other languages
Chinese (zh)
Inventor
张帅
陈靖宇
陈平华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Abstract

The invention discloses a news push method incorporating microblog interest mining. The method determines a news lexical item set from a news text, determines a user interest lexical item set from a microblog status text, and decides whether to push the news text to a microblog user by calculating the similarity between the news lexical item set and the user interest lexical item set. The method exploits the rich user interest information available in microblogs, a mainstream social application. Because the microblog status text includes at least user dynamic text, user profile text and social information text, the body of text from which user interests are extracted is enlarged, which alleviates the cold-start problem. Moreover, the microblog status text reflects the interests of the microblog user personally, so the method is better able to meet individual needs. The invention also provides a news push apparatus, a device and a computer-readable storage medium incorporating microblog interest mining, whose effects correspond to those of the method.

Description

News push method, apparatus and device incorporating microblog interest mining
Technical field
The present invention relates to the field of interest mining, and in particular to a news push method, apparatus, device and computer-readable storage medium incorporating microblog interest mining.
Background
With the development of the Internet, the information explosion has made news recommendation a hot research topic in recommender systems. Traditional news recommendation mainly comprises the following three methods:
Content-based news recommendation. The main idea is to recommend news to a user according to the user's own browsing history. Its drawbacks are that it is limited by the ability to extract news features, tends toward over-specialization, lacks diversity in its recommendations, and has difficulty discovering the user's latent interests. It also requires a certain amount of browsing history: when the user's own browsing history is insufficient, the method struggles to recommend news the user is interested in.
Collaborative-filtering news recommendation. The main idea is to recommend news to a user according to the preferences of people similar to the user. Its drawback is that it has difficulty meeting the user's individual needs.
Knowledge-based (semantic) news recommendation. The main idea is to recommend news to a user according to expert opinion. Its drawback is that it cannot meet the user's individual needs; in addition, its recommendation quality depends heavily on the knowledge base, and domain-specific knowledge and inference rules are usually hard to obtain.
It can be seen that traditional news recommendation methods suffer from the cold-start problem and cannot meet users' individual needs.
Summary of the invention
The object of the present invention is to provide a news push method, apparatus, device and computer-readable storage medium incorporating microblog interest mining, so as to solve the cold-start problem of traditional news recommendation methods and their inability to meet users' individual needs.
To solve the above technical problems, the present invention provides a news push method incorporating microblog interest mining, comprising:
obtaining a plurality of news texts from a news platform, and determining a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
obtaining a microblog status text of a microblog user, and determining a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user profile text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item;
calculating the similarity between the news lexical item set and the user interest lexical item set;
judging whether the similarity is greater than a preset threshold;
if so, pushing the news text to the microblog user.
Wherein obtaining a plurality of news texts from a news platform and determining a news lexical item set corresponding to each news text comprises:
obtaining a plurality of news texts from the news platform, and classifying the news texts to obtain a plurality of news text sets;
determining the news lexical item set corresponding to each news text in each news text set;
and calculating the similarity between the news lexical item set and the user interest lexical item set comprises:
selecting, from the news text sets and according to the user interest lexical item set, the news text set that the microblog user is interested in;
traversing each news text in that news text set, and calculating the similarity between the news lexical item set corresponding to the news text and the user interest lexical item set.
Wherein the news text sets are obtained by classification using the textCNN classification technique.
Wherein the word frequency of a news lexical item is calculated as:
$$W_{ij} = \frac{\mathrm{num}(t_{ij})}{\mathrm{total}(D_i)} \cdot \log\frac{N}{n_{ij}} \cdot D(i)$$
where $\mathrm{num}(t_{ij})$ is the number of times news lexical item i occurs in news text j, $\mathrm{total}(D_i)$ is the total number of lexical items in news text j, $N$ is the total number of news texts in the news text set that contains news text j, $n_{ij}$ is the number of news texts in that set that contain news lexical item i, and $D(i)$ is the timeliness parameter.
Wherein obtaining the microblog status text of the microblog user and determining the user interest lexical item set corresponding to the microblog status text comprises:
obtaining the user dynamic text, user profile text and social information text of the microblog user;
determining a user dynamic lexical item set I1 corresponding to the user dynamic text;
determining a user profile lexical item set I2 corresponding to the user profile text;
determining a social information lexical item set I3 corresponding to the social information text;
determining the user interest lexical item set Y = U*I1 + V*I2 + W*I3, where U is the preset weight of the user dynamic lexical item set I1, V is the preset weight of the user profile lexical item set I2, W is the preset weight of the social information lexical item set I3, and U + V + W = 1.
Wherein the similarity between the news lexical item set and the user interest lexical item set is calculated as:
EXP = U*sim(I1, L) + V*sim(I2, L) + W*sim(I3, L), where L is the news lexical item set and sim() is a similarity function that determines the similarity between two lexical item sets from their lexical items and word frequencies.
Wherein the user dynamic text includes original content text, forwarded content text and comment content text;
and determining the user dynamic lexical item set I1 corresponding to the user dynamic text comprises:
determining a first lexical item set I11 corresponding to the original content text;
determining a second lexical item set I12 corresponding to the forwarded content text;
determining a third lexical item set I13 corresponding to the comment content text;
determining the user dynamic lexical item set corresponding to the user dynamic text as I1 = A*I11 + B*I12 + C*I13, where A is the preset weight of the original content text, B is the preset weight of the forwarded content text, and C is the preset weight of the comment content text.
Correspondingly, the present invention also provides a news push apparatus incorporating microblog interest mining, comprising:
a news lexical item set determining module, configured to obtain a plurality of news texts from a news platform and determine a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
a user interest lexical item set determining module, configured to obtain a microblog status text of a microblog user and determine a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user profile text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item;
a similarity calculation module, configured to calculate the similarity between the news lexical item set and the user interest lexical item set;
a similarity judgment module, configured to judge whether the similarity is greater than a preset threshold;
a news push module, configured to push the news text to the microblog user when the similarity is greater than the preset threshold.
In addition, the present invention also provides a news push device incorporating microblog interest mining, comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the steps of the news push method incorporating microblog interest mining as described above.
Finally, the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the news push method incorporating microblog interest mining as described above are implemented.
The news push method incorporating microblog interest mining provided by the present invention obtains a plurality of news texts from a news platform and determines the corresponding news lexical item sets; it also obtains the microblog status text of a microblog user and determines the corresponding user interest lexical item set; finally, it decides whether to push a news text to the microblog user by calculating the similarity between the news lexical item set and the user interest lexical item set. The method thus exploits the rich user interest information available in microblogs, a mainstream social application: it determines the user interest lexical item set from the microblog status text and recommends news to the user accordingly. Because the microblog status text includes at least three kinds of content (user dynamic text, user profile text and social information text), the body of text from which user interests are extracted is enlarged to some extent, which alleviates the cold-start problem. Furthermore, because the microblog status text reflects the interests of the microblog user personally, the method is better able to meet the user's individual needs.
In addition, the present invention also provides a news push apparatus, a device and a computer-readable storage medium incorporating microblog interest mining, whose effects correspond to those of the above method and are not repeated here.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an embodiment of the news push method incorporating microblog interest mining provided by the present invention;
Fig. 2 is a schematic diagram of the composition of the microblog status text provided by the present invention;
Fig. 3 is an overall implementation block diagram of an embodiment of the news push method incorporating microblog interest mining provided by the present invention;
Fig. 4 is a structural block diagram of an embodiment of the news push apparatus incorporating microblog interest mining provided by the present invention.
Detailed description of the embodiments
The core of the present invention is to provide a news recommendation method, apparatus, device and computer-readable storage medium incorporating microblog interest mining, which alleviate the cold-start problem and are better able to meet users' individual needs.
To enable those skilled in the art to better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Traditional news recommendation methods that use users' social media build a user's interest set by extracting important keywords from the content the user posts, the user's comments and similar information, and thereby provide a personalized news recommendation service. However, when combining social media with traditional news media for news recommendation, these methods do not take the user's social relationship network into account.
Microblog, as one of the mainstream social media, contains rich user interest information. It records not only users' historical data such as text, pictures and videos, but also the social relationships between users, such as those with followed accounts and fans.
Based on these characteristics of social media, the present invention therefore proposes a news push method, apparatus, device and computer-readable storage medium incorporating microblog interest mining, which realize news recommendation incorporating microblog interest mining and address problems such as cold start and the inability to meet users' individual needs. The cold-start problem is the problem of how to design a personalized recommender system without a large amount of user data, such that users are satisfied with the recommendation results and willing to use the system. Experiments show that the recommendation method achieves high recommendation efficiency and good results in use.
An embodiment of the news recommendation method incorporating microblog interest mining provided by the present invention is introduced below. Referring to Fig. 1, the method embodiment specifically includes:
Step S101: obtain a plurality of news texts from a news platform, and determine a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item.
The news texts obtained from the news platform may be acquired by crawling and may be the news texts published within a preset period before the current time; their number may be in the tens of thousands or more.
In this embodiment, for ease of description, text can be represented with the classical lexical item-weight representation of the VSM (vector space model): every microblog status text or news text can be represented by a {K, W} pair, where K is the set of lexical items (obtained by segmenting the sentences of the text) and W is the word frequency (for example, the number of occurrences) of each corresponding lexical item.
In this embodiment, both news texts and microblog status texts almost always need to be segmented when they are converted into the corresponding lexical item sets. Because the implementation of word segmentation is not the key point of the present invention, it is not described in detail in this embodiment.
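As an illustration of the {K, W} representation, the following minimal sketch builds K and W from a text that has already been segmented. The function name `to_vsm` and the choice of raw occurrence counts for W are assumptions made for this example; the patent does not prescribe a segmenter or a data structure.

```python
from collections import Counter


def to_vsm(tokens):
    """Return the {K, W} pair for one already-segmented text.

    K is the sorted list of distinct lexical items and W holds the
    occurrence count of each item, in the same order as K.
    """
    counts = Counter(tokens)
    K = sorted(counts)
    W = [counts[k] for k in K]
    return K, W


# Hypothetical segmented text; a real system would run a word segmenter first.
K, W = to_vsm(["news", "push", "news", "microblog"])
```

Sorting K is only a convenience that keeps the pairing between K and W deterministic; a dictionary from lexical item to weight would serve equally well.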
Word-frequency calculation methods commonly used with the VSM include frequency statistics and the TF-IDF algorithm. This embodiment may use the TF-IDF algorithm, with the following formula:
$$W_{ij} = \frac{\mathrm{num}(t_{ij})}{\mathrm{total}(D_i)} \cdot \log\frac{N}{n_{ij}} \tag{1}$$
where $\mathrm{num}(t_{ij})$ is the number of times news lexical item i occurs in news text j, $\mathrm{total}(D_i)$ is the total number of lexical items in news text j, $N$ is the total number of news texts in the news text set that contains news text j, and $n_{ij}$ is the number of news texts in that set that contain news lexical item i.
In this embodiment, because news and microblogs are highly time-sensitive, a timeliness parameter can be introduced into the word-frequency formula. The timeliness parameter may be computed as:
$$D(i) = -\frac{(t_p - t_n)^2}{t_B} + 1 \tag{2}$$
where $t_p$ is the current time, $t_n$ is the news time, and $t_B$ is the reference timestamp.
The TF-IDF formula with the timeliness parameter introduced may then be:
$$W_{ij} = \frac{\mathrm{num}(t_{ij})}{\mathrm{total}(D_i)} \cdot \log\frac{N}{n_{ij}} \cdot D(i) \tag{3}$$
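Formulas (1) to (3) can be sketched in a few lines of Python. The sketch reads formula (2) as D = -(tp - tn)^2 / tB + 1, which is an interpretation of the published text, and it assumes that all times are expressed in the same unit and that the natural logarithm is intended. The function names and sample values are hypothetical.

```python
import math


def timeliness(t_p, t_n, t_b):
    """Formula (2), read as D = -(t_p - t_n)^2 / t_B + 1 (an assumed reading)."""
    return -((t_p - t_n) ** 2) / t_b + 1


def news_weights(doc_tokens, corpus, t_p, t_n, t_b):
    """Formula (3): TF-IDF weight of every lexical item in one news text.

    doc_tokens is the segmented news text; corpus is the list of all
    segmented news texts in the same news text set (including doc_tokens).
    """
    total = len(doc_tokens)
    n_docs = len(corpus)
    d = timeliness(t_p, t_n, t_b)
    result = {}
    for term in set(doc_tokens):
        tf = doc_tokens.count(term) / total
        n_with_term = sum(1 for doc in corpus if term in doc)
        result[term] = tf * math.log(n_docs / n_with_term) * d
    return result


# Hypothetical corpus of three segmented news texts.
corpus = [["a", "b"], ["a", "c"], ["a", "b", "b"]]
w = news_weights(corpus[2], corpus, t_p=10, t_n=8, t_b=16)
```

Under this reading the timeliness factor equals 1 for a news text published at the current time and decreases as the text ages, so older texts contribute smaller weights. A lexical item that occurs in every text, such as "a" here, receives weight 0 because log(N/n) is 0.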
Step S102: obtain the microblog status text of a microblog user, and determine a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user profile text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item.
In this embodiment, the interests of a microblog user are mined from the user's microblog status text. To ensure that there is enough text from which to mine user interests, the microblog status text includes, as shown in Fig. 2, at least three parts: user dynamic text, user profile text and social information text.
Accordingly, when determining the user interest lexical item set from the microblog status text, the user dynamic lexical item set, the user profile lexical item set and the social information lexical item set can first be determined from the user dynamic text, the user profile text and the social information text respectively, and the final user interest lexical item set can then be determined from these three sets. Because the user dynamic text, the user profile text and the social information text reflect the microblog user's interests to different degrees, a suitable weight can be assigned to each corresponding lexical item set, so that the final user interest lexical item set reflects the user's interests as fully as possible.
That is, obtaining the microblog status text of the microblog user and determining the user interest lexical item set corresponding to the microblog status text may specifically include the following steps:
Step S1021: obtain the user dynamic text, user profile text and social information text of the microblog user.
Step S1022: determine a user dynamic lexical item set I1 corresponding to the user dynamic text.
Step S1023: determine a user profile lexical item set I2 corresponding to the user profile text.
Step S1024: determine a social information lexical item set I3 corresponding to the social information text.
Step S1025: determine the user interest lexical item set:
Y = U*I1 + V*I2 + W*I3 (4)
where U is the preset weight of the user dynamic lexical item set I1, V is the preset weight of the user profile lexical item set I2, W is the preset weight of the social information lexical item set I3, and U + V + W = 1.
Specifically, as shown in Fig. 2, the user dynamic text may include the microblog user's original content text, forwarded content text and comment text; the user profile text may include user introduction text, user occupation text and user interest tag text; and the social information text may include information text about the accounts the microblog user follows and information text about the user's fans.
Similarly to the above process, because different texts reflect the user's interests to different degrees, a suitable weight can be assigned to each text. For example, where the user dynamic text includes original content text, forwarded content text and comment content text, the above step S1021, that is, the process of determining the user dynamic lexical item set I1 corresponding to the user dynamic text, may specifically include the following steps:
Step S10211: determine a first lexical item set I11 corresponding to the original content text.
Step S10212: determine a second lexical item set I12 corresponding to the forwarded content text.
Step S10213: determine a third lexical item set I13 corresponding to the comment text.
Step S10214: determine the user dynamic lexical item set corresponding to the user dynamic text:
I1 = A*I11 + B*I12 + C*I13 (5)
where A is the preset weight of the original content text, B is the preset weight of the forwarded content text, and C is the preset weight of the comment text.
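The two weighted combinations, formula (5) for I1 and formula (4) for Y, can be sketched as follows. The sketch assumes that each lexical item set is stored as a mapping from lexical item to word frequency, that multiplying a set by a weight scales every word frequency, and that adding sets sums the word frequencies of identical lexical items. This interpretation, the function name `weighted_merge`, and all weights and sample values are assumptions for illustration; the description requires U + V + W = 1 but states no such constraint for A, B and C.

```python
def weighted_merge(weighted_sets):
    """Combine lexical item sets as sum(weight * set), item by item.

    weighted_sets is a list of (weight, {lexical item: word frequency}) pairs.
    """
    merged = {}
    for weight, term_weights in weighted_sets:
        for term, w in term_weights.items():
            merged[term] = merged.get(term, 0.0) + weight * w
    return merged


# Hypothetical preset weights.
A, B, C = 0.5, 0.3, 0.2
U, V, W = 0.5, 0.25, 0.25

# Hypothetical lexical item sets for original, forwarded and comment text.
I11 = {"football": 2.0}
I12 = {"football": 1.0, "film": 1.0}
I13 = {"film": 2.0}
I1 = weighted_merge([(A, I11), (B, I12), (C, I13)])  # formula (5)

I2 = {"engineer": 1.0}   # user profile lexical item set
I3 = {"football": 4.0}   # social information lexical item set
Y = weighted_merge([(U, I1), (V, I2), (W, I3)])      # formula (4)
```

Keeping the merge generic means that the same function serves both levels of the hierarchy, so further text sources could be added by extending the list of weighted sets.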
Step S103: calculate the similarity between the news lexical item set and the user interest lexical item set.
Specifically, the similarity between the news lexical item set and the user interest lexical item set may be calculated as:
EXP = U*sim(I1, L) + V*sim(I2, L) + W*sim(I3, L) (6)
where L is the news lexical item set and sim() is a similarity function that determines the similarity between two lexical item sets from their lexical items and word frequencies.
Step S104: judge whether the similarity is greater than a preset threshold.
Step S105: if so, push the news text to the microblog user.
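Steps S103 to S105 can be sketched as below. The description does not specify sim(), so cosine similarity over word frequencies is used here purely as an assumed stand-in; the function names, weights, sample sets and threshold are likewise hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two {lexical item: word frequency} mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exp_score(I1, I2, I3, L, U, V, W, sim=cosine):
    """Formula (6): EXP = U*sim(I1, L) + V*sim(I2, L) + W*sim(I3, L)."""
    return U * sim(I1, L) + V * sim(I2, L) + W * sim(I3, L)


def should_push(score, threshold):
    """Steps S104 and S105: push only when the similarity exceeds the threshold."""
    return score > threshold


# Hypothetical lexical item sets and weights.
I1 = {"football": 1.0}
I2 = {"film": 1.0}
I3 = {"football": 1.0, "film": 1.0}
L = {"football": 1.0}
score = exp_score(I1, I2, I3, L, U=0.5, V=0.25, W=0.25)
push = should_push(score, threshold=0.6)
```

Passing `sim` as a parameter keeps the weighted combination of formula (6) separate from the choice of similarity measure, which the description leaves open.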
It should be noted that this embodiment does not limit the application, web page or platform to which the news text is pushed. Specifically, it may be pushed to the microblog application, or to another application or web page associated with the microblog.
On this basis, as a preferred implementation, the news crawled from the news platform can be classified in advance. Then, when judging whether a news text is of interest to the microblog user, it is first determined which categories of news text the user may be interested in, so that only the news texts of one or a few categories need to be traversed rather than all news texts, which greatly improves recommendation efficiency.
The classification technique used in the classification module is as shown in Fig. 2: news is classified using the textCNN classification technique. The microblog interest mining module is as shown in Fig. 3.
That is, step S101, the process of obtaining a plurality of news texts from the news platform and determining the news lexical item set corresponding to each news text, may specifically include:
Step S1011: obtain a plurality of news texts from the news platform, and classify the news texts to obtain a plurality of news text sets.
Specifically, the news text sets are obtained by classification using the textCNN classification technique. The process of classifying news texts with textCNN mainly comprises four parts: determining the article matrix, the convolutional layer, the pooling layer and the output layer. Because this implementation is not the focus of the present invention, it is not described further here.
Step S1012: determine the news lexical item set corresponding to each news text in each news text set.
Correspondingly, step S103, the process of calculating the similarity between the news lexical item set and the user interest lexical item set, may specifically include:
Step S1031: according to the user interest lexical item set, select from the news text sets the news text set that the microblog user is interested in.
Step S1032: traverse each news text in that news text set, and calculate the similarity between the news lexical item set corresponding to the news text and the user interest lexical item set.
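The two-stage selection of steps S1031 and S1032 can be sketched as follows. The description does not state how a news text set is judged to be of interest, so this sketch assumes a simple criterion: a category is kept when the union of its lexical items shares at least `min_overlap` items with the user's interest lexical items. The criterion, the function name `candidate_texts` and the sample data are all hypothetical.

```python
def candidate_texts(categories, user_terms, min_overlap=1):
    """Yield (category, news text) pairs from categories the user may like.

    categories maps a category name to a list of news texts, each a dict
    with an "id" and a set of lexical items under "terms".
    """
    for name, texts in categories.items():
        # Union of all lexical items appearing in this category.
        category_terms = set().union(*(t["terms"] for t in texts))
        if len(category_terms & user_terms) >= min_overlap:
            yield from ((name, t) for t in texts)


# Hypothetical classified news texts and user interest lexical items.
categories = {
    "sports": [
        {"id": 1, "terms": {"football", "league"}},
        {"id": 2, "terms": {"tennis"}},
    ],
    "finance": [{"id": 3, "terms": {"stock"}}],
}
user_terms = {"football", "film"}
selected_ids = [t["id"] for _, t in candidate_texts(categories, user_terms)]
```

Only the texts yielded here would then go through the per-text similarity calculation of step S1032, which is where the saving over traversing every news text comes from.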
The overall implementation process of this embodiment is therefore as shown in Fig. 3. First, a large number of news texts are classified to obtain a plurality of news text sets, and the news lexical item set corresponding to each news text in each news text set is determined. At the same time, the user dynamic lexical item set, the user profile lexical item set and the social information lexical item set are determined from the user dynamic text, the user profile text and the social information text respectively. Finally, the news text sets that the user may be interested in are selected, and the news texts that the user may be interested in are further selected from those sets.
In summary, the news push method incorporating microblog interest mining provided by this embodiment obtains a plurality of news texts from a news platform and determines the corresponding news lexical item sets; it also obtains the microblog status text of a microblog user and determines the corresponding user interest lexical item set; finally, it decides whether to push a news text to the microblog user by calculating the similarity between the news lexical item set and the user interest lexical item set. The method thus exploits the rich user interest information available in microblogs, a mainstream social application: it determines the user interest lexical item set from the microblog status text and recommends news to the user accordingly. Because the microblog status text includes at least three kinds of content (user dynamic text, user profile text and social information text), the body of text from which user interests are extracted is enlarged to some extent, which alleviates the cold-start problem. Furthermore, because the microblog status text reflects the interests of the microblog user personally, the method is better able to meet the user's individual needs.
An embodiment of the news push apparatus incorporating microblog interest mining provided by the embodiments of the present invention is introduced below. The news push apparatus described below and the news push method described above may be referred to in correspondence with each other.
Referring to Fig. 4, the apparatus specifically includes:
News lexical item set determining module 401: configured to obtain a plurality of news texts from a news platform and determine a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item.
User interest lexical item set determining module 402: configured to obtain a microblog status text of a microblog user and determine a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user profile text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item.
Similarity calculation module 403: configured to calculate the similarity between the news lexical item set and the user interest lexical item set.
Similarity judgment module 404: configured to judge whether the similarity is greater than a preset threshold.
News push module 405: configured to push the news text to the microblog user when the similarity is greater than the preset threshold.
The news push apparatus incorporating microblog interest mining of this embodiment is used to implement the news push method incorporating microblog interest mining described above, so for specific implementations of the apparatus, see the method embodiments above. For example, the news lexical item set determining module 401, the user interest lexical item set determining module 402, the similarity calculation module 403, the similarity judgment module 404 and the news push module 405 are used to implement steps S101, S102, S103, S104 and S105 of the method respectively. Their specific implementations can therefore be found in the descriptions of the corresponding embodiments and are not repeated here.
In addition, because the news push apparatus of this embodiment is used to implement the news push method described above, its effects correspond to those of the method and are not repeated here.
In addition, the present invention also provides a news push equipment integrating microblog interest mining, comprising:
Memory: configured to store a computer program;
Processor: configured to execute the computer program to implement the steps of the news push method integrating microblog interest mining as described above.
Finally, the present invention also provides a computer readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the news push method integrating microblog interest mining as described above.
Because the news push equipment and the computer readable storage medium of this embodiment are used to implement the news push method integrating microblog interest mining described above, their specific implementation may refer to the description of the method embodiments and is not repeated here. Their effects likewise correspond to those of the method and are not repeated here.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, it is described relatively briefly, and the relevant details can be found in the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may implement the described functions differently for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The news push method, device, equipment, and computer readable storage medium integrating microblog interest mining provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the present invention without departing from its principle, and such improvements and modifications also fall within the scope of protection of the claims of the present invention.

Claims (10)

1. A news push method integrating microblog interest mining, characterized by comprising:
obtaining a plurality of news texts from a news platform, and determining a news lexical item set corresponding to the news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
obtaining the microblog status text of a microblog user, and determining a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user profile text, and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item;
calculating the similarity between the news lexical item set and the user interest lexical item set;
judging whether the similarity is greater than a preset threshold;
if so, pushing the news text to the microblog user.
2. The method of claim 1, characterized in that obtaining a plurality of news texts from a news platform and determining a news lexical item set corresponding to the news text comprises:
obtaining a plurality of news texts from the news platform, and classifying the news texts to obtain a plurality of news text sets;
determining the news lexical item set corresponding to each news text in the news text sets;
and calculating the similarity between the news lexical item set and the user interest lexical item set comprises:
filtering out, from the news text sets and according to the user interest lexical item set, the news text set in which the microblog user is interested;
traversing each news text in that news text set, and calculating the similarity between the news lexical item set corresponding to the news text and the user interest lexical item set.
3. The method of claim 2, characterized in that the news text sets are obtained by classification using the textCNN classification technique.
4. The method of claim 2, characterized in that the word frequency of the news lexical item is calculated as:
$$W_{ij} = \frac{\mathrm{num}(t_{ij})}{\mathrm{total}(D_i)} \cdot \log\frac{N}{n_{ij}} \cdot D(i)$$ where $\mathrm{num}(t_{ij})$ is the number of times news lexical item i occurs in news text j, $\mathrm{total}(D_i)$ is the total number of lexical items in news text j, $N$ is the total number of news texts in the news text set that contains news text j, $n_{ij}$ is the number of news texts in that news text set that contain news lexical item i, and $D(i)$ is the timeliness parameter.
5. The method of claim 1, characterized in that obtaining the microblog status text of a microblog user and determining a user interest lexical item set corresponding to the microblog status text comprises:
obtaining the user dynamic text, user profile text, and social information text of the microblog user;
determining the user dynamic lexical item set $I_1$ corresponding to the user dynamic text;
determining the user profile lexical item set $I_2$ corresponding to the user profile text;
determining the social information lexical item set $I_3$ corresponding to the social information text;
determining the user interest lexical item set $Y = U \cdot I_1 + V \cdot I_2 + W \cdot I_3$, where $U$ is the preset weight of the user dynamic lexical item set $I_1$, $V$ is the preset weight of the user profile lexical item set $I_2$, $W$ is the preset weight of the social information lexical item set $I_3$, and $U + V + W = 1$.
6. The method of claim 5, characterized in that the similarity between the news lexical item set and the user interest lexical item set is calculated as:
$\mathrm{EXP} = U \cdot \mathrm{sim}(I_1, L) + V \cdot \mathrm{sim}(I_2, L) + W \cdot \mathrm{sim}(I_3, L)$, where $L$ is the news lexical item set and $\mathrm{sim}(\cdot)$ is the similarity calculation formula, which determines the similarity between two lexical item sets according to their lexical items and word frequencies.
7. The method of claim 5, characterized in that the user dynamic text includes original content text, forwarded content text, and comment content text;
and determining the user dynamic lexical item set $I_1$ corresponding to the user dynamic text comprises:
determining the first lexical item set $I_{11}$ corresponding to the original content text;
determining the second lexical item set $I_{12}$ corresponding to the forwarded content text;
determining the third lexical item set $I_{13}$ corresponding to the comment content text;
determining the user dynamic lexical item set $I_1 = A \cdot I_{11} + B \cdot I_{12} + C \cdot I_{13}$ corresponding to the user dynamic text, where $A$ is the preset weight of the original content text, $B$ is the preset weight of the forwarded content text, and $C$ is the preset weight of the comment content text.
8. A news push device integrating microblog interest mining, characterized by comprising:
a news lexical item set determining module, configured to obtain a plurality of news texts from a news platform and determine a news lexical item set corresponding to the news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
a user interest lexical item set determining module, configured to obtain the microblog status text of a microblog user and determine a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user profile text, and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item;
a similarity calculation module, configured to calculate the similarity between the news lexical item set and the user interest lexical item set;
a similarity judgment module, configured to judge whether the similarity is greater than a preset threshold;
a news push module, configured to push the news text to the microblog user when the similarity is greater than the preset threshold.
9. A news push equipment integrating microblog interest mining, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the steps of the news push method integrating microblog interest mining according to any one of claims 1-7.
10. A computer readable storage medium, characterized in that a computer program is stored on the computer readable storage medium, and the computer program, when executed by a processor, implements the steps of the news push method integrating microblog interest mining according to any one of claims 1-7.
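For readers who want to see the two formulas in the claims worked through, the following minimal sketch computes the timeliness-weighted word frequency of claim 4 and the weighted similarity EXP of claim 6. It rests on stated assumptions: the claims do not say how the timeliness parameter D(i) is derived, so it is passed in as a plain number; sim() is left open by the claims, so a histogram-intersection stand-in is used; and all function names, default weights, and sample data are hypothetical.

```python
import math
from collections import Counter


def news_word_weights(j, news_text_set, timeliness=1.0):
    """Word frequency of each lexical item in news text j of a news text set.

    Follows W = [num / total] * log(N / n) * D, where `timeliness` is the
    caller-supplied value of the timeliness parameter D.
    """
    tokens = news_text_set[j]
    total = len(tokens)              # lexical items in news text j
    n_texts = len(news_text_set)     # N: news texts in the set
    weights = {}
    for term, num in Counter(tokens).items():
        # n: news texts containing the term (at least 1, since text j does)
        containing = sum(1 for text in news_text_set if term in text)
        weights[term] = (num / total) * math.log(n_texts / containing) * timeliness
    return weights


def overlap_sim(a, b):
    """Stand-in for sim(): histogram intersection of two lexical item sets."""
    return sum(min(a[t], b[t]) for t in set(a) & set(b))


def fused_similarity(i1, i2, i3, news, sim=overlap_sim, u=0.5, v=0.3, w=0.2):
    """EXP = U*sim(I1, L) + V*sim(I2, L) + W*sim(I3, L)."""
    return u * sim(i1, news) + v * sim(i2, news) + w * sim(i3, news)


news_text_set = [
    ["football", "match", "football"],
    ["stock", "market"],
    ["football", "news"],
]
weights = news_word_weights(0, news_text_set)
```

Passing the index j rather than a free-standing token list guarantees that every counted lexical item occurs in at least one news text of the set, so the logarithm's denominator is never zero.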
CN201810966477.3A 2018-08-23 2018-08-23 Merge the news push method, device and equipment of microblogging interest digging Pending CN109325175A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810966477.3A CN109325175A (en) 2018-08-23 2018-08-23 Merge the news push method, device and equipment of microblogging interest digging

Publications (1)

Publication Number Publication Date
CN109325175A true CN109325175A (en) 2019-02-12

Family

ID=65264459

Country Status (1)

Country Link
CN (1) CN109325175A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104572797A (en) * 2014-05-12 2015-04-29 深圳市智搜信息技术有限公司 Individual service recommendation system and method based on topic model
CN105868267A (en) * 2016-03-04 2016-08-17 江苏工程职业技术学院 Modeling method for mobile social network user interests
CN107025310A (en) * 2017-05-17 2017-08-08 长春嘉诚信息技术股份有限公司 A kind of automatic news in real time recommends method
CN107766576A (en) * 2017-11-15 2018-03-06 北京航空航天大学 A kind of extracting method of microblog users interest characteristics

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112182351A (en) * 2020-09-28 2021-01-05 哈尔滨工业大学(深圳) News recommendation method and device based on multi-feature fusion
CN112749341A (en) * 2021-01-22 2021-05-04 南京莱斯网信技术研究院有限公司 Key public opinion recommendation method, readable storage medium and data processing device
CN112749341B (en) * 2021-01-22 2024-03-29 南京莱斯网信技术研究院有限公司 Important public opinion recommendation method, readable storage medium and data processing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190212
