CN109325175A - News push method, device and equipment incorporating microblog interest mining - Google Patents
- Publication number
- CN109325175A CN201810966477.3A
- Authority
- CN
- China
- Prior art keywords
- lexical item
- text
- news
- item set
- user
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a news push method incorporating microblog interest mining. The method determines a news lexical item set from a news text and a user interest lexical item set from a microblog status text, calculates the similarity between the two sets, and decides on that basis whether to push the news text to the microblog user. The method thus exploits the rich user interest information available in microblogs, a mainstream social application. Because the microblog status text includes at least user dynamic text, user data text and social information text, the body of text from which user interest is extracted is enlarged, which alleviates the cold-start problem. Furthermore, the microblog status text reflects the interests of the microblog user personally, so the method better meets individual needs. The invention also provides a news push device, equipment and computer-readable storage medium incorporating microblog interest mining, whose effects correspond to those of the method.
Description
Technical field
The present invention relates to the field of interest mining, and in particular to a news push method, device, equipment and computer-readable storage medium incorporating microblog interest mining.
Background art
With the development of the Internet, the information explosion has made news recommendation a hot research topic in recommender systems. Traditional news recommendation methods fall mainly into the following three types:
Content-based news recommendation. The main idea is to recommend news to a user according to the user's own browsing history. Its drawbacks are that it is limited by the ability to extract news features, tends toward over-specialization, lacks diversity in its recommendations, and has difficulty uncovering the user's latent interests. In addition, it requires a certain quantity of browsing records from the user; when the user's own browsing history is insufficient, the method has difficulty recommending news that interests the user.
Collaborative-filtering news recommendation. The main idea is to recommend news to a user according to the preferences of people similar to the user. Its drawback is that it has difficulty meeting the user's individual needs.
Knowledge-based (semantic) news recommendation. The main idea is to recommend news to a user according to expert opinion. Its drawback is that it cannot meet the user's individual needs; in addition, its recommendation quality depends heavily on the knowledge base, and domain-specific knowledge and inference rules are usually hard to obtain.
It can be seen that traditional news recommendation methods suffer from the cold-start problem and cannot meet users' individual needs.
Summary of the invention
The object of the present invention is to provide a news push method, device, equipment and computer-readable storage medium incorporating microblog interest mining, so as to solve the cold-start problem of traditional news recommendation methods and their inability to meet users' individual needs.
To solve the above technical problem, the present invention provides a news push method incorporating microblog interest mining, comprising:
Obtaining a plurality of news texts from a news platform and determining a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
Obtaining the microblog status text of a microblog user and determining a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user data text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item;
Calculating the similarity between the news lexical item set and the user interest lexical item set;
Judging whether the similarity is greater than a preset threshold;
If so, pushing the news text to the microblog user.
Wherein, obtaining a plurality of news texts from a news platform and determining the news lexical item set corresponding to each news text comprises:
Obtaining a plurality of news texts from the news platform and classifying the news texts to obtain a plurality of news text sets;
Determining the news lexical item set corresponding to each news text in the news text sets;
Calculating the similarity between the news lexical item set and the user interest lexical item set comprises:
Screening out, from the news text sets and according to the user interest lexical item set, the news text sets that interest the microblog user;
Traversing each news text in the screened-out news text sets and calculating the similarity between the news lexical item set corresponding to the news text and the user interest lexical item set.
Wherein, the news text sets are obtained by classification using the textCNN classification technique.
Wherein, the word frequency of a news lexical item is calculated as:
$W_{ij} = \frac{\mathrm{num}(t_{ij})}{\mathrm{total}(D_i)} \cdot \log\frac{N}{n_{ij}} \cdot D(i)$, where $\mathrm{num}(t_{ij})$ is the number of times news lexical item i occurs in news text j, $\mathrm{total}(D_i)$ is the total number of lexical items in news text j, $N$ is the total number of news texts in the news text set containing news text j, $n_{ij}$ is the number of news texts in that news text set that contain news lexical item i, and $D(i)$ is the timeliness parameter.
Wherein, obtaining the microblog status text of a microblog user and determining the user interest lexical item set corresponding to the microblog status text comprises:
Obtaining the user dynamic text, user data text and social information text of the microblog user;
Determining a user dynamic lexical item set I1 corresponding to the user dynamic text;
Determining a user data lexical item set I2 corresponding to the user data text;
Determining a social information lexical item set I3 corresponding to the social information text;
Determining the user interest lexical item set Y = U*I1 + V*I2 + W*I3, where U is the preset weight of the user dynamic lexical item set I1, V is the preset weight of the user data lexical item set I2, W is the preset weight of the social information lexical item set I3, and U + V + W = 1.
Wherein, the similarity between the news lexical item set and the user interest lexical item set is calculated as:
EXP = U*sim(I1, L) + V*sim(I2, L) + W*sim(I3, L), where L is the news lexical item set and sim() is the similarity calculation formula, used to determine the similarity between two lexical item sets according to their lexical items and word frequencies.
Wherein, the user dynamic text includes original content text, forwarded content text and comment content text;
Determining the user dynamic lexical item set I1 corresponding to the user dynamic text comprises:
Determining a first lexical item set I11 corresponding to the original content text;
Determining a second lexical item set I12 corresponding to the forwarded content text;
Determining a third lexical item set I13 corresponding to the comment content text;
Determining the user dynamic lexical item set I1 = A*I11 + B*I12 + C*I13 corresponding to the user dynamic text, where A is the preset weight of the original content text, B is the preset weight of the forwarded content text, and C is the preset weight of the comment content text.
Correspondingly, the present invention also provides a news push device incorporating microblog interest mining, comprising:
A news lexical item set determining module, configured to obtain a plurality of news texts from a news platform and determine a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
A user interest lexical item set determining module, configured to obtain the microblog status text of a microblog user and determine a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user data text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item;
A similarity calculation module, configured to calculate the similarity between the news lexical item set and the user interest lexical item set;
A similarity judgment module, configured to judge whether the similarity is greater than a preset threshold;
A news push module, configured to push the news text to the microblog user when the similarity is greater than the preset threshold.
In addition, the present invention also provides news push equipment incorporating microblog interest mining, comprising:
A memory, configured to store a computer program;
A processor, configured to execute the computer program to implement the steps of the news push method incorporating microblog interest mining as described above.
Finally, the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the news push method incorporating microblog interest mining as described above are implemented.
The news push method incorporating microblog interest mining provided by the present invention obtains a plurality of news texts from a news platform and determines the corresponding news lexical item sets, obtains the microblog status text of a microblog user and determines the corresponding user interest lexical item set, and finally decides whether to push a news text to the microblog user by calculating the similarity between the news lexical item set and the user interest lexical item set. The method thus exploits the rich user interest information available in microblogs, a mainstream social application: it determines the user interest lexical item set from the microblog status text and recommends news to the user accordingly. Because the microblog status text includes at least three kinds of content, namely user dynamic text, user data text and social information text, the body of text from which user interest is extracted is enlarged to a certain extent, which alleviates the cold-start problem. In addition, because the microblog status text reflects the interests of the microblog user personally, the method better meets the user's individual needs.
In addition, the present invention also provides a news push device, equipment and computer-readable storage medium incorporating microblog interest mining, whose effects correspond to those of the above method and are not repeated here.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is an implementation flowchart of an embodiment of the news push method incorporating microblog interest mining provided by the present invention;
Fig. 2 is a schematic diagram of the composition of the microblog status text provided by the present invention;
Fig. 3 is an overall implementation block diagram of an embodiment of the news push method incorporating microblog interest mining provided by the present invention;
Fig. 4 is a structural block diagram of an embodiment of the news push device incorporating microblog interest mining provided by the present invention.
Detailed description of the embodiments
The core of the present invention is to provide a news recommendation method, device, equipment and computer-readable storage medium incorporating microblog interest mining, which alleviates the cold-start problem and better meets users' individual needs.
To enable those skilled in the art to better understand the solution of the present invention, the present invention is described in further detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Traditional news recommendation methods that make use of users' social media extract important keywords from information such as the content and comments a user posts, build the user's interest set from them, and thereby provide a personalized news recommendation service. However, none of these methods takes the user's social network relationships into account when combining social media with traditional news media for news recommendation.
As one of the mainstream social media, microblogs hold rich user interest information. They record not only historical data such as a user's text, pictures and videos, but also social network relationships such as those between followers and fans.
Accordingly, based on these characteristics of social media, the present invention proposes a news push method, device, equipment and computer-readable storage medium incorporating microblog interest mining, which realize news recommendation incorporating microblog interest mining and address problems such as cold start and the inability to meet users' individual needs. The cold-start problem is the problem of how to design a personalized recommender system, without a large amount of user data, whose recommendation results satisfy users enough that they are willing to use it. Experiments show that the recommendation method has high recommendation efficiency and good practical effect.
An embodiment of the news recommendation method incorporating microblog interest mining provided by the present invention is introduced below. Referring to Fig. 1, the method embodiment specifically comprises:
Step S101: obtain a plurality of news texts from a news platform and determine a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item.
The plurality of news texts obtained from the news platform may be news texts, obtained by crawling, from a preset period before the current time; the number of news texts may be in the tens of thousands or even more.
In this embodiment, for ease of description, texts may be represented using the VSM (vector space model), the classic term-weight representation. That is, each microblog status text or news text can be represented by a {K, W} pair, where K is the set of lexical items (obtained by segmenting the sentences of the text) and W is the word frequency (for example, the number of occurrences) of the corresponding lexical items.
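The {K, W} representation described above can be sketched in a few lines of Python. This is only a minimal illustration, not the embodiment's implementation: the function name `to_term_weights` and the sample tokens are hypothetical, the input is assumed to be already segmented, and W is taken to be the raw occurrence count mentioned as an example.

```python
from collections import Counter


def to_term_weights(tokens):
    """Return the {K, W} pair for one text.

    K is the sorted list of distinct lexical items and W holds the
    occurrence count of each item, in the same order as K.
    """
    counts = Counter(tokens)
    terms = sorted(counts)                  # K: distinct lexical items
    weights = [counts[t] for t in terms]    # W: occurrence count per item
    return terms, weights


# Hypothetical pre-segmented text; word segmentation itself is out of scope.
tokens = ["economy", "market", "economy", "policy"]
terms, weights = to_term_weights(tokens)
```

Keeping K and W as parallel lists mirrors the binary-group notation; a single dictionary from lexical item to weight would carry the same information.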
In this embodiment, both news texts and microblog status texts almost always need to be segmented into words when they are converted into the corresponding lexical item sets. Because the word segmentation process is not the key point of the present invention, this embodiment does not describe it in detail.
Common word frequency calculation methods for the VSM include frequency statistics, the TF-IDF algorithm and so on. This embodiment may use the TF-IDF algorithm, with the following formula:
$W_{ij} = \frac{\mathrm{num}(t_{ij})}{\mathrm{total}(D_i)} \cdot \log\frac{N}{n_{ij}}$ (1)
where $\mathrm{num}(t_{ij})$ is the number of times news lexical item i occurs in news text j, $\mathrm{total}(D_i)$ is the total number of lexical items in news text j, $N$ is the total number of news texts in the news text set containing news text j, and $n_{ij}$ is the number of news texts in that news text set that contain news lexical item i.
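As a rough illustration of formula (1), the sketch below computes the weight of one lexical item in one news text. The function name `tf_idf` and the toy corpus are hypothetical, each text is assumed to be a list of already segmented tokens, and the natural logarithm is an assumption because the source does not state the base.

```python
import math


def tf_idf(term, doc_tokens, corpus):
    """Weight of `term` in `doc_tokens` following formula (1).

    `corpus` is the list of token lists for the news text set that
    contains the document.
    """
    if not doc_tokens:
        return 0.0
    tf = doc_tokens.count(term) / len(doc_tokens)          # num(t_ij) / total(D_i)
    containing = sum(1 for doc in corpus if term in doc)   # n_ij
    if containing == 0:
        return 0.0                                         # term occurs nowhere
    return tf * math.log(len(corpus) / containing)         # tf * log(N / n_ij)


# Hypothetical news text set of three segmented texts.
corpus = [["a", "b", "a"], ["b", "c"], ["c", "d"]]
weight_a = tf_idf("a", corpus[0], corpus)
weight_b = tf_idf("b", corpus[0], corpus)
```

A lexical item that appears in every text of the set gets a weight of zero under this formula, since log(N / n_ij) is then log(1).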
In this embodiment, considering that news and microblogs are highly time-sensitive, a timeliness parameter may be introduced into the word frequency calculation formula. The timeliness parameter may be given by:
$D(i) = -\frac{(t_p - t_n)^2}{t_B} + 1$ (2)
where $t_p$ is the current time, $t_n$ is the news time, and $t_B$ is the reference timestamp.
The TF-IDF formula after the timeliness parameter is introduced may then be:
$W_{ij} = \frac{\mathrm{num}(t_{ij})}{\mathrm{total}(D_i)} \cdot \log\frac{N}{n_{ij}} \cdot D(i)$ (3)
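The effect of the timeliness parameter can be checked with a small sketch. It assumes that the "2" in formula (2) is an exponent, so that D = -(tp - tn)^2 / tB + 1, and that all three times are expressed in the same unit. The function names and the sample values are hypothetical.

```python
def timeliness(t_now, t_news, t_base):
    """Timeliness parameter of formula (2), read as -(tp - tn)^2 / tB + 1."""
    return -((t_now - t_news) ** 2) / t_base + 1


def timed_weight(tfidf_weight, t_now, t_news, t_base):
    """Formula (3): a TF-IDF weight scaled by the timeliness parameter."""
    return tfidf_weight * timeliness(t_now, t_news, t_base)


# Hypothetical values: news published 5 time units ago, reference value 100.
fresh = timeliness(10, 10, 100)
older = timeliness(10, 5, 100)
scaled = timed_weight(0.8, 10, 5, 100)
```

Under this reading the parameter equals 1 for news published at the current time and falls off quadratically with age; it becomes negative once the squared age exceeds the reference value, and the source does not say whether such values are clamped.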
Step S102: obtain the microblog status text of a microblog user and determine a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user data text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item.
In this embodiment, the interests of a microblog user are mined from the user's microblog status text. To ensure that there is enough text from which to mine user interest, the microblog status text, as shown in Fig. 2, includes at least three kinds of text: user dynamic text, user data text and social information text.
Accordingly, when determining the user interest lexical item set from the microblog status text, a user dynamic lexical item set, a user data lexical item set and a social information lexical item set may first be determined from the user dynamic text, the user data text and the social information text respectively, and the final user interest lexical item set is then determined from these three sets. In this process, considering that user dynamic text, user data text and social information text reflect the interests of the microblog user to different degrees, a suitable weight may be assigned to each corresponding lexical item set, so that the final user interest lexical item set reflects the user's interests as fully as possible.
That is, obtaining the microblog status text of a microblog user and determining the user interest lexical item set corresponding to the microblog status text may specifically comprise the following steps:
Step S1021: obtain the user dynamic text, user data text and social information text of the microblog user.
Step S1022: determine a user dynamic lexical item set I1 corresponding to the user dynamic text.
Step S1023: determine a user data lexical item set I2 corresponding to the user data text.
Step S1024: determine a social information lexical item set I3 corresponding to the social information text.
Step S1025: determine the user interest lexical item set:
Y = U*I1 + V*I2 + W*I3 (4)
where U is the preset weight of the user dynamic lexical item set I1, V is the preset weight of the user data lexical item set I2, W is the preset weight of the social information lexical item set I3, and U + V + W = 1.
As shown in Fig. 2, specifically, the user dynamic text may include the microblog user's original content text, forwarded content text and comment text; the user data text may include user profile text, user occupation text and user interest label text; and the social information text may include information text about the microblog user's followers and information text about the user's fans.
Similarly to the process above, considering that different texts reflect user interest to different degrees, a suitable weight may be assigned to each text. For example, where the user dynamic text includes original content text, forwarded content text and comment content text, step S1022 above, that is, the process of determining the user dynamic lexical item set I1 corresponding to the user dynamic text, may specifically comprise the following steps:
Step S10211: determine a first lexical item set I11 corresponding to the original content text.
Step S10212: determine a second lexical item set I12 corresponding to the forwarded content text.
Step S10213: determine a third lexical item set I13 corresponding to the comment text.
Step S10214: determine the user dynamic lexical item set corresponding to the user dynamic text:
I1 = A*I11 + B*I12 + C*I13 (5)
where A is the preset weight of the original content text, B is the preset weight of the forwarded content text, and C is the preset weight of the comment text.
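The two weighted combinations in formulas (4) and (5) can be sketched as follows. This is one possible interpretation, not something the source specifies: each lexical item set is modelled as a dictionary from lexical item to word frequency, multiplying a set by a weight is taken to scale its frequencies, and adding sets is taken to sum the frequencies of shared items. The right-hand side of formula (5) is read as the three sub-sets I11, I12 and I13. The function name `weighted_merge`, all sample sets and all weight values are hypothetical.

```python
def weighted_merge(weighted_sets):
    """Combine (weight, {item: frequency}) pairs into one {item: frequency} dict."""
    merged = {}
    for weight, term_set in weighted_sets:
        for term, freq in term_set.items():
            merged[term] = merged.get(term, 0.0) + weight * freq
    return merged


# Formula (5): user dynamic set from original, forwarded and comment sub-sets.
i11 = {"football": 2.0}
i12 = {"football": 1.0, "film": 1.0}
i13 = {"film": 2.0}
i1 = weighted_merge([(0.5, i11), (0.3, i12), (0.2, i13)])   # A, B, C assumed

# Formula (4): user interest set from dynamic, data and social sets.
i2 = {"film": 1.0}
i3 = {"music": 1.0}
y = weighted_merge([(0.6, i1), (0.3, i2), (0.1, i3)])       # U + V + W = 1
```

The source requires U + V + W = 1 but states no such constraint for A, B and C; the values above happen to sum to 1 only for convenience.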
Step S103: calculate the similarity between the news lexical item set and the user interest lexical item set.
Specifically, the similarity between the news lexical item set and the user interest lexical item set may be calculated as:
EXP = U*sim(I1, L) + V*sim(I2, L) + W*sim(I3, L) (6)
where L is the news lexical item set and sim() is the similarity calculation formula, which is mainly used to determine the similarity between two lexical item sets according to their lexical items and word frequencies.
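Formula (6) leaves sim() open, so the sketch below plugs in cosine similarity over the word frequencies purely as an assumption; any measure that compares two lexical item sets by item and frequency could be substituted. The names `cosine_sim` and `exp_score`, the sample sets and the weights are hypothetical.

```python
import math


def cosine_sim(a, b):
    """Cosine similarity of two {item: frequency} dicts (assumed choice of sim)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def exp_score(i1, i2, i3, news, u, v, w, sim=cosine_sim):
    """Formula (6): weighted sum of the three per-source similarities."""
    return u * sim(i1, news) + v * sim(i2, news) + w * sim(i3, news)


# Hypothetical sets: I1 matches the news exactly, I2 not at all, I3 partly.
news = {"a": 1.0}
score = exp_score({"a": 1.0}, {"b": 1.0}, {"a": 1.0, "b": 1.0}, news, 0.5, 0.3, 0.2)
```

Because each source is compared with the news set separately, a strong match in a low-weight source such as social information contributes only in proportion to its weight.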
Step S104: judge whether the similarity is greater than a preset threshold.
Step S105: if so, push the news text to the microblog user.
It should be noted that this embodiment does not limit the application, web page or platform to which the news text is pushed. Specifically, it may be pushed to the microblog application, or to another application or web page associated with the microblog.
On this basis, as a preferred implementation, the news crawled from the news platform may be classified in advance. In this way, when judging whether a news text is of interest to the microblog user, it is possible to judge first which categories of news text may interest the user, so that only one or a few categories of news text need to be traversed rather than all news texts, which greatly improves recommendation efficiency.
The classification technique used in the classification module is as shown in Fig. 2: news is classified using the textCNN classification technique. The microblog interest mining module is as shown in Fig. 3.
That is, step S101, the process of obtaining a plurality of news texts from a news platform and determining the news lexical item set corresponding to each news text, may specifically comprise:
Step S1011: obtain a plurality of news texts from the news platform and classify the news texts to obtain a plurality of news text sets.
Specifically, the news text sets are obtained by classification using the textCNN classification technique. The process of classifying news texts with textCNN mainly involves four parts: determining the article matrix, the convolutional layer, the pooling layer and the output layer. Because this process is not the focus of the present invention, it is not described further here.
Step S1012: determine the news lexical item set corresponding to each news text in the news text sets.
Correspondingly, step S103, the process of calculating the similarity between the news lexical item set and the user interest lexical item set, may specifically comprise:
Step S1031: according to the user interest lexical item set, screen out from the news text sets the news text sets that interest the microblog user.
Step S1032: traverse each news text in the screened-out news text sets and calculate the similarity between the news lexical item set corresponding to the news text and the user interest lexical item set.
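The two-stage screening in steps S1031 and S1032, followed by the threshold test of steps S104 and S105, might look like the sketch below. The source does not say how a whole news text set is judged to interest the user, so summarizing each category by the merged lexical items of its texts and comparing that summary with the user interest set is an assumption, as are the Jaccard stand-in for sim(), both threshold values, and every name and sample value used here.

```python
def jaccard(a, b):
    """Stand-in similarity: share of lexical items common to both sets."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def select_news(user_interest, categories, sim, category_threshold, news_threshold):
    """Return the ids of news texts to push.

    `categories` maps a category name to a list of (news_id, term_set) pairs.
    """
    pushed = []
    for _name, items in categories.items():
        # S1031 (assumed criterion): summarize the category by merging its term sets.
        category_terms = {}
        for _news_id, term_set in items:
            for term, weight in term_set.items():
                category_terms[term] = category_terms.get(term, 0.0) + weight
        if sim(user_interest, category_terms) <= category_threshold:
            continue  # skip the whole category without traversing its texts
        # S1032, S104, S105: traverse the category and apply the news threshold.
        for news_id, term_set in items:
            if sim(user_interest, term_set) > news_threshold:
                pushed.append(news_id)
    return pushed


user = {"football": 1.0, "league": 1.0}
categories = {
    "sports": [("n1", {"football": 1.0, "league": 1.0}), ("n2", {"tennis": 1.0})],
    "finance": [("n3", {"stock": 1.0})],
}
to_push = select_news(user, categories, jaccard, 0.3, 0.5)
```

In this toy run the finance category is rejected at the first stage, so its texts are never compared one by one, which is the efficiency gain the classification step is meant to provide.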
The overall implementation process of this embodiment is therefore as shown in Fig. 3. First, a large number of news texts are classified to obtain a plurality of news text sets, and the news lexical item set corresponding to each news text in the news text sets is determined. At the same time, the user dynamic lexical item set, user data lexical item set and social information lexical item set are determined from the user dynamic text, user data text and social information text respectively. Finally, the news text sets that may interest the user are screened out, and the news texts that may interest the user are further screened out from those news text sets.
In summary, the news push method incorporating microblog interest mining provided by this embodiment obtains a plurality of news texts from a news platform and determines the corresponding news lexical item sets, obtains the microblog status text of a microblog user and determines the corresponding user interest lexical item set, and finally decides whether to push a news text to the microblog user by calculating the similarity between the news lexical item set and the user interest lexical item set. The method thus exploits the rich user interest information available in microblogs, a mainstream social application: it determines the user interest lexical item set from the microblog status text and recommends news to the user accordingly. Because the microblog status text includes at least three kinds of content, namely user dynamic text, user data text and social information text, the body of text from which user interest is extracted is enlarged to a certain extent, which alleviates the cold-start problem. In addition, because the microblog status text reflects the interests of the microblog user personally, the method better meets the user's individual needs.
An embodiment of the news push device incorporating microblog interest mining provided by the embodiments of the present invention is introduced below. The news push device described below and the news push method described above may be referred to in correspondence with each other.
Referring to Fig. 4, the device specifically comprises:
News lexical item set determining module 401: configured to obtain a plurality of news texts from a news platform and determine a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item.
User interest lexical item set determining module 402: configured to obtain the microblog status text of a microblog user and determine a user interest lexical item set corresponding to the microblog status text, wherein the microblog status text includes at least user dynamic text, user data text and social information text, and the user interest lexical item set includes the interest lexical items in the microblog status text and the word frequency of each interest lexical item.
Similarity calculation module 403: configured to calculate the similarity between the news lexical item set and the user interest lexical item set.
Similarity judgment module 404: configured to judge whether the similarity is greater than a preset threshold.
News push module 405: configured to push the news text to the microblog user when the similarity is greater than the preset threshold.
The news push device incorporating microblog interest mining of this embodiment is used to implement the news push method incorporating microblog interest mining described above, so for the specific implementation of the device, reference may be made to the method embodiments above. For example, the news lexical item set determining module 401, user interest lexical item set determining module 402, similarity calculation module 403, similarity judgment module 404 and news push module 405 are used to implement steps S101, S102, S103, S104 and S105 of the method respectively. Their specific implementation may therefore refer to the description of the corresponding embodiments and is not described further here.
In addition, because the news push device of this embodiment is used to implement the news push method described above, its effect corresponds to that of the method and is not repeated here.
In addition, the present invention also provides news push equipment incorporating microblog interest mining, comprising:
A memory, configured to store a computer program;
A processor, configured to execute the computer program to implement the steps of the news push method incorporating microblog interest mining as described above.
Finally, the present invention also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the news push method incorporating microblog interest mining as described above are implemented.
Because the news push equipment and computer-readable storage medium of this embodiment are used to implement the news push method incorporating microblog interest mining described above, their specific implementation may refer to the description of the method embodiments and is not described further here. It should also be understood that their effects correspond to those of the method and are likewise not repeated here.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be cross-referenced. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; see the description of the method for the relevant details.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above in general terms of their functions. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the methods or algorithms described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The news push method, device, equipment, and computer-readable storage medium incorporating microblog interest mining provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of the embodiments is intended only to aid understanding of the method and its core idea. It should be noted that those of ordinary skill in the art may make improvements and modifications to the invention without departing from its principles, and such improvements and modifications also fall within the scope of protection of the claims.
Claims (10)
1. A news push method incorporating microblog interest mining, characterized by comprising:
obtaining a plurality of news texts from a news platform, and determining a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
obtaining a microblog state text of a microblog user, and determining a user interest lexical item set corresponding to the microblog state text, wherein the microblog state text includes at least a user dynamic text, a user profile text, and a social information text, and the user interest lexical item set includes the interest lexical items in the microblog state text and the word frequency of each interest lexical item;
calculating the similarity between the news lexical item set and the user interest lexical item set;
judging whether the similarity is greater than a preset threshold;
if so, pushing the news text to the microblog user.
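The steps of claim 1 can be sketched roughly as follows. This is only an illustration under stated assumptions: the claim does not name a tokenizer or a similarity measure, so whitespace tokenization, raw counts as word frequencies, cosine similarity, and the threshold value 0.3 are all stand-ins, and the function names are hypothetical. Chinese microblog text would need a proper word segmenter in place of `split()`.

```python
import math
from collections import Counter


def term_set(text):
    """Map each whitespace-separated term to its raw count (assumed tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity of two term -> frequency mappings (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_news_to_push(news_texts, microblog_state_text, threshold=0.3):
    """Return the news texts whose similarity to the user exceeds the threshold."""
    user_terms = term_set(microblog_state_text)
    return [
        news for news in news_texts
        if cosine_similarity(term_set(news), user_terms) > threshold
    ]


news = ["football match tonight", "stock market falls"]
pushed = select_news_to_push(news, "football match fans")
print(pushed)
```

In this toy run only the first news text shares terms with the user's text, so it is the only one selected.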
2. The method according to claim 1, characterized in that obtaining a plurality of news texts from a news platform and determining a news lexical item set corresponding to each news text comprises:
obtaining a plurality of news texts from the news platform, and classifying the news texts to obtain a plurality of news text sets;
determining the news lexical item set corresponding to each news text in each news text set;
and calculating the similarity between the news lexical item set and the user interest lexical item set comprises:
selecting, from the news text sets and according to the user interest lexical item set, the news text set that interests the microblog user;
traversing each news text in the selected news text set, and calculating the similarity between the news lexical item set corresponding to that news text and the user interest lexical item set.
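The two-stage procedure of claim 2 (classify first, then compare only within the interesting set) might look like the sketch below. The claim does not say how the interesting news text set is selected, so the scoring rule here (sum of user term frequencies found in a category's vocabulary) is an assumption, as are all function names. `toy_classifier` is a hypothetical keyword rule standing in for a trained classifier.

```python
from collections import Counter, defaultdict


def group_by_category(news_texts, classify):
    """Group news texts into news text sets keyed by the classifier's label."""
    groups = defaultdict(list)
    for text in news_texts:
        groups[classify(text)].append(text)
    return dict(groups)


def interested_categories(groups, user_terms, top_k=1):
    """Rank categories by the user term frequency mass found in their vocabulary."""
    scores = {}
    for category, texts in groups.items():
        vocabulary = {term for text in texts for term in text.lower().split()}
        scores[category] = sum(
            freq for term, freq in user_terms.items() if term in vocabulary
        )
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Keep only categories that share at least one term with the user.
    return [c for c in ranked[:top_k] if scores[c] > 0]


def toy_classifier(text):
    # Hypothetical stand-in for a trained text classifier.
    return "sports" if "match" in text.lower() else "finance"


news = ["Football match tonight", "Stock market falls", "Tennis match delayed"]
groups = group_by_category(news, toy_classifier)
user_terms = Counter({"football": 3, "tennis": 1, "market": 1})
chosen = interested_categories(groups, user_terms)
# Only the texts in the chosen sets go on to the per-text similarity step.
candidates = [text for category in chosen for text in groups[category]]
```

Filtering by category first means the per-text similarity is computed only for `candidates` rather than for every news text obtained from the platform.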
3. The method according to claim 2, characterized in that the news text sets are obtained by classification using the textCNN classification technique.
4. The method according to claim 2, characterized in that the word frequency of a news lexical item is calculated as:
W_{ij} = [num(t_{ij}) / total(D_i)] * log(N / n_{ij}) * D(i), where num(t_{ij}) is the number of times news lexical item i occurs in news text j, total(D_i) is the total number of lexical items in news text j, N is the total number of news texts in the news text set containing news text j, n_{ij} is the number of news texts in that news text set that contain news lexical item i, and D(i) is a timeliness parameter.
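The formula in claim 4 is a term-frequency times inverse-document-frequency weight scaled by a timeliness factor. A minimal sketch follows, assuming a natural logarithm (the claim does not state the base), pre-tokenized texts, and a caller-supplied timeliness value; the function and parameter names are hypothetical.

```python
import math
from collections import Counter


def term_weight(term_count, total_terms, num_texts, texts_with_term, timeliness):
    """[num / total] * log(N / n) * D, with a natural log assumed."""
    return (term_count / total_terms) * math.log(num_texts / texts_with_term) * timeliness


def news_term_weights(text_tokens, text_set_tokens, timeliness=1.0):
    """Weight every term of one news text against its news text set.

    text_tokens must be one of the entries of text_set_tokens, so every
    term occurs in at least one text and the division is well defined.
    """
    counts = Counter(text_tokens)
    weights = {}
    for term, count in counts.items():
        texts_with_term = sum(1 for tokens in text_set_tokens if term in tokens)
        weights[term] = term_weight(
            count, len(text_tokens), len(text_set_tokens), texts_with_term, timeliness
        )
    return weights


text_set = [["a", "b", "a"], ["b", "c"]]
weights = news_term_weights(text_set[0], text_set)
```

A term that appears in every text of the set (here `"b"`) gets weight 0 because log(N / n) is 0, which is the usual behavior of an inverse-document-frequency factor.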
5. The method according to claim 1, characterized in that obtaining a microblog state text of a microblog user and determining a user interest lexical item set corresponding to the microblog state text comprises:
obtaining the user dynamic text, user profile text, and social information text of the microblog user;
determining a user dynamic lexical item set I1 corresponding to the user dynamic text;
determining a user profile lexical item set I2 corresponding to the user profile text;
determining a social information lexical item set I3 corresponding to the social information text;
determining the user interest lexical item set Y = U*I1 + V*I2 + W*I3, where U is the preset weight of the user dynamic lexical item set I1, V is the preset weight of the user profile lexical item set I2, and W is the preset weight of the social information lexical item set I3, and U + V + W = 1.
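One way to read Y = U*I1 + V*I2 + W*I3 is as a weighted sum of term-frequency vectors. The sketch below takes that reading as an assumption; the claim does not spell out how the sets are added. The weights 0.5, 0.3, and 0.2 and the function name are illustrative only.

```python
import math


def combine_term_sets(weighted_sets):
    """Sum term frequencies across sets, scaling each set by its preset weight."""
    total_weight = sum(weight for weight, _ in weighted_sets)
    if not math.isclose(total_weight, 1.0):
        raise ValueError("preset weights must sum to 1")
    combined = {}
    for weight, terms in weighted_sets:
        for term, freq in terms.items():
            combined[term] = combined.get(term, 0.0) + weight * freq
    return combined


I1 = {"football": 4}               # from the user dynamic text
I2 = {"football": 2, "music": 2}   # from the user profile text
I3 = {"music": 10}                 # from the social information text
Y = combine_term_sets([(0.5, I1), (0.3, I2), (0.2, I3)])
```

Because the function accepts any number of weighted sets, the same helper could also combine sub-sets of a single source, provided their weights sum to 1.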
6. The method according to claim 5, characterized in that the similarity between the news lexical item set and the user interest lexical item set is calculated as:
EXP = U*sim(I1, L) + V*sim(I2, L) + W*sim(I3, L), where L is the news lexical item set and sim() is a similarity function that determines the similarity between two lexical item sets from their lexical items and word frequencies.
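The fused score EXP can be sketched with any sim() that uses both lexical items and word frequencies. The claim leaves sim() open, so the weighted Jaccard measure below is a swapped-in assumption rather than the patent's function, and the names and weights are hypothetical.

```python
def weighted_jaccard(a, b):
    """Sum of per-term minima over sum of per-term maxima (assumed sim())."""
    terms = a.keys() | b.keys()
    largest = sum(max(a.get(t, 0), b.get(t, 0)) for t in terms)
    if largest == 0:
        return 0.0
    smallest = sum(min(a.get(t, 0), b.get(t, 0)) for t in terms)
    return smallest / largest


def fused_similarity(user_sets, weights, news_terms, sim=weighted_jaccard):
    """EXP = sum of weight_k * sim(I_k, L) over the user's lexical item sets."""
    if len(user_sets) != len(weights):
        raise ValueError("one weight is required per lexical item set")
    return sum(w * sim(s, news_terms) for w, s in zip(weights, user_sets))


L = {"football": 2, "match": 2}
I1 = {"football": 2}
I2 = {"music": 1}
I3 = {"football": 2, "match": 2}
exp = fused_similarity([I1, I2, I3], [0.5, 0.3, 0.2], L)
```

Passing `sim` as a parameter keeps the weighting logic separate from the choice of similarity measure, so another measure can be substituted without touching `fused_similarity`.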
7. The method according to claim 5, characterized in that the user dynamic text includes an original content text, a forwarded content text, and a comment content text;
and determining the user dynamic lexical item set I1 corresponding to the user dynamic text comprises:
determining a first lexical item set I11 corresponding to the original content text;
determining a second lexical item set I12 corresponding to the forwarded content text;
determining a third lexical item set I13 corresponding to the comment content text;
determining the user dynamic lexical item set I1 = A*I11 + B*I12 + C*I13 corresponding to the user dynamic text, where A is the preset weight of the original content text, B is the preset weight of the forwarded content text, and C is the preset weight of the comment content text.
8. A news push device incorporating microblog interest mining, characterized by comprising:
a news lexical item set determining module, configured to obtain a plurality of news texts from a news platform and determine a news lexical item set corresponding to each news text, wherein the news lexical item set includes the news lexical items in the news text and the word frequency of each news lexical item;
a user interest lexical item set determining module, configured to obtain a microblog state text of a microblog user and determine a user interest lexical item set corresponding to the microblog state text, wherein the microblog state text includes at least a user dynamic text, a user profile text, and a social information text, and the user interest lexical item set includes the interest lexical items in the microblog state text and the word frequency of each interest lexical item;
a similarity calculation module, configured to calculate the similarity between the news lexical item set and the user interest lexical item set;
a similarity judgment module, configured to judge whether the similarity is greater than a preset threshold;
a news push module, configured to push the news text to the microblog user when the similarity is greater than the preset threshold.
9. News push equipment incorporating microblog interest mining, characterized by comprising:
a memory, configured to store a computer program;
a processor, configured to execute the computer program to implement the steps of the news push method incorporating microblog interest mining according to any one of claims 1-7.
10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the steps of the news push method incorporating microblog interest mining according to any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810966477.3A CN109325175A (en) | 2018-08-23 | 2018-08-23 | Merge the news push method, device and equipment of microblogging interest digging |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109325175A (en) | 2019-02-12 |
Family
ID=65264459
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810966477.3A Pending CN109325175A (en) | 2018-08-23 | 2018-08-23 | Merge the news push method, device and equipment of microblogging interest digging |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109325175A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104572797A (en) * | 2014-05-12 | 2015-04-29 | 深圳市智搜信息技术有限公司 | Individual service recommendation system and method based on topic model |
CN105868267A (en) * | 2016-03-04 | 2016-08-17 | 江苏工程职业技术学院 | Modeling method for mobile social network user interests |
CN107025310A (en) * | 2017-05-17 | 2017-08-08 | 长春嘉诚信息技术股份有限公司 | A kind of automatic news in real time recommends method |
CN107766576A (en) * | 2017-11-15 | 2018-03-06 | 北京航空航天大学 | A kind of extracting method of microblog users interest characteristics |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112182351A (en) * | 2020-09-28 | 2021-01-05 | 哈尔滨工业大学(深圳) | News recommendation method and device based on multi-feature fusion |
CN112749341A (en) * | 2021-01-22 | 2021-05-04 | 南京莱斯网信技术研究院有限公司 | Key public opinion recommendation method, readable storage medium and data processing device |
CN112749341B (en) * | 2021-01-22 | 2024-03-29 | 南京莱斯网信技术研究院有限公司 | Important public opinion recommendation method, readable storage medium and data processing device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190212 | |