CN103365902B - Method and device for evaluating internet news - Google Patents


Info

Publication number
CN103365902B
Authority
CN
China
Prior art keywords
news
website
content
satellite information
title
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201210097667.9A
Other languages
Chinese (zh)
Other versions
CN103365902A (en)
Inventor
白龙
梁如峰
刘杰
王松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Beijing Founder Electronics Co Ltd
Original Assignee
Peking University Founder Group Co Ltd
Beijing Founder Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Founder Group Co Ltd, Beijing Founder Electronics Co Ltd
Priority to CN201210097667.9A
Publication of CN103365902A
Application granted
Publication of CN103365902B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method for evaluating internet news, comprising: obtaining the titles of headline news from designated websites; performing word segmentation and clustering on the titles to determine the hot topics among them; obtaining, through a search engine, the content and ancillary information of the headline news corresponding to each hot topic; and evaluating the acquired news content and ancillary information. The invention also provides a device for evaluating internet news, comprising: an acquisition module for obtaining the titles of headline news from designated websites; a hot-topic module for performing word segmentation and clustering on the titles to determine the hot topics among them; a search engine for obtaining the content and ancillary information of the headline news corresponding to each hot topic; and an evaluation module for evaluating the acquired news content and ancillary information. The invention improves the efficiency and accuracy of internet news evaluation.

Description

Method and device for evaluating internet news
Technical field
The present invention relates to the field of Internet technology, and in particular to a method and device for evaluating the propagation influence of internet news.
Background art
At present, the propagation influence of internet news is evaluated mainly through manual statistics, in one of the following two ways.
1. Related news is searched for on each mainstream search engine to obtain the number of results returned and the online publication time of the related news; the news pages are then opened to check the click count, reprint count, comment content and similar information, and the results are collected into summary statistics.
2. The news on major news portal websites is sorted through manually, and information such as the exposure of related news in each portal's sub-columns, the number of items, and the content coverage rate is counted and roughly analyzed to evaluate the degree of influence of the related news. Alternatively, the two ways are combined into a comprehensive evaluation of news propagation influence.
Evaluating news propagation influence by manual querying and manual statistics has the following shortcomings:
1. Low efficiency. Faced with the mass of information on the Internet, a search engine can quickly narrow the scope, but it still returns thousands of related results. Because these are checked and counted by hand, producing the evaluation data takes a long time and consumes considerable manpower and resources, and the evaluation results lag behind the timeliness of online news.
2. Low accuracy of news influence evaluation. Retrieving online news data for evaluation returns a large amount of data with low relevance, which negatively affects the influence evaluation and interferes with obtaining the influence value.
Summary of the invention
The present invention aims to provide a method and device for evaluating internet news.
An embodiment of the present invention provides a method for evaluating internet news, comprising: obtaining the titles of headline news from designated websites; performing word segmentation and clustering on the titles to determine the hot topics among them; obtaining, through a search engine, the content and ancillary information of the headline news corresponding to each hot topic; and evaluating the acquired news content and ancillary information.
An embodiment of the present invention provides a device for evaluating internet news, comprising: an acquisition module for obtaining the titles of headline news from designated websites; a hot-topic module for performing word segmentation and clustering on the titles to determine the hot topics among them; a search engine for obtaining the content and ancillary information of the headline news corresponding to each hot topic; and an evaluation module for evaluating the acquired news content and ancillary information.
Because the method and device of the above embodiments automatically acquire news, automatically determine the hot topics in it and evaluate them, they overcome the low efficiency of manual evaluation and improve the efficiency and accuracy of internet news evaluation.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the present invention and constitute a part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 shows a flowchart of the method for evaluating internet news according to an embodiment of the present invention;
Fig. 2 shows a schematic diagram of the device for evaluating internet news according to an embodiment of the present invention.
Detailed description
The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Fig. 1 shows a flowchart of the method for evaluating internet news according to an embodiment of the present invention, including:
Step S10: obtain the titles of headline news from designated websites. For example, data retrieval is performed on a defined set of websites: the headline news of each news website column is collected at regular intervals, and the collected news titles, link information, source website, column, position ranking, region and other related information are stored together and managed by column;
Step S20: perform word segmentation and clustering on the titles to determine the hot topics among them. For example, the titles of the collected headline news are segmented into words, the corresponding news hot words are extracted, the collected news items are clustered according to the extracted hot words, and the hot news is finally determined;
Step S30: obtain, through a search engine, the content and ancillary information of the headline news corresponding to each hot topic;
Step S40: evaluate the acquired news content and ancillary information.
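Step S20 can be illustrated with a minimal sketch. It assumes whitespace tokenization as a stand-in for word segmentation (Chinese titles would need a real segmenter), a hypothetical stop-word list, and a simple rule that treats a word as hot when it occurs in at least `min_count` titles. None of these details come from the patent text, and the names `segment` and `find_hot_topics` are illustrative only.

```python
from collections import Counter

# Hypothetical stop-word list (assumption); a real system would use a proper lexicon.
STOP_WORDS = {"the", "a", "of", "in", "to", "and", "on", "for"}


def segment(title):
    """Split a title into lowercase words (stand-in for real word segmentation)."""
    return [w for w in title.lower().split() if w not in STOP_WORDS]


def find_hot_topics(titles, min_count=2):
    """Cluster titles by hot word; a hot word occurs in at least min_count titles."""
    # Count in how many titles each word appears (document frequency).
    doc_freq = Counter()
    for title in titles:
        doc_freq.update(set(segment(title)))
    hot_words = {w for w, c in doc_freq.items() if c >= min_count}

    clusters = {}
    for title in titles:
        words = [w for w in segment(title) if w in hot_words]
        if not words:
            continue  # title contains no hot word, so it joins no cluster
        # Assign the title to its most widespread hot word (ties broken alphabetically).
        key = max(sorted(words), key=lambda w: doc_freq[w])
        clusters.setdefault(key, []).append(title)
    return clusters
```

Each key of the returned dictionary is a hot word and each value is the list of titles grouped under it; a production system would likely use a stronger clustering method than this single-keyword rule.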
The collected news may also undergo content processing: summary information and news keywords are extracted, and an index is built over the news for later analysis and display.
Because the method automatically acquires news, automatically determines the hot topics in it and evaluates them, it overcomes the low efficiency of manual evaluation and improves the efficiency and accuracy of internet news evaluation.
Preferably, step S30 is divided into the following two collection parts:
a) Content. This data is collected mainly for the later analysis and evaluation of news influence. The following is collected: news release time, news title, news summary, body text, the click count, reprint count and comment count of the news, and the comments on the news. If a collected news item already exists, the corresponding data is updated to reflect the latest state of the information.
b) Ancillary information. This data is collected mainly for the later analysis of news propagation impact. The following is collected: the news website, the specific column in which the news was published, the link address, the titles of related news, and the corresponding link addresses; the related news itself is collected as well, with the same fields as in a) Content. If the collected related news has changed, the newly added part is processed accordingly to reflect the latest state of the information.
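The two collection parts a) and b) can be pictured as two record types plus an update-if-exists rule. The sketch below is only one possible layout: the class names, field names, and the choice of the link address as the storage key are assumptions made for illustration, not details given in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class NewsContent:
    """Part a): fields collected for a news item (field names are hypothetical)."""
    url: str
    title: str
    published: str
    summary: str = ""
    body: str = ""
    clicks: int = 0
    reprints: int = 0
    comments: int = 0
    comment_texts: list = field(default_factory=list)


@dataclass
class AncillaryInfo:
    """Part b): where the news appeared and which related news it links to."""
    website: str
    column: str
    url: str
    related: dict = field(default_factory=dict)  # related title -> link address


def upsert(store, record):
    """Store a record keyed by URL (assumed key); return True if it was new."""
    is_new = record.url not in store
    store[record.url] = record  # an existing entry is overwritten with the latest data
    return is_new
```

Overwriting on a repeated URL mirrors the rule that already-collected news is updated so the store reflects the latest counts.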
Preferably, step S40 includes evaluating news propagation influence, which specifically includes: evaluating the propagation influence of a news item as a weighted average, over the retrieved websites, of the per-website values $InfoD_i$ with weights $W_i$, where $InfoD_i$ represents the propagation influence of the news on website i and $W_i$ is the information influence weight of website i.
Preferably, set $InfoD_i = (Sd_i + Hd_i)\,Td_i$, where $Sd_i$ represents the propagation breadth influence of the news on website i, $Hd_i$ represents the popularity influence of the news on website i, and $Td_i = e^{-\alpha t}$, where t is the time elapsed from the release of the news to the present day and $\alpha$ is the decay factor.
News on the Internet attracts a great deal of attention and comment when it is first published, but as time passes the number of reads and comments per unit time falls steadily; however attractive a news item is, it gradually fades from public view. The time decay function $Td = e^{-\alpha t}$ models this decay of a news event. It expresses the timeliness of the news itself, using a formula similar to that for the decay of radioactive elements to represent the trend of the news over time. The parameter t is the duration of the news, that is, the time elapsed from its release to the present day; $\alpha$ is the decay factor, set to 1 here, and can be configured according to the user's needs.
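To get a feel for the decay factor, the following sketch evaluates the exponential decay and the corresponding half-life, ln(2)/α, which follows directly from the exponential form. The unit of t is not stated in the text; days are assumed here, and the function names are illustrative.

```python
import math


def time_decay(t, alpha=1.0):
    """Exponential time decay e^(-alpha * t); t is assumed to be measured in days."""
    return math.exp(-alpha * t)


def half_life(alpha=1.0):
    """Time after which the decay factor has dropped to one half."""
    return math.log(2) / alpha
```

With α = 1 the weight halves roughly every 0.69 time units, so a smaller α may suit news that stays relevant for longer.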
The sum of the news popularity value and the news propagation breadth value is multiplied by the corresponding time function to reflect the degree of propagation influence of the news event on that website. The degree of propagation influence of the news on the other news websites is obtained in the same way, and the propagation influence value of the news across the retrieved websites is then obtained by computing the weighted average.
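The calculation described above can be sketched as follows. The per-website term follows the stated definition InfoDi = (Sdi + Hdi) · e^(-αt); the overall formula is not reproduced in this text, so normalizing by the sum of the weights Wi is an assumption based on the phrase "weighted average". Function and parameter names are hypothetical.

```python
import math


def site_influence(sd, hd, t, alpha=1.0):
    """Per-website influence: (breadth + popularity) scaled by exponential time decay."""
    return (sd + hd) * math.exp(-alpha * t)


def propagation_influence(sites, alpha=1.0):
    """Weighted average over websites.

    sites: list of (weight, sd, hd, t) tuples, one per website.
    Dividing by the total weight is an assumed reading of "weighted average".
    """
    total_weight = sum(w for w, _, _, _ in sites)
    if total_weight == 0:
        return 0.0  # no websites or all-zero weights: nothing to average
    weighted = sum(w * site_influence(sd, hd, t, alpha) for w, sd, hd, t in sites)
    return weighted / total_weight
```

For example, two websites with weights 2 and 1 and undecayed per-site values 4 and 1 give (2·4 + 1·1) / 3 = 3.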
Preferably, the method further includes: evaluating $Hd_i = H1_i + H2_i$, where $H1_i$ represents the popularity ranking value of the news on website i on the current day, and $H2_i$ represents the difference between the popularity ranking value of the news on website i on the previous day and its popularity ranking value on website i on the current day.
Preferably, the method further includes: evaluating $Sd_i = W1_i + W2_i + W3_i$, where $W1_i$ is the column coverage rate of the news on website i, that is, the average column coverage; $W2_i$ is the ratio of the number of news clusters for the news on website i to the number of news items on website i, that is, the rate at which the news content gives rise to derived news topics; and $W3_i$ is the ratio of the number of replies to the news on website i to the number of reads of the news on website i, that is, the reply rate.
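The breadth term Sdi is a plain sum of one rate and two ratios, which the sketch below computes from raw counts. Returning 0 for a ratio whose denominator is zero is an assumption added to keep the example well defined; the parameter names are illustrative.

```python
def ratio(numerator, denominator):
    """Safe ratio; a zero denominator yields 0.0 (assumption, not from the source)."""
    return numerator / denominator if denominator else 0.0


def breadth_influence(column_coverage, cluster_count, news_count, reply_count, read_count):
    """Sd_i = W1_i + W2_i + W3_i for one website."""
    w1 = column_coverage                   # column coverage rate
    w2 = ratio(cluster_count, news_count)  # derived-topic rate
    w3 = ratio(reply_count, read_count)    # reply rate
    return w1 + w2 + w3
```

A news item with column coverage 0.5, 2 clusters out of 8 items and 30 replies out of 120 reads scores 0.5 + 0.25 + 0.25 = 1.0.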
Preferably, the current hot news of each news site can be displayed so that the user gains an overall understanding of how current hot news is distributed across news sites, and the distribution of a given hot news item across news websites can be identified. The specific reporting position of the news can also be determined precisely, and the titles of news related to it obtained, making it easy for the user to explore laterally and follow the new topics derived from the news.
Preferably, step S40 includes evaluating news propagation impact, which specifically includes: compiling statistics on the publishing website, forward count, click count and comment count of the news in the acquired news content, together with the news website, specific publishing column, link address, related news titles and corresponding link addresses in the ancillary information, to evaluate the propagation impact of the news. The news content corresponding to the acquired related news titles likewise contains the publishing website, forward count, click count and comment count, and this information can equally be used in evaluating the propagation impact.
Preferably, step S40 includes evaluating news persistence, which specifically includes: compiling statistics on the distribution of the news across websites at different times, the number of items, and the reprint and click counts of the related information in the ancillary information, to evaluate the persistence of the news. For example, a news topic is set and the development of the news is followed continuously, from the occurrence of the news event, through its continued development, to its conclusion. By following the whole process, the characteristics the news event shows at different stages are observed; that is, the sustained influence of the news event is evaluated from a combination of factors such as the distribution of related news across websites at different times, the number of items, and the reprint and click counts of related information.
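The statistics behind the persistence evaluation can be sketched as a per-day aggregation. The text does not say how these figures are combined into a single persistence score, so the sketch stops at the time series; the tuple layout and the name `persistence_series` are assumptions.

```python
from collections import defaultdict


def persistence_series(observations):
    """Aggregate related-news observations per day.

    observations: iterable of (day, website, reprints, clicks) tuples (assumed layout).
    Returns a list of (day, distinct_websites, item_count, reprints, clicks), sorted by day.
    """
    per_day = defaultdict(lambda: {"sites": set(), "items": 0, "reprints": 0, "clicks": 0})
    for day, website, reprints, clicks in observations:
        bucket = per_day[day]
        bucket["sites"].add(website)  # website distribution on that day
        bucket["items"] += 1          # number of related items
        bucket["reprints"] += reprints
        bucket["clicks"] += clicks
    return [
        (day, len(b["sites"]), b["items"], b["reprints"], b["clicks"])
        for day, b in sorted(per_day.items())
    ]
```

Plotting the resulting rows over time would show how long the event keeps drawing coverage and attention.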
Preferably, the method further includes: compiling statistics on the publishing website, reprinting websites, forward count, click count, reprint count and comment count of the news in the acquired related ancillary information to obtain the propagation path of the news, the scope of its spread, the audience situation and so on; and performing text analysis on the corresponding news comments, clustering the content of audience comments to form the viewpoints that the audience holds about the news.
Fig. 2 shows a schematic diagram of the device for evaluating internet news according to an embodiment of the present invention, including:
an acquisition module 10 for obtaining the titles of headline news from designated websites;
a hot-topic module 20 for performing word segmentation and clustering on the titles to determine the hot topics among them;
a search engine 30 for obtaining the content and ancillary information of the headline news corresponding to each hot topic;
an evaluation module 40 for evaluating the acquired news content and ancillary information.
Because the device automatically acquires news, automatically determines the hot topics in it and evaluates them, it overcomes the low efficiency of manual evaluation and improves the efficiency and accuracy of internet news evaluation.
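The four-module structure of Fig. 2 can be sketched as a small pipeline object. This is only an illustration of how modules 10, 20, 30 and 40 might be wired together; the class name `NewsEvaluator`, the callable interfaces and their argument shapes are assumptions.

```python
class NewsEvaluator:
    """Wires four pluggable callables together in the order of modules 10 to 40."""

    def __init__(self, acquire, find_hot_topics, search, evaluate):
        self.acquire = acquire                  # module 10: websites -> titles
        self.find_hot_topics = find_hot_topics  # module 20: titles -> topics
        self.search = search                    # module 30: topic -> (content, extra info)
        self.evaluate = evaluate                # module 40: (content, extra info) -> score

    def run(self, websites):
        """Run the whole pipeline and return a score per detected topic."""
        titles = self.acquire(websites)
        topics = self.find_hot_topics(titles)
        return {topic: self.evaluate(*self.search(topic)) for topic in topics}
```

Keeping each module behind a plain callable makes it straightforward to swap in a different crawler, clustering step or scoring rule.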
Preferably, the evaluation module is used to evaluate the propagation influence of a news item as a weighted average, over the retrieved websites, of the per-website values $InfoD_i$ with weights $W_i$, where $InfoD_i$ represents the propagation influence of the news on website i, $W_i$ is the information influence weight of website i, $InfoD_i = (Sd_i + Hd_i)\,Td_i$, $Sd_i$ represents the propagation breadth influence of the news on website i, $Hd_i$ represents the popularity influence of the news on website i, and $Td_i = e^{-\alpha t}$, where t is the time elapsed from the release of the news to the present day and $\alpha$ is the decay factor.
In summary, the main information processing flow described above yields a preliminary analysis of news propagation, impact and event persistence. Combined with data presentation such as charts, the evaluation of news propagation influence can be obtained more intuitively and conveniently; compared with traditional manual statistics, the present invention greatly improves evaluation efficiency.
Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into an individual integrated circuit module, or several of the modules or steps can be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above are only preferred embodiments of the present invention and are not intended to limit it. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (7)

1. A method for evaluating internet news, characterized by comprising:
obtaining the titles of headline news from designated websites;
performing word segmentation and clustering on the titles to determine the hot topics among them;
obtaining, through a search engine, the content and ancillary information of the headline news corresponding to the hot topics;
evaluating the acquired news content and ancillary information, wherein evaluating the acquired news content and ancillary information includes evaluating news propagation influence, which specifically includes:
evaluating the propagation influence of a news item,
where $InfoD_i$ represents the propagation influence of the news on website i, $W_i$ is the information influence weight of website i, $InfoD_i = (Sd_i + Hd_i)\,Td_i$, $Sd_i$ represents the propagation breadth influence of the news on website i, $Hd_i$ represents the popularity influence of the news on website i, and $Td_i = e^{-\alpha t}$, where t is the time elapsed from the release of the news to the present day and $\alpha$ is the decay factor.
2. The method according to claim 1, characterized in that obtaining, through a search engine, the content and ancillary information of the headline news corresponding to the hot topics comprises:
obtaining the content, including: news release time, news title, news summary, body text, the click count, reprint count and comment count of the news, and the comments on the news;
obtaining the ancillary information, including: the news website, the specific publishing column, the link address, related news titles and the corresponding link addresses, while also obtaining the content of the related news.
3. The method according to claim 1, characterized by further comprising:
evaluating $Hd_i = H1_i + H2_i$,
where $H1_i$ represents the popularity ranking value of the news on website i on the current day, and $H2_i$ represents the difference between the popularity ranking value of the news on website i on the previous day and its popularity ranking value on website i on the current day.
4. The method according to claim 1, characterized by further comprising:
evaluating $Sd_i = W1_i + W2_i + W3_i$,
where $W1_i$ is the column coverage rate of the news on website i; $W2_i$ is the ratio of the number of news clusters for the news on website i to the number of news items on website i; and $W3_i$ is the ratio of the number of replies to the news on website i to the number of reads of the news on website i.
5. The method according to claim 1, characterized in that evaluating the acquired news content and ancillary information includes evaluating news propagation impact, which specifically includes:
compiling statistics on the publishing website, forward count, click count and comment count of the news in the acquired news content, together with the news website, specific publishing column, link address, related news titles and corresponding link addresses in the ancillary information, to evaluate the propagation impact of the news.
6. The method according to claim 1, characterized in that evaluating the acquired news content and ancillary information includes evaluating news persistence, which specifically includes:
compiling statistics on the distribution of the news across websites at different times, the number of items, and the reprint and click counts of the related information in the ancillary information, to evaluate the persistence of the news.
7. A device for evaluating internet news, characterized by comprising:
an acquisition module for obtaining the titles of headline news from designated websites;
a hot-topic module for performing word segmentation and clustering on the titles to determine the hot topics among them;
a search engine for obtaining the content and ancillary information of the headline news corresponding to the hot topics;
an evaluation module for evaluating the acquired news content and ancillary information, the evaluation module being used to evaluate the propagation influence of a news item, where $InfoD_i$ represents the propagation influence of the news on website i, $W_i$ is the information influence weight of website i, $InfoD_i = (Sd_i + Hd_i)\,Td_i$, $Sd_i$ represents the propagation breadth influence of the news on website i, $Hd_i$ represents the popularity influence of the news on website i, and $Td_i = e^{-\alpha t}$, where t is the time elapsed from the release of the news to the present day and $\alpha$ is the decay factor.
CN201210097667.9A 2012-03-31 2012-03-31 Method and device for evaluating internet news Expired - Fee Related CN103365902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210097667.9A CN103365902B (en) 2012-03-31 2012-03-31 The appraisal procedure and device of internet news

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210097667.9A CN103365902B (en) 2012-03-31 2012-03-31 The appraisal procedure and device of internet news

Publications (2)

Publication Number Publication Date
CN103365902A CN103365902A (en) 2013-10-23
CN103365902B true CN103365902B (en) 2017-06-20

Family

ID=49367266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210097667.9A Expired - Fee Related CN103365902B (en) 2012-03-31 2012-03-31 Method and device for evaluating internet news

Country Status (1)

Country Link
CN (1) CN103365902B (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI601088B (en) * 2014-10-06 2017-10-01 Chunghwa Telecom Co Ltd Topic management network public opinion evaluation management system and method
CN104331420A (en) * 2014-10-13 2015-02-04 北京奇虎科技有限公司 Method and device for judging news releasing position significance
CN104657496B (en) * 2015-03-09 2018-08-14 杭州朗和科技有限公司 A kind of method and apparatus calculating heatrate value
CN106815228B (en) * 2015-11-27 2020-03-03 北京国双科技有限公司 Method and device for selecting class name of search keyword
CN105630929B (en) * 2015-12-22 2019-08-30 北京奇虎科技有限公司 Based on the method and device for commenting on determining news recommendation weight
CN106919627A (en) * 2015-12-28 2017-07-04 北京国双科技有限公司 The treating method and apparatus of hot word
CN105824803B (en) * 2016-03-31 2018-10-30 北京奇艺世纪科技有限公司 A kind of determination method and device of focus incident title
CN107632984A (en) * 2016-07-18 2018-01-26 阿里巴巴集团控股有限公司 A kind of cluster data table shows methods, devices and systems
CN107784010B (en) * 2016-08-29 2021-12-17 南京尚网网络科技有限公司 Method and equipment for determining popularity information of news theme
CN106934049B (en) * 2017-03-16 2020-08-07 天闻数媒科技(北京)有限公司 News question selection analysis method and device
CN107239497B (en) * 2017-05-02 2020-11-03 广东万丈金数信息技术股份有限公司 Hot content search method and system
CN107749869A (en) * 2017-09-15 2018-03-02 合肥英泽信息科技有限公司 A kind of news website background management system based on Cloud Server
CN108153818B (en) * 2017-11-29 2021-08-10 成都东方盛行电子有限责任公司 Big data based clustering method
CN108197292A (en) * 2018-01-22 2018-06-22 成都睿码科技有限责任公司 A kind of measure and system of dissemination of news amount
CN108804594A (en) * 2018-05-28 2018-11-13 国家计算机网络与信息安全管理中心 A kind of construction method and device of news content full-text search engine
CN109032906A (en) * 2018-07-17 2018-12-18 郑州升达经贸管理学院 A kind of appraisal procedure and its assessment device of internet news
CN109145246A (en) * 2018-07-31 2019-01-04 成都华栖云科技有限公司 A kind of news virtual click amount implementation method based on paas media cloud multi-tenant platform
CN109325180B (en) * 2018-09-21 2021-01-05 北京字节跳动网络技术有限公司 Article abstract pushing method and device, terminal equipment, server and storage medium
CN109275031B (en) * 2018-09-25 2021-09-28 有米科技股份有限公司 Video popularity evaluation method and device and electronic equipment
CN111949853A (en) * 2019-04-30 2020-11-17 北京智慧星光信息技术有限公司 Monitoring control method for internet information
CN111143688B (en) * 2019-12-31 2021-03-02 南京新一代人工智能研究院有限公司 Evaluation method and system based on mobile news client
CN111523027B (en) * 2020-04-16 2023-08-01 武汉有牛科技有限公司 Automatic data news writing robot based on blockchain technology
CN111506851A (en) * 2020-04-16 2020-08-07 创新奇智(上海)科技有限公司 Portal website grade calculation method, news recommendation method and news recommendation device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100952391B1 (en) * 2005-04-14 2010-04-14 에스케이커뮤니케이션즈 주식회사 System and method for evaluating contents on the internet network and computer readable medium processing the method
CN101122904A (en) * 2006-08-08 2008-02-13 任喜军 Internet webpage value evaluation, balancing method
CN102096680A (en) * 2009-12-15 2011-06-15 北京大学 Method and device for analyzing information validity

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"一种基于信息检索技术的网络影响力分析方法";杨伟杰等;《软件学报》;20090930;第20卷(第9期);论文第2398页第5-8段、2399页第1-3段、2404页倒数第一段以及图1 *
"突发时间热点话题识别系统及关键问题研究";陈莉萍等;《计算机工程与应用》;20111231(第32期);参见论文第20页以及图1 *

Also Published As

Publication number Publication date
CN103365902A (en) 2013-10-23

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220615

Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031

Patentee after: New founder holdings development Co.,Ltd.

Patentee after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

Address before: 100871, Beijing, Haidian District Cheng Fu Road 298, founder building, 5 floor

Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

TR01 Transfer of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170620

CF01 Termination of patent right due to non-payment of annual fee