CN102087648A - Method and system for fetching news comment page
- Publication number
- CN102087648A
- Authority
- CN
- China
- Prior art keywords
- page
- url
- link
- news
- news analysis
- Prior art date
- Legal status
- Granted
Abstract
Description
Rule number | Rule content |
---|---|
1 | Link text contains "comment": feature value increases by 24.5 |
2 | Link text contains "follow-up" or "message" or "comment": feature value increases by 22.5 |
3 | Link text contains "saying" and "sentence": feature value increases by 4 |
4 | Link text contains "saying" and "I": feature value increases by 4 |
5 | Link text contains "online friend": feature value increases by 4 |
6 | Link text contains "issue" or "checking" or "click": feature value increases by 10 |
7 | Link text contains "checking" and "click": feature value increases by 110 |
8 | Link text contains "having" or "all" or "owning" or "other": feature value increases by 10 |
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910242055.2A CN102087648B (en) | 2009-12-03 | 2009-12-03 | Method and system for fetching news comment page |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910242055.2A CN102087648B (en) | 2009-12-03 | 2009-12-03 | Method and system for fetching news comment page |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102087648A (en) | 2011-06-08 |
CN102087648B (en) | 2013-06-19 |
Family
ID=44099461
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200910242055.2A Expired - Fee Related CN102087648B (en) | 2009-12-03 | 2009-12-03 | Method and system for fetching news comment page |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102087648B (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102279894A (en) * | 2011-09-19 | 2011-12-14 | 嘉兴亿言堂信息科技有限公司 | Method for searching, integrating and providing comment information based on semantics and searching system |
CN102722580A (en) * | 2012-06-07 | 2012-10-10 | 杭州电子科技大学 | Method for downloading video comments dynamically generated in video websites |
CN102810110A (en) * | 2012-05-07 | 2012-12-05 | 北京京东世纪贸易有限公司 | Method and system for acquiring web text data |
CN102821088A (en) * | 2012-05-07 | 2012-12-12 | 北京京东世纪贸易有限公司 | System and method for acquiring network data |
CN103488675A (en) * | 2013-07-11 | 2014-01-01 | 哈尔滨工程大学 | Automatic precise extraction device for multi-webpage news comment contents |
CN103593344A (en) * | 2012-08-13 | 2014-02-19 | 北大方正集团有限公司 | Information acquisition method and device |
CN103617229A (en) * | 2013-11-25 | 2014-03-05 | 北京奇虎科技有限公司 | Method and device for establishing relevant-webpage data base |
CN104408198A (en) * | 2014-12-15 | 2015-03-11 | 北京国双科技有限公司 | Method and device for acquiring webpage contents |
CN104504016A (en) * | 2014-12-10 | 2015-04-08 | 河海大学 | User-oriented automatic WEB information extracting method |
CN105138357A (en) * | 2015-08-11 | 2015-12-09 | 中山大学 | Method and device for implementing mobile application operation assistant |
CN107045507A (en) * | 2016-02-05 | 2017-08-15 | 北京国双科技有限公司 | Web page crawl method and device |
CN109241402A (en) * | 2018-07-31 | 2019-01-18 | 成都华栖云科技有限公司 | A kind of virtual comment machine introduction method based on news content |
CN111339242A (en) * | 2020-02-26 | 2020-06-26 | 广东小天才科技有限公司 | Comment data processing method, comment data display method, server and client |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100401301C (en) * | 2006-05-30 | 2008-07-09 | 南京大学 | Body learning based intelligent subject-type network reptile system configuration method |
CN100461184C (en) * | 2007-07-10 | 2009-02-11 | 北京大学 | Subject crawling method based on link hierarchical classification in network search |
CN101441662B (en) * | 2008-11-28 | 2010-12-22 | 北京交通大学 | Topic information acquisition method based on network topology |
CN101561814B (en) * | 2009-05-08 | 2012-05-09 | 华中科技大学 | Topic crawler system based on social labels |
- 2009-12-03: CN application CN200910242055.2A, patent CN102087648B, not active (Expired - Fee Related)
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102279894B (en) * | 2011-09-19 | 2013-01-09 | 嘉兴亿言堂信息科技有限公司 | Method for searching, integrating and providing comment information based on semantics and searching system |
CN102279894A (en) * | 2011-09-19 | 2011-12-14 | 嘉兴亿言堂信息科技有限公司 | Method for searching, integrating and providing comment information based on semantics and searching system |
CN102810110B (en) * | 2012-05-07 | 2015-08-05 | 北京京东世纪贸易有限公司 | Obtain the method and system of network text data |
CN102810110A (en) * | 2012-05-07 | 2012-12-05 | 北京京东世纪贸易有限公司 | Method and system for acquiring web text data |
CN102821088A (en) * | 2012-05-07 | 2012-12-12 | 北京京东世纪贸易有限公司 | System and method for acquiring network data |
CN102821088B (en) * | 2012-05-07 | 2015-12-16 | 北京京东世纪贸易有限公司 | Obtain the system and method for network data |
CN102722580A (en) * | 2012-06-07 | 2012-10-10 | 杭州电子科技大学 | Method for downloading video comments dynamically generated in video websites |
CN103593344A (en) * | 2012-08-13 | 2014-02-19 | 北大方正集团有限公司 | Information acquisition method and device |
CN103593344B (en) * | 2012-08-13 | 2016-09-21 | 北大方正集团有限公司 | A kind of information collecting method and device |
CN103488675A (en) * | 2013-07-11 | 2014-01-01 | 哈尔滨工程大学 | Automatic precise extraction device for multi-webpage news comment contents |
CN103617229A (en) * | 2013-11-25 | 2014-03-05 | 北京奇虎科技有限公司 | Method and device for establishing relevant-webpage data base |
CN104504016A (en) * | 2014-12-10 | 2015-04-08 | 河海大学 | User-oriented automatic WEB information extracting method |
CN104408198A (en) * | 2014-12-15 | 2015-03-11 | 北京国双科技有限公司 | Method and device for acquiring webpage contents |
CN104408198B (en) * | 2014-12-15 | 2018-07-17 | 北京国双科技有限公司 | The acquisition methods and device of Webpage content |
CN105138357A (en) * | 2015-08-11 | 2015-12-09 | 中山大学 | Method and device for implementing mobile application operation assistant |
CN105138357B (en) * | 2015-08-11 | 2018-05-01 | 中山大学 | A kind of implementation method and its device of mobile application operation assistant |
CN107045507A (en) * | 2016-02-05 | 2017-08-15 | 北京国双科技有限公司 | Web page crawl method and device |
CN107045507B (en) * | 2016-02-05 | 2020-08-21 | 北京国双科技有限公司 | Webpage crawling method and device |
CN109241402A (en) * | 2018-07-31 | 2019-01-18 | 成都华栖云科技有限公司 | A kind of virtual comment machine introduction method based on news content |
CN111339242A (en) * | 2020-02-26 | 2020-06-26 | 广东小天才科技有限公司 | Comment data processing method, comment data display method, server and client |
Also Published As
Publication number | Publication date |
---|---|
CN102087648B (en) | 2013-06-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102087648B (en) | Method and system for fetching news comment page | |
CN107229668B (en) | Text extraction method based on keyword matching | |
CN102622445B (en) | User interest perception based webpage push system and webpage push method | |
Bellaachia et al. | Ne-rank: A novel graph-based keyphrase extraction in twitter | |
CN103365924B (en) | A kind of method of internet information search, device and terminal | |
CN106202294B (en) | Related news computing method and device based on keyword and topic model fusion | |
WO2016000555A1 (en) | Methods and systems for recommending social network-based content and news | |
TWI695277B (en) | Automatic website data collection method | |
WO2015149533A1 (en) | Method and device for word segmentation processing on basis of webpage content classification | |
CN107544988B (en) | Method and device for acquiring public opinion data | |
CN103365839A (en) | Recommendation search method and device for search engines | |
CN103294681B (en) | Method and device for generating search result | |
CN103853834B (en) | Text structure analysis-based Web document abstract generation method | |
CN102929928A (en) | Multidimensional-similarity-based personalized news recommendation method | |
CN101894102A (en) | Method and device for analyzing emotion tendentiousness of subjective text | |
CN103336766A (en) | Short text garbage identification and modeling method and device | |
CN106126502A (en) | A kind of emotional semantic classification system and method based on support vector machine | |
CN105138558A (en) | User access content-based real-time personalized information collection method | |
CN103186574A (en) | Method and device for generating searching result | |
Han et al. | HIT at TREC 2012 Microblog Track. | |
CN111104801B (en) | Text word segmentation method, system, equipment and medium based on website domain name | |
CN110457579B (en) | Webpage denoising method and system based on cooperative work of template and classifier | |
CN110134788B (en) | Microblog release optimization method and system based on text mining | |
CN104268230A (en) | Method for detecting objective points of Chinese micro-blogs based on heterogeneous graph random walk | |
CN105512333A (en) | Product comment theme searching method based on emotional tendency |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right |
Effective date of registration: 20220921
Address after: 100871 No. 5, the Summer Palace Road, Beijing, Haidian District
Patentee after: Peking University
Patentee after: New founder holdings development Co.,Ltd.
Patentee after: BEIJING FOUNDER ELECTRONICS CHIEF INFORMATION TECHNOLOGY Co.,Ltd.
Patentee after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.
Address before: 100871 No. 5, the Summer Palace Road, Beijing, Haidian District
Patentee before: Peking University
Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.
Patentee before: BEIJING FOUNDER ELECTRONICS CHIEF INFORMATION TECHNOLOGY Co.,Ltd.
Patentee before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20130619 |