CN100412866C - Method for automatic discovery of web content citations - Google Patents
Method for automatic discovery of web content citations
- Publication number
- CN100412866C (application CNB2005101096002A / CN200510109600A)
- Authority
- CN
- China
- Prior art keywords
- content
- web
- quotation
- web site
- cited
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2005101096002A CN100412866C (zh) | 2005-10-28 | 2005-10-28 | Method for automatic discovery of web content citations |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2005101096002A CN100412866C (zh) | 2005-10-28 | 2005-10-28 | Method for automatic discovery of web content citations |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1770159A (zh) | 2006-05-10 |
CN100412866C (zh) | 2008-08-20 |
Family
ID=36751460
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2005101096002A Expired - Fee Related CN100412866C (zh) | Method for automatic discovery of web content citations | 2005-10-28 | 2005-10-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100412866C (zh) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
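CN1845134B (zh) * | 2006-05-16 | 2010-05-12 | Wuhan University | Computer-network-based monitoring method for preventing reprinting and/or plagiarism |
CN101187925B (zh) * | 2006-11-17 | 2010-11-03 | Beijing Kuxun Technology Co., Ltd. | Crawling method for automatically optimizing a crawler |
CN101231641B (zh) * | 2007-01-22 | 2010-05-19 | Peking University Founder Group Co., Ltd. | Method and system for automatically analyzing the propagation of hot topics on the Internet |
CN100498790C (zh) * | 2007-02-06 | 2009-06-10 | Tencent Technology (Shenzhen) Co., Ltd. | Search method and system |
WO2010022301A2 (en) * | 2008-08-21 | 2010-02-25 | Dolby Laboratories Licensing Corporation | Networking with media fingerprints |
CN101355587B (zh) * | 2008-09-17 | 2012-05-23 | Hangzhou H3C Technologies Co., Ltd. | URL information acquisition method and apparatus, and search engine implementation method and system |
CN101980529A (zh) * | 2010-09-21 | 2011-02-23 | 天栢宽带网络科技(上海)有限公司 | Video service system supporting triple-network convergence |
CN103281213B (zh) * | 2013-04-18 | 2016-04-06 | Xi'an Jiaotong University | Method for extracting, analyzing, and retrieving network traffic content |
CN103716690B (zh) * | 2013-12-27 | 2017-09-01 | Guangzhou Huaduo Network Technology Co., Ltd. | Method, terminal, server, and system for reporting multimedia live broadcasts |
CN104133868B (zh) * | 2014-07-21 | 2018-01-05 | Xiamen University | Strategy for classifying and integrating vertical crawler data |
CN108829659B (zh) * | 2018-05-04 | 2021-02-09 | Beijing Zhongke Wenge Technology Co., Ltd. | Citation identification method, device, and computer-storable medium |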
Non-Patent Citations (5)
Title |
---|
WWW-based text information mining. Zou Tao, Huang Yuan, Zhang Fuyan. 情报学报, Vol. 18, No. 4. 1999 |
A brief discussion of web information mining. Gao Yue, Liang Benliang. 通讯电源技术, Vol. 21, No. 1. 2004 |
Also Published As
Publication number | Publication date |
---|---|
CN1770159A (zh) | 2006-05-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100412866C (zh) | Method for automatic discovery of web content citations | |
US8015162B2 (en) | Detecting duplicate and near-duplicate files | |
CN107977575B (zh) | Code composition analysis system and method based on a private cloud platform | |
US8458207B2 (en) | Using anchor text to provide context | |
US8185530B2 (en) | Method and system for web document clustering | |
Shinzato et al. | Tsubaki: An open search engine infrastructure for developing information access methodology | |
WO2020164276A1 (zh) | Web page data crawling method, apparatus, system, and computer-readable storage medium | |
JP4919515B2 (ja) | Duplicate document detection and display functions | |
US20070022085A1 (en) | Techniques for unsupervised web content discovery and automated query generation for crawling the hidden web | |
CN106685936B (zh) | Method and apparatus for detecting web page tampering | |
US20040167876A1 (en) | Method and apparatus for improved web scraping | |
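CN1609845A (zh) | Method and apparatus for improving the readability of machine-generated summaries | |
CN102436563A (zh) | Method and apparatus for detecting page tampering | |
CN102446255A (zh) | Method and apparatus for detecting page tampering | |
CN102591965A (zh) | Method and apparatus for detecting black links (hidden links) | |
CN108416034B (zh) | Information collection system based on heterogeneous financial big data and control method thereof | |
EP1677215B1 (en) | Methods and apparatus for the evaluation of aspects of a web page | |
CN104281619A (zh) | Search result ranking system and method | |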
US8521746B1 (en) | Detection of bounce pad sites | |
Jadidoleslamy | Search result merging and ranking strategies in meta-search engines: a survey | |
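CN104778232B (zh) | Method and apparatus for optimizing search results based on long queries | |
CN106599304B (zh) | Modular method for modeling user search intent for small and medium-sized websites | |
CN114880540A (zh) | Intelligent reminder method based on smart-finance text reviews | |
JP2007188134A (ja) | Document search method using an index file | |
KR101079802B1 (ko) | Website search method and system, website search apparatus, and recording medium therefor |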
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 2022-09-13. Address after: 3007, Hengqin International Financial Center Building, No. 58, Huajin Street, Hengqin New Area, Zhuhai, Guangdong 519031. Patentees after: New Founder Holdings Development Co., Ltd.; Peking University Founder R&D Center; Peking University. Address before: 100871, Fangzheng Building, 298 Fu Cheng Road, Beijing, Haidian District. Patentees before: Peking University Founder Group Co., Ltd.; Peking University Founder R&D Center; Peking University |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 2008-08-20 |