CN1770159A - Method for automatically discovering web content citations - Google Patents
Method for automatically discovering web content citations
- Publication number
- CN1770159A (application CN200510109600.2A)
- Authority
- CN
- China
- Prior art keywords
- content
- web
- web site
- automatically finding
- quotation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
Concept ID | Name | Type | Query match | Sections | Count |
---|---|---|---|---|---|
238000000034 | method | Methods | 0.000 | title, claims, abstract, description | 45 |
238000004458 | analytical method | Methods | 0.000 | claims, description | 18 |
238000005516 | engineering process | Methods | 0.000 | claims, description | 10 |
230000001186 | cumulative effect | Effects | 0.000 | claims | 1 |
230000000694 | effects | Effects | 0.000 | description | 3 |
238000002474 | experimental method | Methods | 0.000 | description | 3 |
230000006399 | behavior | Effects | 0.000 | description | 1 |
238000004364 | calculation method | Methods | 0.000 | description | 1 |
230000002950 | deficient | Effects | 0.000 | description | 1 |
238000001514 | detection method | Methods | 0.000 | description | 1 |
238000010586 | diagram | Methods | 0.000 | description | 1 |
230000003203 | everyday effect | Effects | 0.000 | description | 1 |
238000000605 | extraction | Methods | 0.000 | description | 1 |
230000010365 | information processing | Effects | 0.000 | description | 1 |
239000000463 | material | Substances | 0.000 | description | 1 |
238000004088 | simulation | Methods | 0.000 | description | 1 |
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2005101096002A CN100412866C (zh) | 2005-10-28 | 2005-10-28 | Method for automatically discovering web content citations |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CNB2005101096002A CN100412866C (zh) | 2005-10-28 | 2005-10-28 | Method for automatically discovering web content citations |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1770159A (zh) | 2006-05-10 |
CN100412866C (zh) | 2008-08-20 |
Family
ID=36751460
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNB2005101096002A (Expired - Fee Related) CN100412866C (zh) | Method for automatically discovering web content citations | 2005-10-28 | 2005-10-28 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN100412866C (zh) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
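WO2008098502A1 (fr) * | 2007-02-06 | 2008-08-21 | Tencent Technology (Shenzhen) Company Limited | Method and device for creating an index, and retrieval method and system |
CN1845134B (zh) * | 2006-05-16 | 2010-05-12 | Wuhan University | Computer-network-based monitoring method for preventing reposting and/or plagiarism |
CN101231641B (zh) * | 2007-01-22 | 2010-05-19 | Peking University Founder Group Co., Ltd. | Method and system for automatically analyzing the propagation of hot topics on the Internet |
CN101187925B (zh) * | 2006-11-17 | 2010-11-03 | Beijing Kuxun Technology Co., Ltd. | Fetching method for automatically optimizing a crawler |
CN101980529A (zh) * | 2010-09-21 | 2011-02-23 | Tianbai Broadband Network Technology (Shanghai) Co., Ltd. | Video service system supporting triple-network convergence |
CN102216945A (zh) * | 2008-08-21 | 2011-10-12 | Dolby Laboratories Licensing Corporation | Networking with media fingerprints |
CN101355587B (zh) * | 2008-09-17 | 2012-05-23 | Hangzhou H3C Technologies Co., Ltd. | URL information acquisition method and device, and search engine implementation method and system |
CN103281213A (zh) * | 2013-04-18 | 2013-09-04 | Xi'an Jiaotong University | Method for extracting, analyzing, and retrieving network traffic content |
CN103716690A (zh) * | 2013-12-27 | 2014-04-09 | Guangzhou Huaduo Network Technology Co., Ltd. | Method, terminal, server, and system for reporting multimedia live broadcasts |
CN104133868A (zh) * | 2014-07-21 | 2014-11-05 | Xiamen University | Strategy for classifying and integrating vertical crawler data |
CN108829659A (zh) * | 2018-05-04 | 2018-11-16 | Beijing Zhongke Wenge Technology Co., Ltd. | Citation identification method, device, and computer-storable medium |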
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
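CN1845134B (zh) * | 2006-05-16 | 2010-05-12 | Wuhan University | Computer-network-based monitoring method for preventing reposting and/or plagiarism |
CN101187925B (zh) * | 2006-11-17 | 2010-11-03 | Beijing Kuxun Technology Co., Ltd. | Fetching method for automatically optimizing a crawler |
CN101231641B (zh) * | 2007-01-22 | 2010-05-19 | Peking University Founder Group Co., Ltd. | Method and system for automatically analyzing the propagation of hot topics on the Internet |
WO2008098502A1 (fr) * | 2007-02-06 | 2008-08-21 | Tencent Technology (Shenzhen) Company Limited | Method and device for creating an index, and retrieval method and system |
US9684907B2 (en) | 2008-08-21 | 2017-06-20 | Dolby Laboratories Licensing Corporation | Networking with media fingerprints |
CN102216945A (zh) * | 2008-08-21 | 2011-10-12 | Dolby Laboratories Licensing Corporation | Networking with media fingerprints |
CN102216945B (zh) * | 2008-08-21 | 2013-04-17 | Dolby Laboratories Licensing Corporation | Networking with media fingerprints |
CN101355587B (zh) * | 2008-09-17 | 2012-05-23 | Hangzhou H3C Technologies Co., Ltd. | URL information acquisition method and device, and search engine implementation method and system |
CN101980529A (zh) * | 2010-09-21 | 2011-02-23 | Tianbai Broadband Network Technology (Shanghai) Co., Ltd. | Video service system supporting triple-network convergence |
CN103281213B (zh) * | 2013-04-18 | 2016-04-06 | Xi'an Jiaotong University | Method for extracting, analyzing, and retrieving network traffic content |
CN103281213A (zh) * | 2013-04-18 | 2013-09-04 | Xi'an Jiaotong University | Method for extracting, analyzing, and retrieving network traffic content |
CN103716690A (zh) * | 2013-12-27 | 2014-04-09 | Guangzhou Huaduo Network Technology Co., Ltd. | Method, terminal, server, and system for reporting multimedia live broadcasts |
CN104133868A (zh) * | 2014-07-21 | 2014-11-05 | Xiamen University | Strategy for classifying and integrating vertical crawler data |
CN104133868B (zh) * | 2014-07-21 | 2018-01-05 | Xiamen University | Strategy for classifying and integrating vertical crawler data |
CN108829659A (zh) * | 2018-05-04 | 2018-11-16 | Beijing Zhongke Wenge Technology Co., Ltd. | Citation identification method, device, and computer-storable medium |
CN108829659B (zh) * | 2018-05-04 | 2021-02-09 | Beijing Zhongke Wenge Technology Co., Ltd. | Citation identification method, device, and computer-storable medium |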
Also Published As
Publication number | Publication date |
---|---|
CN100412866C (zh) | 2008-08-20 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN1770159A (zh) | Method for automatically discovering web content citations | |
US8015162B2 (en) | Detecting duplicate and near-duplicate files | |
CN107977575B (zh) | Code composition analysis system and method based on a private cloud platform | |
US8458207B2 (en) | Using anchor text to provide context | |
US6615209B1 (en) | Detecting query-specific duplicate documents | |
US6959326B1 (en) | Method, system, and program for gathering indexable metadata on content at a data repository | |
WO2020164276A1 (zh) | Web page data crawling method, device, system, and computer-readable storage medium | |
CN102054028B (zh) | Method for implementing page rendering in a web crawler system | |
US20060294052A1 (en) | Unsupervised, automated web host dynamicity detection, dead link detection and prerequisite page discovery for search indexed web pages | |
US20090070366A1 (en) | Method and system for web document clustering | |
US20070022085A1 (en) | Techniques for unsupervised web content discovery and automated query generation for crawling the hidden web | |
US20040193636A1 (en) | Method for identifying related pages in a hyperlinked database | |
KR20060048778A (ko) | Phrase-based searching in an information retrieval system | |
KR20060048779A (ko) | Phrase identification in an information retrieval system | |
CN1728134A (zh) | Hypertext-based multilingual network information search method and system | |
CN102779169A (zh) | Method and device for extracting web page body text based on HTML tags | |
US8001462B1 (en) | Updating search engine document index based on calculated age of changed portions in a document | |
EP1677215B1 (en) | Methods and apparatus for the evaluation of aspects of a web page | |
Wills et al. | Studying the impact of more complete server information on web caching | |
US8521746B1 (en) | Detection of bounce pad sites | |
Jadidoleslamy | Search result merging and ranking strategies in meta-search engines: a survey | |
Peshave et al. | How search engines work: And a web crawler application | |
CN110245275B (zh) | Method for fast normalization of large-scale similar news headlines | |
CN1677389A (zh) | Intelligent mobile Internet information search engine based on keyword search | |
Qinghua | Application of WebCrawler in Information Search and Data Mining | |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
C14 | Grant of patent or utility model | |
GR01 | Patent grant | |
TR01 | Transfer of patent right | Effective date of registration: 20220913; Address after: 3007, Hengqin international financial center building, No. 58, Huajin street, Hengqin new area, Zhuhai, Guangdong 519031; Patentee after: New founder holdings development Co.,Ltd.; Patentee after: PEKING University FOUNDER R & D CENTER; Patentee after: Peking University; Address before: 100871, fangzheng building, 298 Fu Cheng Road, Beijing, Haidian District; Patentee before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.; Patentee before: PEKING University FOUNDER R & D CENTER; Patentee before: Peking University |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20080820 |