CN101499098B - Method and system for determining and applying a webpage evaluation value - Google Patents

Method and system for determining and applying a webpage evaluation value

Info

Publication number
CN101499098B
Authority
CN
China
Prior art keywords
evaluation value
webpage
same
web page
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2009101181501A
Other languages
English (en)
Chinese (zh)
Other versions
CN101499098A (zh)
Inventor
陈华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN2009101181501A (CN101499098B)
Publication of CN101499098A
Priority to HK10100369.1A (HK1132819B)
Priority to US12/660,606 (US8364667B2)
Priority to EP10749048A (EP2404267A4)
Priority to PCT/US2010/000648 (WO2010101634A1)
Priority to JP2011552939A (JP5329680B2)
Application granted
Publication of CN101499098B
Priority to US13/683,155 (US8788489B2)
Priority to US14/304,674 (US9223880B2)
Legal status: Expired - Fee Related (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G06F16/24578 Query processing with adaptation to user needs using ranking
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/338 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
CN2009101181501A 2009-03-04 2009-03-04 Method and system for determining and applying a webpage evaluation value Expired - Fee Related CN101499098B (zh)

Priority Applications (8)

Application Number Priority Date Filing Date Title
CN2009101181501A CN101499098B (zh) 2009-03-04 2009-03-04 Method and system for determining and applying a webpage evaluation value
HK10100369.1A HK1132819B (en) 2010-01-13 A method and system for determining and applying a webpage evaluation value
US12/660,606 US8364667B2 (en) 2009-03-04 2010-03-01 Evaluation of web pages
PCT/US2010/000648 WO2010101634A1 (en) 2009-03-04 2010-03-02 Evaluation of web pages
EP10749048A EP2404267A4 (en) 2009-03-04 2010-03-02 EVALUATION OF INTERNET PAGES
JP2011552939A JP5329680B2 (ja) 2009-03-04 2010-03-02 Evaluation of web pages
US13/683,155 US8788489B2 (en) 2009-03-04 2012-11-21 Evaluation of web pages
US14/304,674 US9223880B2 (en) 2009-03-04 2014-06-13 Evaluation of web pages

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2009101181501A CN101499098B (zh) 2009-03-04 2009-03-04 Method and system for determining and applying a webpage evaluation value

Publications (2)

Publication Number Publication Date
CN101499098A (zh) 2009-08-05
CN101499098B (zh) 2012-07-11

Family

ID=40946170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009101181501A Expired - Fee Related CN101499098B (zh) 2009-03-04 2009-03-04 Method and system for determining and applying a webpage evaluation value

Country Status (5)

Country Link
US (3) US8364667B2 (en)
EP (1) EP2404267A4 (en)
JP (1) JP5329680B2 (en)
CN (1) CN101499098B (en)
WO (1) WO2010101634A1 (en)

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101499098B (zh) * 2009-03-04 2012-07-11 阿里巴巴集团控股有限公司 Method and system for determining and applying a webpage evaluation value
US8645457B2 (en) * 2009-10-05 2014-02-04 Tynt Multimedia Inc. System and method for network object creation and improved search result reporting
US20110289216A1 (en) * 2010-05-21 2011-11-24 Timothy Szeto System and Method for Generating Subnets and Using Such Subnets for Controlling Access to Web Content
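CN102314435A (zh) * 2010-06-30 2012-01-11 腾讯科技(深圳)有限公司 Method and system for searching web page content
CN101969445B (zh) * 2010-11-03 2014-12-17 中国电信股份有限公司 Method and device for defending against DDoS and CC attacks
CN102231165B (zh) * 2011-07-11 2013-01-09 浙江大学 Personalized web search ranking method based on analysis of user dwell time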
US9189563B2 (en) 2011-11-02 2015-11-17 Microsoft Technology Licensing, Llc Inheritance of rules across hierarchical levels
US9558274B2 (en) * 2011-11-02 2017-01-31 Microsoft Technology Licensing, Llc Routing query results
US8909628B1 (en) * 2012-01-24 2014-12-09 Google Inc. Detecting content scraping
US9191291B2 (en) * 2012-09-14 2015-11-17 Salesforce.Com, Inc. Detection and handling of aggregated online content using decision criteria to compare similar or identical content items
US11386181B2 (en) * 2013-03-15 2022-07-12 Webroot, Inc. Detecting a change to the content of information displayed to a user of a website
US11928606B2 (en) 2013-03-15 2024-03-12 TSG Technologies, LLC Systems and methods for classifying electronic documents
US9298814B2 (en) 2013-03-15 2016-03-29 Maritz Holdings Inc. Systems and methods for classifying electronic documents
CN103177106A (zh) * 2013-03-27 2013-06-26 百度在线网络技术(北京)有限公司 Retrieval method and device
US9411786B2 (en) * 2013-07-08 2016-08-09 Adobe Systems Incorporated Method and apparatus for determining the relevancy of hyperlinks
CN103399957A (zh) * 2013-08-21 2013-11-20 百度在线网络技术(北京)有限公司 Search method, system, search engine and client
CN104571935A (zh) * 2013-10-18 2015-04-29 宇宙互联有限公司 Global scheduling system and method
CN104572340A (zh) * 2013-10-18 2015-04-29 宇宙互联有限公司 Incremental backup system and method
CN103605704B (zh) * 2013-11-08 2017-02-01 深圳大学 Method for indexing and retrieving arbitrary fields of large volumes of URL data
CN103902687B (zh) * 2014-03-25 2017-07-04 百度在线网络技术(北京)有限公司 Method and device for generating search results
CN104090976B (zh) * 2014-07-21 2017-06-23 北京奇虎科技有限公司 Method and device for search engine crawlers to fetch web pages
CN105630802A (zh) * 2014-10-30 2016-06-01 阿里巴巴集团控股有限公司 Web page deduplication method and device
CN105447081A (zh) * 2015-11-04 2016-03-30 国云科技股份有限公司 Government affairs public opinion monitoring method for cloud platforms
CN106776609B (zh) * 2015-11-19 2020-05-22 北京国双科技有限公司 Method and device for counting the number of website reposts
US10235426B2 (en) * 2016-06-29 2019-03-19 International Business Machines Corporation Proposing a copy area in a document
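CN107168997B (zh) * 2017-03-30 2021-07-20 百度在线网络技术(北京)有限公司 Artificial-intelligence-based method, device and storage medium for evaluating web page originality
CN107357891A (zh) * 2017-07-12 2017-11-17 中云开源数据技术(上海)有限公司 Homepage link recommendation method
CN110569335B (zh) 2018-03-23 2022-05-27 百度在线网络技术(北京)有限公司 Artificial-intelligence-based triple verification method, device and storage medium
CN113763167B (zh) * 2021-08-11 2023-11-17 杭州盈火网络科技有限公司 Blacklist mining method based on complex networks
CN116450634B (zh) * 2023-06-15 2023-09-29 中新宽维传媒科技有限公司 Data source weight evaluation method and related device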

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6658423B1 (en) * 2001-01-24 2003-12-02 Google, Inc. Detecting duplicate and near-duplicate files
CN101154224A (zh) * 2006-09-30 2008-04-02 阿里巴巴公司 Web address navigation method and system

Family Cites Families (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5544352A (en) 1993-06-14 1996-08-06 Libertech, Inc. Method and apparatus for indexing, searching and displaying data
US5933827A (en) 1996-09-25 1999-08-03 International Business Machines Corporation System for identifying new web pages of interest to a user
US6144962A (en) 1996-10-15 2000-11-07 Mercury Interactive Corporation Visualization of web sites and hierarchical data structures
US6012087A (en) 1997-01-14 2000-01-04 Netmind Technologies, Inc. Unique-change detection of dynamic web pages using history tables of signatures
US6421675B1 (en) 1998-03-16 2002-07-16 S. L. I. Systems, Inc. Search engine
US7308413B1 (en) * 1999-05-05 2007-12-11 Tota Michael J Process for creating media content based upon submissions received on an electronic multi-media exchange
US6269361B1 (en) * 1999-05-28 2001-07-31 Goto.Com System and method for influencing a position on a search result list generated by a computer network search engine
US6832222B1 (en) 1999-06-24 2004-12-14 International Business Machines Corporation Technique for ensuring authorized access to the content of dynamic web pages stored in a system cache
US6675170B1 (en) 1999-08-11 2004-01-06 Nec Laboratories America, Inc. Method to efficiently partition large hyperlinked databases by hyperlink structure
US6643641B1 (en) 2000-04-27 2003-11-04 Russell Snyder Web search engine with graphic snapshots
US6785666B1 (en) 2000-07-11 2004-08-31 Revenue Science, Inc. Method and system for parsing navigation information
US6757675B2 (en) * 2000-07-24 2004-06-29 The Regents Of The University Of California Method and apparatus for indexing document content and content comparison with World Wide Web search service
US20050060643A1 (en) * 2003-08-25 2005-03-17 Miavia, Inc. Document similarity detection and classification system
US7346839B2 (en) * 2003-09-30 2008-03-18 Google Inc. Information retrieval based on historical data
GB2430507A (en) * 2005-09-21 2007-03-28 Stephen Robert Ives System for managing the display of sponsored links together with search results on a mobile/wireless device
US7904725B2 (en) * 2006-03-02 2011-03-08 Microsoft Corporation Verification of electronic signatures
WO2007137232A2 (en) * 2006-05-20 2007-11-29 Personics Holdings Inc. Method of modifying audio content
US7660804B2 (en) 2006-08-16 2010-02-09 Microsoft Corporation Joint optimization of wrapper generation and template detection
US9654495B2 (en) * 2006-12-01 2017-05-16 Websense, Llc System and method of analyzing web addresses
US7676520B2 (en) * 2007-04-12 2010-03-09 Microsoft Corporation Calculating importance of documents factoring historical importance
US20080288509A1 (en) * 2007-05-16 2008-11-20 Google Inc. Duplicate content search
US10698886B2 (en) * 2007-08-14 2020-06-30 John Nicholas And Kristin Gross Trust U/A/D Temporal based online search and advertising
US20090327278A1 (en) * 2008-06-26 2009-12-31 Baran-Sneh Alex System and method for ranking web content
KR101086530B1 (ko) * 2008-10-02 2011-11-23 엔에이치엔(주) 웹 문서 원본 판별 방법 및 시스템, 이를 위한 웹 문서 이력 정보 제공 방법 및 시스템
US8695091B2 (en) * 2009-02-11 2014-04-08 Sophos Limited Systems and methods for enforcing policies for proxy website detection using advertising account ID
CN101499098B (zh) * 2009-03-04 2012-07-11 阿里巴巴集团控股有限公司 Method and system for determining and applying a webpage evaluation value

Also Published As

Publication number Publication date
US20100228718A1 (en) 2010-09-09
US9223880B2 (en) 2015-12-29
US8364667B2 (en) 2013-01-29
HK1132819A1 (en) 2010-03-05
JP5329680B2 (ja) 2013-10-30
US8788489B2 (en) 2014-07-22
US20150006506A1 (en) 2015-01-01
EP2404267A4 (en) 2012-12-05
EP2404267A1 (en) 2012-01-11
WO2010101634A1 (en) 2010-09-10
US20130144873A1 (en) 2013-06-06
JP2012519901A (ja) 2012-08-30
CN101499098A (zh) 2009-08-05

Similar Documents

Publication Publication Date Title
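CN101499098B (zh) Method and system for determining and applying a webpage evaluation value
US8156152B2 (en) Content oriented index and search method and system
CN102761627B (zh) Cloud website recommendation method, system and related devices based on terminal access statistics
US7630973B2 (en) Method for identifying related pages in a hyperlinked database
TWI512506B (zh) Sorting method and device for search results
JP4919515B2 (ja) Duplicate document detection and display function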
US8150846B2 (en) Content searching and configuration of search results
US9031960B1 (en) Query image search
US10025855B2 (en) Federated community search
JP5494454B2 (ja) Search result generation method, search result generation program, and search system
US7860971B2 (en) Anti-spam tool for browser
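CN103744856B (zh) Linked extended search method, device and system
CN101641694A (zh) Federated search implemented by several search engines
AU2011227327A1 (en) Indexing and searching employing virtual documents
CN102937975B (zh) Web page search device and method
CN113127596B (zh) Full-text retrieval method, system, electronic device and storage medium
CN102654879B (zh) Search method and device
CN100524300C (zh) Content oriented index and search method and system
CN103678601A (zh) Method and device for processing a model essay retrieval request
CN103902687A (zh) Method and device for generating search results
TWI497322B (zh) The method of determining and using the method of web page evaluation
CN101196911A (zh) Method, system and device for selecting a resource real name
CN101408881A (zh) Method and system for generating a content signature of a binary file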
Lee et al. Geographically-sensitive link analysis
HK1132819B (en) A method and system for determining and applying a webpage evaluation value

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 1132819

Country of ref document: HK

C14 Grant of patent or utility model
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: GR

Ref document number: 1132819

Country of ref document: HK

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120711