CN100585590C - Search engine with hierarchically stored indices - Google Patents
- Publication number
- CN100585590C (application number CN200480033085A)
- Authority
- CN
- China
- Prior art keywords
- search
- data item
- index
- database
- logically
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99937—Sorting
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US10/705,641 US7240064B2 (en) | 2003-11-10 | 2003-11-10 | Search engine with hierarchically stored indices |
| US10/705,641 | 2003-11-10 | | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN101189602A (zh) | 2008-05-28 |
| CN100585590C (zh) | 2010-01-27 |
Family
ID=34552416
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN200480033085A (Expired - Fee Related; granted as CN100585590C (zh)) | 2003-11-10 | 2004-11-09 | Search engine with hierarchically stored indices |
Country Status (6)
| Country | Link |
|---|---|
| US (1) | US7240064B2 |
| EP (1) | EP1682993A4 |
| JP (1) | JP4699379B2 |
| KR (1) | KR100828232B1 |
| CN (1) | CN100585590C |
| WO (1) | WO2005048069A2 |
Families Citing this family (69)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US7536713B1 (en) | 2002-12-11 | 2009-05-19 | Alan Bartholomew | Knowledge broadcasting and classification system |
| US7725452B1 (en) | 2003-07-03 | 2010-05-25 | Google Inc. | Scheduler for search engine crawler |
| US8042112B1 (en) | 2003-07-03 | 2011-10-18 | Google Inc. | Scheduler for search engine crawler |
| US7516086B2 (en) * | 2003-09-24 | 2009-04-07 | Idearc Media Corp. | Business rating placement heuristic |
| US7822661B1 (en) | 2003-09-24 | 2010-10-26 | SuperMedia LLC | Information distribution system and method utilizing a position adjustment factor |
| US7293016B1 (en) * | 2004-01-22 | 2007-11-06 | Microsoft Corporation | Index partitioning based on document relevance for document indexes |
| US8055553B1 (en) | 2006-01-19 | 2011-11-08 | Verizon Laboratories Inc. | Dynamic comparison text functionality |
| US7580921B2 (en) * | 2004-07-26 | 2009-08-25 | Google Inc. | Phrase identification in an information retrieval system |
| US7711679B2 (en) | 2004-07-26 | 2010-05-04 | Google Inc. | Phrase-based detection of duplicate documents in an information retrieval system |
| US7702618B1 (en) | 2004-07-26 | 2010-04-20 | Google Inc. | Information retrieval system for archiving multiple document versions |
| US7584175B2 (en) | 2004-07-26 | 2009-09-01 | Google Inc. | Phrase-based generation of document descriptions |
| US7580929B2 (en) * | 2004-07-26 | 2009-08-25 | Google Inc. | Phrase-based personalization of searches in an information retrieval system |
| US7567959B2 (en) * | 2004-07-26 | 2009-07-28 | Google Inc. | Multiple index based information retrieval system |
| US7599914B2 (en) | 2004-07-26 | 2009-10-06 | Google Inc. | Phrase-based searching in an information retrieval system |
| US7536408B2 (en) | 2004-07-26 | 2009-05-19 | Google Inc. | Phrase-based indexing in an information retrieval system |
| US7987172B1 (en) * | 2004-08-30 | 2011-07-26 | Google Inc. | Minimizing visibility of stale content in web searching including revising web crawl intervals of documents |
| US9189481B2 (en) * | 2005-05-06 | 2015-11-17 | John M. Nelson | Database and index organization for enhanced document retrieval |
| US7685107B2 (en) * | 2005-06-07 | 2010-03-23 | International Business Machines Corporation | Apparatus, system, and method for scanning a partitioned data set |
| US7831474B2 (en) * | 2005-10-28 | 2010-11-09 | Yahoo! Inc. | System and method for associating an unvalued search term with a valued search term |
| US7778972B1 (en) * | 2005-12-29 | 2010-08-17 | Amazon Technologies, Inc. | Dynamic object replication within a distributed storage system |
| US8554758B1 (en) | 2005-12-29 | 2013-10-08 | Amazon Technologies, Inc. | Method and apparatus for monitoring and maintaining health in a searchable data service |
| US20070198504A1 (en) * | 2006-02-23 | 2007-08-23 | Microsoft Corporation | Calculating level-based importance of a web page |
| EP1862916A1 (en) * | 2006-06-01 | 2007-12-05 | Microsoft Corporation | Indexing Documents for Information Retrieval based on additional feedback fields |
| US7809704B2 (en) * | 2006-06-15 | 2010-10-05 | Microsoft Corporation | Combining spectral and probabilistic clustering |
| US9015197B2 (en) | 2006-08-07 | 2015-04-21 | Oracle International Corporation | Dynamic repartitioning for changing a number of nodes or partitions in a distributed search system |
| US20080033925A1 (en) * | 2006-08-07 | 2008-02-07 | Bea Systems, Inc. | Distributed search analysis |
| US7725470B2 (en) * | 2006-08-07 | 2010-05-25 | Bea Systems, Inc. | Distributed query search using partition nodes |
| US20080059486A1 (en) * | 2006-08-24 | 2008-03-06 | Derek Edwin Pappas | Intelligent data search engine |
| EP1903457B1 (en) * | 2006-09-19 | 2012-05-30 | Exalead | Computer-implemented method, computer program product and system for creating an index of a subset of data |
| US7689548B2 (en) * | 2006-09-22 | 2010-03-30 | Microsoft Corporation | Recommending keywords based on bidding patterns |
| WO2008046098A2 (en) * | 2006-10-13 | 2008-04-17 | Move, Inc. | Multi-tiered cascading crawling system |
| US7783689B2 (en) * | 2006-10-26 | 2010-08-24 | Microsoft Corporation | On-site search engine for the World Wide Web |
| US7693813B1 (en) * | 2007-03-30 | 2010-04-06 | Google Inc. | Index server architecture using tiered and sharded phrase posting lists |
| US7925655B1 (en) | 2007-03-30 | 2011-04-12 | Google Inc. | Query scheduling using hierarchical tiers of index servers |
| US7702614B1 (en) | 2007-03-30 | 2010-04-20 | Google Inc. | Index updating using segment swapping |
| US8166021B1 (en) | 2007-03-30 | 2012-04-24 | Google Inc. | Query phrasification |
| US8086594B1 (en) | 2007-03-30 | 2011-12-27 | Google Inc. | Bifurcated document relevance scoring |
| US8166045B1 (en) | 2007-03-30 | 2012-04-24 | Google Inc. | Phrase extraction using subphrase scoring |
| US20090055436A1 (en) * | 2007-08-20 | 2009-02-26 | Olakunle Olaniyi Ayeni | System and Method for Integrating on Demand/Pull and Push Flow of Goods-and-Services Meta-Data, Including Coupon and Advertising, with Mobile and Wireless Applications |
| US8117223B2 (en) | 2007-09-07 | 2012-02-14 | Google Inc. | Integrating external related phrase information into a phrase-based indexing information retrieval system |
| WO2009078729A1 (en) * | 2007-12-14 | 2009-06-25 | Fast Search & Transfer As | A method for improving search engine efficiency |
| US8024285B2 (en) * | 2007-12-27 | 2011-09-20 | Microsoft Corporation | Determining quality of tier assignments |
| US8103652B2 (en) * | 2008-02-13 | 2012-01-24 | Microsoft Corporation | Indexing explicitly-specified quick-link data for web pages |
| US9135328B2 (en) * | 2008-04-30 | 2015-09-15 | Yahoo! Inc. | Ranking documents through contextual shortcuts |
| US8606627B2 (en) * | 2008-06-12 | 2013-12-10 | Microsoft Corporation | Sponsored search data structure |
| KR100953869B1 (ko) * | 2008-08-04 | 2010-04-20 | 고려대학교 산학협력단 | Data generation apparatus and method, and data search apparatus and method |
| US7733247B1 (en) * | 2008-11-18 | 2010-06-08 | International Business Machines Corporation | Method and system for efficient data transmission with server side de-duplication |
| US20100153371A1 (en) * | 2008-12-16 | 2010-06-17 | Yahoo! Inc. | Method and apparatus for blending search results |
| US20100287129A1 (en) * | 2009-05-07 | 2010-11-11 | Yahoo!, Inc., a Delaware corporation | System, method, or apparatus relating to categorizing or selecting potential search results |
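| KR101104112B1 (ko) * | 2009-10-19 | 2012-01-13 | 한국과학기술정보연구원 | Dynamic index management system for next-generation mass storage devices, method thereof, and recording medium storing the source program thereof |
| CN102087646B (zh) * | 2009-12-07 | 2013-03-20 | 北大方正集团有限公司 | Index building method and apparatus |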
| US20110258212A1 (en) * | 2010-04-14 | 2011-10-20 | Microsoft Corporation | Automatic query suggestion generation using sub-queries |
| US9152683B2 (en) * | 2010-10-05 | 2015-10-06 | International Business Machines Corporation | Database-transparent near online archiving and retrieval of data |
| CN101989301B (zh) * | 2010-10-22 | 2012-05-23 | 复旦大学 | Index maintenance method supporting multiple data sources |
| US8370319B1 (en) * | 2011-03-08 | 2013-02-05 | A9.Com, Inc. | Determining search query specificity |
| US9495453B2 (en) * | 2011-05-24 | 2016-11-15 | Microsoft Technology Licensing, Llc | Resource download policies based on user browsing statistics |
| US8965921B2 (en) | 2012-06-06 | 2015-02-24 | Rackspace Us, Inc. | Data management and indexing across a distributed database |
| US8700583B1 (en) | 2012-07-24 | 2014-04-15 | Google Inc. | Dynamic tiermaps for large online databases |
| US8862566B2 (en) * | 2012-10-26 | 2014-10-14 | Equifax, Inc. | Systems and methods for intelligent parallel searching |
| US9721000B2 (en) * | 2012-12-20 | 2017-08-01 | Microsoft Technology Licensing, Llc | Generating and using a customized index |
| US9501506B1 (en) | 2013-03-15 | 2016-11-22 | Google Inc. | Indexing system |
| US9483568B1 (en) | 2013-06-05 | 2016-11-01 | Google Inc. | Indexing system |
| US9727648B2 (en) | 2014-12-19 | 2017-08-08 | Quixey, Inc. | Time-box constrained searching in a distributed search system |
| US10140299B2 (en) * | 2014-12-31 | 2018-11-27 | Rovi Guides, Inc. | Systems and methods for enhancing search results by way of updating search indices |
| US10380207B2 (en) * | 2015-11-10 | 2019-08-13 | International Business Machines Corporation | Ordering search results based on a knowledge level of a user performing the search |
| US11347798B2 (en) * | 2016-12-29 | 2022-05-31 | Ancestry.Com Operations Inc. | Dynamically-qualified aggregate relationship system in genealogical databases |
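| CN109062936B (zh) * | 2018-06-15 | 2023-10-31 | 中国平安人寿保险股份有限公司 | Data query method, computer-readable storage medium, and terminal device |
| CN111581237B (zh) * | 2019-02-15 | 2023-06-09 | 阿里巴巴集团控股有限公司 | Data query method, apparatus, system, and electronic device |
| CN110990366B (zh) * | 2019-12-04 | 2024-02-23 | 中国农业银行股份有限公司 | Index allocation method and apparatus for improving the performance of an ES-based log system |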
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5694593A (en) * | 1994-10-05 | 1997-12-02 | Northeastern University | Distributed computer database system and method |
| US5913215A (en) * | 1996-04-09 | 1999-06-15 | Seymour I. Rubinstein | Browse by prompted keyword phrases with an improved method for obtaining an initial document set |
Family Cites Families (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2977260B2 (ja) * | 1990-09-27 | 1999-11-15 | 株式会社東芝 | Information presentation device |
| US7599910B1 (en) * | 1993-11-16 | 2009-10-06 | Hitachi, Ltd. | Method and system of database divisional management for parallel database system |
| US5787435A (en) * | 1996-08-09 | 1998-07-28 | Digital Equipment Corporation | Method for mapping an index of a database into an array of files |
| US6360215B1 (en) * | 1998-11-03 | 2002-03-19 | Inktomi Corporation | Method and apparatus for retrieving documents based on information other than document content |
| NO992269D0 (no) * | 1999-05-10 | 1999-05-10 | Fast Search & Transfer Asa | Search engine with two-dimensionally scalable, parallel architecture |
| US6804675B1 (en) * | 1999-05-11 | 2004-10-12 | Maquis Techtrix, Llc | Online content provider system and method |
| US6507837B1 (en) * | 2000-06-08 | 2003-01-14 | Hyperphrase Technologies, Llc | Tiered and content based database searching |
| NO313399B1 (no) * | 2000-09-14 | 2002-09-23 | Fast Search & Transfer Asa | Method for searching and analysing information in data networks |
| US6778977B1 (en) * | 2001-04-19 | 2004-08-17 | Microsoft Corporation | Method and system for creating a database table index using multiple processors |
| US6928425B2 (en) * | 2001-08-13 | 2005-08-09 | Xerox Corporation | System for propagating enrichment between documents |
| US7565367B2 (en) * | 2002-01-15 | 2009-07-21 | Iac Search & Media, Inc. | Enhanced popularity ranking |
2003
- 2003-11-10 US application US10/705,641, patent US7240064B2 (en): not active, Expired - Lifetime

2004
- 2004-11-09 CN application CN200480033085A, patent CN100585590C (zh): not active, Expired - Fee Related
- 2004-11-09 JP application JP2006539808A, patent JP4699379B2 (ja): not active, Expired - Lifetime
- 2004-11-09 WO application PCT/US2004/037507, publication WO2005048069A2 (en): not active, Ceased
- 2004-11-09 KR application KR1020067009046A, patent KR100828232B1 (ko): not active, Expired - Lifetime
- 2004-11-09 EP application EP04818685A, publication EP1682993A4 (en): not active, Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| EP1682993A4 (en) | 2009-06-24 |
| EP1682993A2 (en) | 2006-07-26 |
| JP2007529791A (ja) | 2007-10-25 |
| WO2005048069A3 (en) | 2007-08-23 |
| US7240064B2 (en) | 2007-07-03 |
| WO2005048069A2 (en) | 2005-05-26 |
| KR20060083229A (ko) | 2006-07-20 |
| JP4699379B2 (ja) | 2011-06-08 |
| US20050102270A1 (en) | 2005-05-12 |
| HK1119798A1 (zh) | 2009-03-13 |
| KR100828232B1 (ko) | 2008-05-07 |
| CN101189602A (zh) | 2008-05-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN100585590C (zh) | | Search engine with hierarchically stored indices |
| EP1934823B1 (en) | | Click distance determination |
| US7136851B2 (en) | | Method and system for indexing and searching databases |
| JP2007529791A5 | | |
| Chang et al. | | A signature access method for the starburst database system |
| JPH07129450A (ja) | | Method and system for generating a multi-tiered index structure for a database of partitioned objects |
| Puppin et al. | | Query-driven document partitioning and collection selection |
| Wu et al. | | Index structures of user profiles for efficient web page filtering services |
| Deshpande et al. | | Efficient online top-k retrieval with arbitrary similarity measures |
| Koren et al. | | Searching and navigating petabyte-scale file systems based on facets |
| Yadav et al. | | Wavelet tree based hybrid geo-textual indexing technique for geographical search |
| Savoy et al. | | Report on the TREC-8 Experiment: Searching on the Web and in Distributed Collections. |
| HK1119798B (en) | | Search engine with hierarchically stored indices |
| CN106649462A (zh) | | Implementation method for full-text retrieval scenarios over massive data |
| Badan et al. | | Keyword-based access to relational data: To reproduce, or to not reproduce? |
| Gupta | | A keyword searching algorithm for search engines |
| Dyreson | | A jumping spider: Restructuring the WWW graph to index concepts that span pages |
| Babu et al. | | Design of a metacrawler for web document retrieval |
| Leventhal et al. | | XII. Query Splitting Using Relevant Documents Instead of Queries In Relevance Feedback |
| Yu et al. | | Distributed Metadata Search for the Cloud. |
| Daoud | | Perfect Hash Functions for Large Web Repositories. |
| Shejawal et al. | | Nearest Neighbor Search Technique Using Keywords and Threshold |
| Chen et al. | | Reverse Mapping of Referral Links from Storage Hierarchy for Web Documents |
| Kollios et al. | | Hashing methods for temporal data |
| Li et al. | | User-assisted similarity estimation for searching related web pages |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1119798; Country of ref document: HK |
| | C41 | Transfer of patent application or patent right or utility model | |
| | TA01 | Transfer of patent application right | Effective date of registration: 20090313; Address after: California, USA; Applicant after: YAHOO! Inc.; Address before: California, USA; Applicant before: OVERTURE SERVICES, Inc. |
| | ASS | Succession or assignment of patent right | Owner name: YAHOO! CO.,LTD.; Free format text: FORMER OWNER: WAFUL TOURS SERVICES; Effective date: 20090313 |
| | C14 | Grant of patent or utility model | |
| | GR01 | Patent grant | |
| | REG | Reference to a national code | Ref country code: HK; Ref legal event code: GR; Ref document number: 1119798; Country of ref document: HK |
| | ASS | Succession or assignment of patent right | Owner name: FEIYANG MANAGEMENT CO., LTD.; Free format text: FORMER OWNER: YAHOO CORP.; Effective date: 20150331 |
| | TR01 | Transfer of patent right | Effective date of registration: 20150331; Address after: The British Virgin Islands of Tortola; Patentee after: Yahoo! Inc.; Address before: California, USA; Patentee before: YAHOO! Inc. |
| | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20100127; Termination date: 20211109 |
| | CF01 | Termination of patent right due to non-payment of annual fee | |