ATE500559T1 - Identifizierung von ausdrücken in einem informationsabfragesystem - Google Patents

Identifizierung von ausdrücken in einem informationsabfragesystem

Info

Publication number
ATE500559T1
ATE500559T1 AT05254645T AT05254645T ATE500559T1 AT E500559 T1 ATE500559 T1 AT E500559T1 AT 05254645 T AT05254645 T AT 05254645T AT 05254645 T AT05254645 T AT 05254645T AT E500559 T1 ATE500559 T1 AT E500559T1
Authority
AT
Austria
Prior art keywords
phrases
documents
identified
expressions
identification
Prior art date
Application number
AT05254645T
Other languages
English (en)
Inventor
Anna Patterson
Original Assignee
Google Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Google Inc filed Critical Google Inc
Application granted granted Critical
Publication of ATE500559T1 publication Critical patent/ATE500559T1/de

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • G06F16/313Selection or weighting of terms for indexing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99932Access augmentation or optimizing
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99933Query processing, i.e. searching
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99931Database or file accessing
    • Y10S707/99937Sorting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Document Processing Apparatus (AREA)
AT05254645T 2004-07-26 2005-07-26 Identifizierung von ausdrücken in einem informationsabfragesystem ATE500559T1 (de)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US10/900,021 US7580921B2 (en) 2004-07-26 2004-07-26 Phrase identification in an information retrieval system

Publications (1)

Publication Number Publication Date
ATE500559T1 true ATE500559T1 (de) 2011-03-15

Family

ID=34941843

Family Applications (1)

Application Number Title Priority Date Filing Date
AT05254645T ATE500559T1 (de) 2004-07-26 2005-07-26 Identifizierung von ausdrücken in einem informationsabfragesystem

Country Status (11)

Country Link
US (3) US7580921B2 (de)
EP (1) EP1622053B1 (de)
JP (1) JP4976666B2 (de)
KR (1) KR101190230B1 (de)
CN (1) CN1728142B (de)
AT (1) ATE500559T1 (de)
AU (1) AU2005203240B2 (de)
BR (1) BRPI0503781A (de)
CA (1) CA2513850C (de)
DE (1) DE602005026609D1 (de)
NO (1) NO20053638L (de)

Families Citing this family (114)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7266553B1 (en) * 2002-07-01 2007-09-04 Microsoft Corporation Content data indexing
US7574016B2 (en) * 2003-06-26 2009-08-11 Fotonation Vision Limited Digital image processing using face detection information
US7617205B2 (en) 2005-03-30 2009-11-10 Google Inc. Estimating confidence for query revision models
US9223868B2 (en) 2004-06-28 2015-12-29 Google Inc. Deriving and using interaction profiles
US7567959B2 (en) 2004-07-26 2009-07-28 Google Inc. Multiple index based information retrieval system
US7536408B2 (en) 2004-07-26 2009-05-19 Google Inc. Phrase-based indexing in an information retrieval system
US7711679B2 (en) 2004-07-26 2010-05-04 Google Inc. Phrase-based detection of duplicate documents in an information retrieval system
US7584175B2 (en) 2004-07-26 2009-09-01 Google Inc. Phrase-based generation of document descriptions
US7702618B1 (en) 2004-07-26 2010-04-20 Google Inc. Information retrieval system for archiving multiple document versions
US7599914B2 (en) * 2004-07-26 2009-10-06 Google Inc. Phrase-based searching in an information retrieval system
US7580929B2 (en) * 2004-07-26 2009-08-25 Google Inc. Phrase-based personalization of searches in an information retrieval system
US7580921B2 (en) 2004-07-26 2009-08-25 Google Inc. Phrase identification in an information retrieval system
US7199571B2 (en) * 2004-07-27 2007-04-03 Optisense Network, Inc. Probe apparatus for use in a separable connector, and systems including same
US7962510B2 (en) * 2005-02-11 2011-06-14 Microsoft Corporation Using content analysis to detect spam web pages
US7870147B2 (en) * 2005-03-29 2011-01-11 Google Inc. Query revision using known highly-ranked queries
US7958120B2 (en) 2005-05-10 2011-06-07 Netseer, Inc. Method and apparatus for distributed community finding
US7765214B2 (en) * 2005-05-10 2010-07-27 International Business Machines Corporation Enhancing query performance of search engines using lexical affinities
US9110985B2 (en) 2005-05-10 2015-08-18 Neetseer, Inc. Generating a conceptual association graph from large-scale loosely-grouped content
US8321198B2 (en) * 2005-09-06 2012-11-27 Kabushiki Kaisha Square Enix Data extraction system, terminal, server, programs, and media for extracting data via a morphological analysis
US7971137B2 (en) * 2005-12-14 2011-06-28 Google Inc. Detecting and rejecting annoying documents
US8380721B2 (en) * 2006-01-18 2013-02-19 Netseer, Inc. System and method for context-based knowledge search, tagging, collaboration, management, and advertisement
US8825657B2 (en) 2006-01-19 2014-09-02 Netseer, Inc. Systems and methods for creating, navigating, and searching informational web neighborhoods
US8229733B2 (en) * 2006-02-09 2012-07-24 John Harney Method and apparatus for linguistic independent parsing in a natural language systems
US8843434B2 (en) * 2006-02-28 2014-09-23 Netseer, Inc. Methods and apparatus for visualizing, managing, monetizing, and personalizing knowledge search results on a user interface
US8126874B2 (en) * 2006-05-09 2012-02-28 Google Inc. Systems and methods for generating statistics from search engine query logs
US9646089B2 (en) * 2006-09-18 2017-05-09 John Nicholas and Kristin Gross Trust System and method of modifying ranking for internet accessible documents
US20080082578A1 (en) * 2006-09-29 2008-04-03 Andrew Hogue Displaying search results on a one or two dimensional graph
US9495358B2 (en) 2006-10-10 2016-11-15 Abbyy Infopoisk Llc Cross-language text clustering
US9069750B2 (en) 2006-10-10 2015-06-30 Abbyy Infopoisk Llc Method and system for semantic searching of natural language texts
US9075864B2 (en) 2006-10-10 2015-07-07 Abbyy Infopoisk Llc Method and system for semantic searching using syntactic and semantic analysis
US9892111B2 (en) 2006-10-10 2018-02-13 Abbyy Production Llc Method and device to estimate similarity between documents having multiple segments
US9189482B2 (en) 2012-10-10 2015-11-17 Abbyy Infopoisk Llc Similar document search
US9098489B2 (en) 2006-10-10 2015-08-04 Abbyy Infopoisk Llc Method and system for semantic searching
US9817902B2 (en) 2006-10-27 2017-11-14 Netseer Acquisition, Inc. Methods and apparatus for matching relevant content to user intention
US20080133365A1 (en) * 2006-11-21 2008-06-05 Benjamin Sprecher Targeted Marketing System
US8086600B2 (en) * 2006-12-07 2011-12-27 Google Inc. Interleaving search results
US7966309B2 (en) 2007-01-17 2011-06-21 Google Inc. Providing relevance-ordered categories of information
US7966321B2 (en) 2007-01-17 2011-06-21 Google Inc. Presentation of local results
US8966407B2 (en) 2007-01-17 2015-02-24 Google Inc. Expandable homepage modules
US8326858B2 (en) * 2007-01-17 2012-12-04 Google Inc. Synchronization of fixed and mobile data
US8005822B2 (en) * 2007-01-17 2011-08-23 Google Inc. Location in search queries
US8595204B2 (en) * 2007-03-05 2013-11-26 Microsoft Corporation Spam score propagation for web spam detection
US7702614B1 (en) 2007-03-30 2010-04-20 Google Inc. Index updating using segment swapping
US8166045B1 (en) 2007-03-30 2012-04-24 Google Inc. Phrase extraction using subphrase scoring
US7925655B1 (en) 2007-03-30 2011-04-12 Google Inc. Query scheduling using hierarchical tiers of index servers
US7693813B1 (en) 2007-03-30 2010-04-06 Google Inc. Index server architecture using tiered and sharded phrase posting lists
US8166021B1 (en) 2007-03-30 2012-04-24 Google Inc. Query phrasification
US8086594B1 (en) 2007-03-30 2011-12-27 Google Inc. Bifurcated document relevance scoring
US8977631B2 (en) 2007-04-16 2015-03-10 Ebay Inc. Visualization of reputation ratings
US9239835B1 (en) * 2007-04-24 2016-01-19 Wal-Mart Stores, Inc. Providing information to modules
US7853589B2 (en) * 2007-04-30 2010-12-14 Microsoft Corporation Web spam page classification using query-dependent data
US7788254B2 (en) * 2007-05-04 2010-08-31 Microsoft Corporation Web page analysis using multiple graphs
US7814107B1 (en) 2007-05-25 2010-10-12 Amazon Technologies, Inc. Generating similarity scores for matching non-identical data strings
US8046372B1 (en) 2007-05-25 2011-10-25 Amazon Technologies, Inc. Duplicate entry detection system and method
US7908279B1 (en) 2007-05-25 2011-03-15 Amazon Technologies, Inc. Filtering invalid tokens from a document using high IDF token filtering
US20080301033A1 (en) * 2007-06-01 2008-12-04 Netseer, Inc. Method and apparatus for optimizing long term revenues in online auctions
US8117223B2 (en) 2007-09-07 2012-02-14 Google Inc. Integrating external related phrase information into a phrase-based indexing information retrieval system
US20090112843A1 (en) * 2007-10-29 2009-04-30 International Business Machines Corporation System and method for providing differentiated service levels for search index
US20090119572A1 (en) * 2007-11-02 2009-05-07 Marja-Riitta Koivunen Systems and methods for finding information resources
US7895225B1 (en) * 2007-12-06 2011-02-22 Amazon Technologies, Inc. Identifying potential duplicates of a document in a document corpus
WO2009091917A1 (en) * 2008-01-15 2009-07-23 Kingsley Martin Processing of phrases and clauses in documents
US8752184B1 (en) 2008-01-17 2014-06-10 Google Inc. Spam detection for user-generated multimedia items based on keyword stuffing
US8745056B1 (en) 2008-03-31 2014-06-03 Google Inc. Spam detection for user-generated multimedia items based on concept clustering
WO2009111631A1 (en) * 2008-03-05 2009-09-11 Chacha Search, Inc. Method and system for triggering a search request
US8171020B1 (en) * 2008-03-31 2012-05-01 Google Inc. Spam detection for user-generated multimedia items based on appearance in popular queries
US8219385B2 (en) * 2008-04-08 2012-07-10 Incentive Targeting, Inc. Computer-implemented method and system for conducting a search of electronically stored information
US10387892B2 (en) * 2008-05-06 2019-08-20 Netseer, Inc. Discovering relevant concept and context for content node
US8788476B2 (en) * 2008-08-15 2014-07-22 Chacha Search, Inc. Method and system of triggering a search request
US8417695B2 (en) * 2008-10-30 2013-04-09 Netseer, Inc. Identifying related concepts of URLs and domain names
GB2472250A (en) * 2009-07-31 2011-02-02 Stephen Timothy Morris Method for determining document relevance
US8713078B2 (en) 2009-08-13 2014-04-29 Samsung Electronics Co., Ltd. Method for building taxonomy of topics and categorizing videos
US8838576B2 (en) * 2009-10-12 2014-09-16 Yahoo! Inc. Posting list intersection parallelism in query processing
US8543381B2 (en) * 2010-01-25 2013-09-24 Holovisions LLC Morphing text by splicing end-compatible segments
US20110202521A1 (en) * 2010-01-28 2011-08-18 Jason Coleman Enhanced database search features and methods
US8392175B2 (en) * 2010-02-01 2013-03-05 Stratify, Inc. Phrase-based document clustering with automatic phrase extraction
US8150859B2 (en) * 2010-02-05 2012-04-03 Microsoft Corporation Semantic table of contents for search results
US8161073B2 (en) 2010-05-05 2012-04-17 Holovisions, LLC Context-driven search
US8639773B2 (en) * 2010-06-17 2014-01-28 Microsoft Corporation Discrepancy detection for web crawling
US8655648B2 (en) * 2010-09-01 2014-02-18 Microsoft Corporation Identifying topically-related phrases in a browsing sequence
US9069842B2 (en) * 2010-09-28 2015-06-30 The Mitre Corporation Accessing documents using predictive word sequences
US9015043B2 (en) 2010-10-01 2015-04-21 Google Inc. Choosing recognized text from a background environment
US8478704B2 (en) 2010-11-22 2013-07-02 Microsoft Corporation Decomposable ranking for efficient precomputing that selects preliminary ranking features comprising static ranking features and dynamic atom-isolated components
US9424351B2 (en) 2010-11-22 2016-08-23 Microsoft Technology Licensing, Llc Hybrid-distribution model for search engine indexes
US8620907B2 (en) * 2010-11-22 2013-12-31 Microsoft Corporation Matching funnel for large document index
US9342582B2 (en) 2010-11-22 2016-05-17 Microsoft Technology Licensing, Llc Selection of atoms for search engine retrieval
US9195745B2 (en) 2010-11-22 2015-11-24 Microsoft Technology Licensing, Llc Dynamic query master agent for query execution
US9529908B2 (en) 2010-11-22 2016-12-27 Microsoft Technology Licensing, Llc Tiering of posting lists in search engine index
EP2646964A4 (de) 2010-12-01 2015-06-03 Google Inc Empfehlungen auf basis topischer cluster
US8626781B2 (en) * 2010-12-29 2014-01-07 Microsoft Corporation Priority hash index
US8880517B2 (en) 2011-02-18 2014-11-04 Microsoft Corporation Propagating signals across a web graph
US8332415B1 (en) * 2011-03-16 2012-12-11 Google Inc. Determining spam in information collected by a source
JP5492814B2 (ja) * 2011-03-28 2014-05-14 デジタルア−ツ株式会社 検索装置、検索システム、方法およびプログラム
US8849835B1 (en) * 2011-05-10 2014-09-30 Google Inc. Reconciling data
US8782058B2 (en) * 2011-10-12 2014-07-15 Desire2Learn Incorporated Search index dictionary
US8909643B2 (en) * 2011-12-09 2014-12-09 International Business Machines Corporation Inferring emerging and evolving topics in streaming text
US8868536B1 (en) 2012-01-04 2014-10-21 Google Inc. Real time map spam detection
US8892422B1 (en) 2012-07-09 2014-11-18 Google Inc. Phrase identification in a sequence of words
US10311085B2 (en) 2012-08-31 2019-06-04 Netseer, Inc. Concept-level user intent profile extraction and applications
US9529907B2 (en) * 2012-12-31 2016-12-27 Google Inc. Hold back and real time ranking of results in a streaming matching system
US9501506B1 (en) 2013-03-15 2016-11-22 Google Inc. Indexing system
US9483568B1 (en) 2013-06-05 2016-11-01 Google Inc. Indexing system
IN2013MU02217A (de) * 2013-07-01 2015-06-12 Tata Consultancy Services Ltd
US10565188B2 (en) * 2014-05-30 2020-02-18 Macy's West Stores, Inc. System and method for performing a pattern matching search
US10275485B2 (en) 2014-06-10 2019-04-30 Google Llc Retrieving context from previous sessions
US10347240B2 (en) * 2015-02-26 2019-07-09 Nantmobile, Llc Kernel-based verbal phrase splitting devices and methods
US10229219B2 (en) * 2015-05-01 2019-03-12 Facebook, Inc. Systems and methods for demotion of content items in a feed
US11392582B2 (en) * 2015-10-15 2022-07-19 Sumo Logic, Inc. Automatic partitioning
RU2618374C1 (ru) 2015-11-05 2017-05-03 Общество с ограниченной ответственностью "Аби ИнфоПоиск" Выявление словосочетаний в текстах на естественном языке
KR102025819B1 (ko) * 2017-03-29 2019-09-26 중앙대학교 산학협력단 사용자 생성 콘텐츠의 동적 용어 식별 체계 구축 장치 및 방법
CN107391554B (zh) * 2017-06-07 2021-10-01 中国人民解放军国防科学技术大学 高效分布式局部敏感哈希方法
JP7520500B2 (ja) * 2019-12-09 2024-07-23 株式会社東芝 データ生成装置およびデータ生成方法
CN113312898B (zh) * 2020-02-26 2024-03-01 深信服科技股份有限公司 语料处理方法、设备、存储介质及装置
US12045294B2 (en) 2020-11-16 2024-07-23 Microsoft Technology Licensing, Llc Mask-augmented inverted index
US11593439B1 (en) * 2022-05-23 2023-02-28 Onetrust Llc Identifying similar documents in a file repository using unique document signatures

Family Cites Families (160)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5321833A (en) 1990-08-29 1994-06-14 Gte Laboratories Incorporated Adaptive ranking system for information retrieval
US5278980A (en) * 1991-08-16 1994-01-11 Xerox Corporation Iterative technique for phrase query formation and an information retrieval system employing same
US5523946A (en) 1992-02-11 1996-06-04 Xerox Corporation Compact encoding of multi-lingual translation dictionaries
US5353401A (en) * 1992-11-06 1994-10-04 Ricoh Company, Ltd. Automatic interface layout generator for database systems
JPH0756933A (ja) 1993-06-24 1995-03-03 Xerox Corp 文書検索方法
US5692176A (en) 1993-11-22 1997-11-25 Reed Elsevier Inc. Associative text search and retrieval system
US5715443A (en) * 1994-07-25 1998-02-03 Apple Computer, Inc. Method and apparatus for searching for information in a data processing system and for providing scheduled search reports in a summary format
JP3669016B2 (ja) * 1994-09-30 2005-07-06 株式会社日立製作所 文書情報分類装置
US5694593A (en) * 1994-10-05 1997-12-02 Northeastern University Distributed computer database system and method
US5758257A (en) 1994-11-29 1998-05-26 Herz; Frederick System and method for scheduling broadcast of and access to video programs and other data using customer profiles
US6460036B1 (en) 1994-11-29 2002-10-01 Pinpoint Incorporated System and method for providing customized electronic newspapers and target advertisements
US5745602A (en) * 1995-05-01 1998-04-28 Xerox Corporation Automatic method of selecting multi-word key phrases from a document
US5659732A (en) * 1995-05-17 1997-08-19 Infoseek Corporation Document retrieval over networks wherein ranking and relevance scores are computed at the client for multiple database documents
US5724571A (en) 1995-07-07 1998-03-03 Sun Microsystems, Inc. Method and apparatus for generating query responses in a computer-based document retrieval system
US5668987A (en) * 1995-08-31 1997-09-16 Sybase, Inc. Database system with subquery optimizer
US6366933B1 (en) 1995-10-27 2002-04-02 At&T Corp. Method and apparatus for tracking and viewing changes on the web
US5757917A (en) * 1995-11-01 1998-05-26 First Virtual Holdings Incorporated Computerized payment system for purchasing goods and services on the internet
US6098034A (en) 1996-03-18 2000-08-01 Expert Ease Development, Ltd. Method for standardizing phrasing in a document
US5924108A (en) 1996-03-29 1999-07-13 Microsoft Corporation Document summarizer for word processors
US7051024B2 (en) 1999-04-08 2006-05-23 Microsoft Corporation Document summarizer for word processors
US5721897A (en) * 1996-04-09 1998-02-24 Rubinstein; Seymour I. Browse by prompted keyword phrases with an improved user interface
US5826261A (en) 1996-05-10 1998-10-20 Spencer; Graham System and method for querying multiple, distributed databases by selective sharing of local relative significance information for terms related to the query
US5915249A (en) 1996-06-14 1999-06-22 Excite, Inc. System and method for accelerated query evaluation of very large full-text databases
EP0822502A1 (de) 1996-07-31 1998-02-04 BRITISH TELECOMMUNICATIONS public limited company System zum Zugriff auf Daten
US5920854A (en) 1996-08-14 1999-07-06 Infoseek Corporation Real-time document collection search engine with phrase indexing
US6085186A (en) 1996-09-20 2000-07-04 Netbot, Inc. Method and system using information written in a wrapper description language to execute query on a network
US20030093790A1 (en) * 2000-03-28 2003-05-15 Logan James D. Audio and video program recording, editing and playback systems using metadata
JP3584848B2 (ja) * 1996-10-31 2004-11-04 富士ゼロックス株式会社 文書処理装置、項目検索装置及び項目検索方法
US5960383A (en) 1997-02-25 1999-09-28 Digital Equipment Corporation Extraction of key sections from texts using automatic indexing techniques
US6055540A (en) 1997-06-13 2000-04-25 Sun Microsystems, Inc. Method and apparatus for creating a category hierarchy for classification of documents
US6470307B1 (en) 1997-06-23 2002-10-22 National Research Council Of Canada Method and apparatus for automatically identifying keywords within a document
US5845278A (en) * 1997-09-12 1998-12-01 Inioseek Corporation Method for automatically selecting collections to search in full text searches
US6018733A (en) * 1997-09-12 2000-01-25 Infoseek Corporation Methods for iteratively and interactively performing collection selection in full text searches
US5983216A (en) * 1997-09-12 1999-11-09 Infoseek Corporation Performing automated document collection and selection by providing a meta-index with meta-index values indentifying corresponding document collections
US5956722A (en) 1997-09-23 1999-09-21 At&T Corp. Method for effective indexing of partially dynamic documents
US6542888B2 (en) 1997-11-26 2003-04-01 International Business Machines Corporation Content filtering for electronic documents generated in multiple foreign languages
JP4183311B2 (ja) 1997-12-22 2008-11-19 株式会社リコー 文書の注釈方法、注釈装置および記録媒体
US6185558B1 (en) 1998-03-03 2001-02-06 Amazon.Com, Inc. Identifying the items most relevant to a current query based on items selected in connection with similar queries
JP3664874B2 (ja) 1998-03-28 2005-06-29 松下電器産業株式会社 文書検索装置
US6638314B1 (en) * 1998-06-26 2003-10-28 Microsoft Corporation Method of web crawling utilizing crawl numbers
US6363377B1 (en) 1998-07-30 2002-03-26 Sarnoff Corporation Search data processor
US6377949B1 (en) 1998-09-18 2002-04-23 Tacit Knowledge Systems, Inc. Method and apparatus for assigning a confidence level to a term within a user knowledge profile
US6366911B1 (en) * 1998-09-28 2002-04-02 International Business Machines Corporation Partitioning of sorted lists (containing duplicate entries) for multiprocessors sort and merge
US6415283B1 (en) 1998-10-13 2002-07-02 Orack Corporation Methods and apparatus for determining focal points of clusters in a tree structure
US7058589B1 (en) 1998-12-17 2006-06-06 Iex Corporation Method and system for employee work scheduling
US6862710B1 (en) 1999-03-23 2005-03-01 Insightful Corporation Internet navigation using soft hyperlinks
JP4021583B2 (ja) 1999-04-08 2007-12-12 富士通株式会社 情報検索装置、情報検索方法、及びその方法を実現するプログラムを記録した記録媒体
US6430539B1 (en) * 1999-05-06 2002-08-06 Hnc Software Predictive modeling of consumer financial behavior
US7089236B1 (en) 1999-06-24 2006-08-08 Search 123.Com, Inc. Search engine interface
US6601026B2 (en) 1999-09-17 2003-07-29 Discern Communications, Inc. Information retrieval by natural language querying
US6996775B1 (en) 1999-10-29 2006-02-07 Verizon Laboratories Inc. Hypervideo: information retrieval using time-related multimedia:
US6684183B1 (en) 1999-12-06 2004-01-27 Comverse Ltd. Generic natural language service creation environment
US6963867B2 (en) 1999-12-08 2005-11-08 A9.Com, Inc. Search query processing to provide category-ranked presentation of search results
US6772150B1 (en) 1999-12-10 2004-08-03 Amazon.Com, Inc. Search query refinement using related search phrases
CA2293064C (en) 1999-12-22 2004-05-04 Ibm Canada Limited-Ibm Canada Limitee Method and apparatus for analyzing data retrieval using index scanning
US6981040B1 (en) 1999-12-28 2005-12-27 Utopy, Inc. Automatic, personalized online information and product services
US6820237B1 (en) 2000-01-21 2004-11-16 Amikanow! Corporation Apparatus and method for context-based highlighting of an electronic document
US6883135B1 (en) 2000-01-28 2005-04-19 Microsoft Corporation Proxy server using a statistical model
US6654739B1 (en) 2000-01-31 2003-11-25 International Business Machines Corporation Lightweight document clustering
US6571240B1 (en) 2000-02-02 2003-05-27 Chi Fai Ho Information processing for searching categorizing information in a document based on a categorization hierarchy and extracted phrases
US7137065B1 (en) 2000-02-24 2006-11-14 International Business Machines Corporation System and method for classifying electronically posted documents
US20060143714A1 (en) 2000-03-09 2006-06-29 Pkware, Inc. System and method for manipulating and managing computer archive files
US6859800B1 (en) 2000-04-26 2005-02-22 Global Information Research And Technologies Llc System for fulfilling an information need
AU2001261506A1 (en) 2000-05-11 2001-11-20 University Of Southern California Discourse parsing and summarization
US6691106B1 (en) 2000-05-23 2004-02-10 Intel Corporation Profile driven instant web portal
WO2001098942A2 (en) 2000-06-19 2001-12-27 Lernout & Hauspie Speech Products N.V. Package driven parsing using structure function grammar
US20020078090A1 (en) 2000-06-30 2002-06-20 Hwang Chung Hee Ontological concept-based, user-centric text summarization
EP1182577A1 (de) 2000-08-18 2002-02-27 SER Systeme AG Produkte und Anwendungen der Datenverarbeitung Assoziativspeicher
KR100426382B1 (ko) 2000-08-23 2004-04-08 학교법인 김포대학 엔트로피 정보와 베이지안 에스오엠을 이용한 문서군집기반의 순위조정 방법
US7017114B2 (en) 2000-09-20 2006-03-21 International Business Machines Corporation Automatic correlation method for generating summaries for text documents
US20020147578A1 (en) 2000-09-29 2002-10-10 Lingomotors, Inc. Method and system for query reformulation for searching of information
US20020065857A1 (en) 2000-10-04 2002-05-30 Zbigniew Michalewicz System and method for analysis and clustering of documents for search engine
CA2322599A1 (en) * 2000-10-06 2002-04-06 Ibm Canada Limited-Ibm Canada Limitee System and method for workflow control of contractual activities
JP2002132789A (ja) 2000-10-19 2002-05-10 Hitachi Ltd 文書検索方法
JP2002169834A (ja) * 2000-11-20 2002-06-14 Hewlett Packard Co <Hp> 文書のベクトル解析を行うコンピュータおよび方法
US20020091671A1 (en) 2000-11-23 2002-07-11 Andreas Prokoph Method and system for data retrieval in large collections of data
JP2002207760A (ja) 2001-01-10 2002-07-26 Hitachi Ltd 文書検索方法及びその実施装置並びにその処理プログラムを記録した記録媒体
US6778980B1 (en) 2001-02-22 2004-08-17 Drugstore.Com Techniques for improved searching of electronically stored information
US6741984B2 (en) 2001-02-23 2004-05-25 General Electric Company Method, system and storage medium for arranging a database
US6697793B2 (en) 2001-03-02 2004-02-24 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration System, method and apparatus for generating phrases from a database
US6741981B2 (en) * 2001-03-02 2004-05-25 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration (Nasa) System, method and apparatus for conducting a phrase search
US6823333B2 (en) 2001-03-02 2004-11-23 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration System, method and apparatus for conducting a keyterm search
US6721728B2 (en) 2001-03-02 2004-04-13 The United States Of America As Represented By The Administrator Of The National Aeronautics And Space Administration System, method and apparatus for discovering phrases in a database
US7194483B1 (en) 2001-05-07 2007-03-20 Intelligenxia, Inc. Method, system, and computer program product for concept-based multi-dimensional analysis of unstructured information
US7171619B1 (en) 2001-07-05 2007-01-30 Sun Microsystems, Inc. Methods and apparatus for accessing document content
US6769016B2 (en) * 2001-07-26 2004-07-27 Networks Associates Technology, Inc. Intelligent SPAM detection system using an updateable neural analysis engine
WO2003014975A1 (en) * 2001-08-08 2003-02-20 Quiver, Inc. Document categorization engine
US20030031996A1 (en) * 2001-08-08 2003-02-13 Adam Robinson Method and system for evaluating documents
US6778979B2 (en) 2001-08-13 2004-08-17 Xerox Corporation System for automatically generating queries
US6978274B1 (en) * 2001-08-31 2005-12-20 Attenex Corporation System and method for dynamically evaluating latent concepts in unstructured documents
JP2003242176A (ja) * 2001-12-13 2003-08-29 Sony Corp 情報処理装置および方法、記録媒体、並びにプログラム
US7356527B2 (en) 2001-12-19 2008-04-08 International Business Machines Corporation Lossy index compression
US6741982B2 (en) 2001-12-19 2004-05-25 Cognos Incorporated System and method for retrieving data from a database system
US7137062B2 (en) 2001-12-28 2006-11-14 International Business Machines Corporation System and method for hierarchical segmentation with latent semantic indexing in scale space
US7243092B2 (en) 2001-12-28 2007-07-10 Sap Ag Taxonomy generation for electronic documents
US7139756B2 (en) 2002-01-22 2006-11-21 International Business Machines Corporation System and method for detecting duplicate and similar documents
US7028045B2 (en) 2002-01-25 2006-04-11 International Business Machines Corporation Compressing index files in information retrieval
US7421660B2 (en) 2003-02-04 2008-09-02 Cataphora, Inc. Method and apparatus to visually present discussions for data mining purposes
JP4092933B2 (ja) 2002-03-20 2008-05-28 富士ゼロックス株式会社 文書情報検索装置及び文書情報検索プログラム
US7743045B2 (en) * 2005-08-10 2010-06-22 Google Inc. Detecting spam related and biased contexts for programmable search engines
US20030195937A1 (en) * 2002-04-16 2003-10-16 Kontact Software Inc. Intelligent message screening
US6877001B2 (en) * 2002-04-25 2005-04-05 Mitsubishi Electric Research Laboratories, Inc. Method and system for retrieving documents with spoken queries
NZ518744A (en) 2002-05-03 2004-08-27 Hyperbolex Ltd Electronic document indexing using word use nodes, node objects and link objects
US7085771B2 (en) * 2002-05-17 2006-08-01 Verity, Inc System and method for automatically discovering a hierarchy of concepts from a corpus of documents
US7028026B1 (en) 2002-05-28 2006-04-11 Ask Jeeves, Inc. Relevancy-based database retrieval and display techniques
CN1464413A (zh) * 2002-06-13 2003-12-31 英业达股份有限公司 以网页为基础的提案数据检索管理系统及其方法
JP4452012B2 (ja) 2002-07-04 2010-04-21 ヒューレット・パッカード・カンパニー 文書の特有性評価方法
JP2004046438A (ja) 2002-07-10 2004-02-12 Nippon Telegr & Teleph Corp <Ntt> テキスト検索方法及び装置及びテキスト検索プログラム及びテキスト検索プログラムを格納した記憶媒体
US20040034633A1 (en) 2002-08-05 2004-02-19 Rickard John Terrell Data search system and method using mutual subsethood measures
EP1391640B1 (de) * 2002-08-23 2005-02-23 FESTO AG & Co Dichtungsring
US7151864B2 (en) 2002-09-18 2006-12-19 Hewlett-Packard Development Company, L.P. Information research initiated from a scanned image media
US7158983B2 (en) 2002-09-23 2007-01-02 Battelle Memorial Institute Text analysis technique
US6886010B2 (en) 2002-09-30 2005-04-26 The United States Of America As Represented By The Secretary Of The Navy Method for data and text mining and literature-based discovery
JP2004139150A (ja) 2002-10-15 2004-05-13 Ricoh Co Ltd 文書検索装置、プログラム及び記憶媒体
US7970832B2 (en) * 2002-11-20 2011-06-28 Return Path, Inc. Electronic message delivery with estimation approaches and complaint, bond, and statistics panels
US20040133560A1 (en) 2003-01-07 2004-07-08 Simske Steven J. Methods and systems for organizing electronic documents
US7725544B2 (en) * 2003-01-24 2010-05-25 Aol Inc. Group based spam classification
GB2399427A (en) * 2003-03-12 2004-09-15 Canon Kk Apparatus for and method of summarising text
US7945567B2 (en) 2003-03-17 2011-05-17 Hewlett-Packard Development Company, L.P. Storing and/or retrieving a document within a knowledge base or document repository
US6947930B2 (en) 2003-03-21 2005-09-20 Overture Services, Inc. Systems and methods for interactive search query refinement
US7051023B2 (en) * 2003-04-04 2006-05-23 Yahoo! Inc. Systems and methods for generating concept units from search queries
US7149748B1 (en) 2003-05-06 2006-12-12 Sap Ag Expanded inverted index
WO2004107213A1 (en) * 2003-05-31 2004-12-09 Nhn Corporation A method of managing websites registered in search engine and a system thereof
US7272853B2 (en) * 2003-06-04 2007-09-18 Microsoft Corporation Origination/destination features and lists for spam prevention
US7051014B2 (en) 2003-06-18 2006-05-23 Microsoft Corporation Utilizing information redundancy to improve text searches
US7254580B1 (en) 2003-07-31 2007-08-07 Google Inc. System and method for selectively searching partitions of a database
JP2005056233A (ja) * 2003-08-06 2005-03-03 Nec Corp 移動体通信装置、移動体通信装置の電子メールの受信動作方法及びその電子メールの受信動作プログラム
US20050043940A1 (en) 2003-08-20 2005-02-24 Marvin Elder Preparing a data source for a natural language query
US20050060295A1 (en) * 2003-09-12 2005-03-17 Sensory Networks, Inc. Statistical classification of high-speed network data through content inspection
US20050071310A1 (en) 2003-09-30 2005-03-31 Nadav Eiron System, method, and computer program product for identifying multi-page documents in hypertext collections
US7346839B2 (en) * 2003-09-30 2008-03-18 Google Inc. Information retrieval based on historical data
US20050071328A1 (en) 2003-09-30 2005-03-31 Lawrence Stephen R. Personalization of web search
US7257564B2 (en) * 2003-10-03 2007-08-14 Tumbleweed Communications Corp. Dynamic message filtering
US7240064B2 (en) 2003-11-10 2007-07-03 Overture Services, Inc. Search engine with hierarchically stored indices
CN100495392C (zh) 2003-12-29 2009-06-03 西安迪戈科技有限责任公司 一种智能搜索方法
US7206389B1 (en) 2004-01-07 2007-04-17 Nuance Communications, Inc. Method and apparatus for generating a speech-recognition-based call-routing system
US20060294124A1 (en) 2004-01-12 2006-12-28 Junghoo Cho Unbiased page ranking
US20050198559A1 (en) 2004-03-08 2005-09-08 Kabushiki Kaisha Toshiba Document information management system, document information management program, and document information management method
US20050216564A1 (en) 2004-03-11 2005-09-29 Myers Gregory K Method and apparatus for analysis of electronic communications containing imagery
US20050256848A1 (en) 2004-05-13 2005-11-17 International Business Machines Corporation System and method for user rank search
US7155243B2 (en) 2004-06-15 2006-12-26 Tekelec Methods, systems, and computer program products for content-based screening of messaging service messages
JP2006026844A (ja) * 2004-07-20 2006-02-02 Fujitsu Ltd ポリッシングパッド、それを備えた研磨装置及び貼り付け装置
US7567959B2 (en) 2004-07-26 2009-07-28 Google Inc. Multiple index based information retrieval system
US7702618B1 (en) 2004-07-26 2010-04-20 Google Inc. Information retrieval system for archiving multiple document versions
US7599914B2 (en) 2004-07-26 2009-10-06 Google Inc. Phrase-based searching in an information retrieval system
US7584175B2 (en) 2004-07-26 2009-09-01 Google Inc. Phrase-based generation of document descriptions
US7711679B2 (en) 2004-07-26 2010-05-04 Google Inc. Phrase-based detection of duplicate documents in an information retrieval system
US7580921B2 (en) * 2004-07-26 2009-08-25 Google Inc. Phrase identification in an information retrieval system
US7426507B1 (en) 2004-07-26 2008-09-16 Google, Inc. Automatic taxonomy generation in search results using phrases
US7580929B2 (en) 2004-07-26 2009-08-25 Google Inc. Phrase-based personalization of searches in an information retrieval system
US7536408B2 (en) 2004-07-26 2009-05-19 Google Inc. Phrase-based indexing in an information retrieval system
US8407239B2 (en) * 2004-08-13 2013-03-26 Google Inc. Multi-stage query processing system and method for use with tokenspace repository
US8504565B2 (en) 2004-09-09 2013-08-06 William M. Pitts Full text search capabilities integrated into distributed file systems— incrementally indexing files
US7962510B2 (en) * 2005-02-11 2011-06-14 Microsoft Corporation Using content analysis to detect spam web pages
US20060200464A1 (en) 2005-03-03 2006-09-07 Microsoft Corporation Method and system for generating a document summary
US7552230B2 (en) * 2005-06-15 2009-06-23 International Business Machines Corporation Method and apparatus for reducing spam on peer-to-peer networks
US20080005064A1 (en) * 2005-06-28 2008-01-03 Yahoo! Inc. Apparatus and method for content annotation and conditional annotation retrieval in a search context
US7454449B2 (en) 2005-12-20 2008-11-18 International Business Machines Corporation Method for reorganizing a set of database partitions
WO2007123919A2 (en) 2006-04-18 2007-11-01 Gemini Design Technology, Inc. Method for ranking webpages via circuit simulation
US8117223B2 (en) 2007-09-07 2012-02-14 Google Inc. Integrating external related phrase information into a phrase-based indexing information retrieval system

Also Published As

Publication number Publication date
CA2513850A1 (en) 2006-01-26
US8078629B2 (en) 2011-12-13
US7580921B2 (en) 2009-08-25
NO20053638L (no) 2006-01-27
EP1622053B1 (de) 2011-03-02
KR101190230B1 (ko) 2012-10-12
CA2513850C (en) 2014-01-14
JP4976666B2 (ja) 2012-07-18
CN1728142A (zh) 2006-02-01
CN1728142B (zh) 2011-01-12
JP2006048683A (ja) 2006-02-16
KR20060048779A (ko) 2006-05-18
EP1622053A1 (de) 2006-02-01
AU2005203240B2 (en) 2009-01-22
AU2005203240A1 (en) 2006-02-09
US20060294155A1 (en) 2006-12-28
DE602005026609D1 (de) 2011-04-14
BRPI0503781A (pt) 2006-03-14
US7603345B2 (en) 2009-10-13
US20110131223A1 (en) 2011-06-02
US20060018551A1 (en) 2006-01-26
NO20053638D0 (no) 2005-07-26

Similar Documents

Publication Publication Date Title
ATE500559T1 (de) Identifizierung von ausdrücken in einem informationsabfragesystem
ATE521946T1 (de) Phrasen basiertes suchen in einem system zur informationsabfrage
ATE521947T1 (de) Phrasen basiertes indizieren in einem system zur informationsabfrage
ATE529811T1 (de) Phrasen-basierte generierung von dokumentbeschreibungen
WO2006081325A3 (en) Multiple index based information retrieval system
CA2677307A1 (en) Searching structured geographical data
BRPI0506675A (pt) sistema, métodos, interfaces e software para estender resultados de busca além dos limites definidos pela consulta inicial
RU2010107150A (ru) Идентификация семантических отношений в косвенной речи
WO2006041950A3 (en) Classification-expanded indexing and retrieval of classified documents
MXPA05007079A (es) Diseminacion de resultados de motor de busqueda utilizando informacion de categoria de pagina.
WO2007019311A3 (en) Systems for and methods of finding relevant documents by analyzing tags
WO2007002412A3 (en) Systems and methods for retrieving data
RU2008145584A (ru) Аннотация посредством поиска
JP2010538375A5 (de)
Benkoussas et al. Collaborative Filtering for Book Recommandation.
Arampatzis et al. Fusion vs. two-stage for multimodal retrieval
Rastegar-Mojarad et al. Semantic information retrieval: exploring dependency and word embedding features in biomedical information retrieval
Abdulahhad et al. Matching fusion with conceptual indexing
Jadi et al. Evaluating Lexical Similarity to build Sentiment Similarity
Wang et al. PRIS at TREC 2010: Related Entity Finding Task of Entity Track.
WO2009011068A1 (ja) 入力支援システム
Zin et al. Structured XML Retrieval System using Relevance Propagation
van Beek The forge and the funeral
Bo-Hyun et al. Concept and Attribute based Answer Retrieval
ROHAN ft Ju. a at'a Tatr ie

Legal Events

Date Code Title Description
RER Ceased as to paragraph 5 lit. 3 law introducing patent treaties