Information exploration systems and methods

Info

Publication number
WO2007059225A3
Authority
WO
Grant status
Application
Prior art keywords
cluster
information
exploration
phrases
representative
Application number
PCT/US2006/044367
Other languages
French (fr)
Other versions
WO2007059225A2 (en)
Inventor
Kevin B Thompson
Matthew S Sommer
Original Assignee
Engenium Corp
Kevin B Thompson
Matthew S Sommer
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2005-11-15
Filing date
2006-11-15
Publication date
2009-05-07

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/3061 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F17/30705 Clustering or classification
    • G06F17/3071 Clustering or classification including class or cluster creation or modification
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access

Abstract

Disclosed information exploration system and method embodiments operate on a document set to determine a document cluster hierarchy. An exclusionary phrase index is determined for each cluster, and representative phrases are selected from the indexes. The selection process may enforce pathwise uniqueness and balanced sub-cluster representation. The representative phrases may be used as cluster labels in an interactive information exploration interface.
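The selection process described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Cluster` class, the use of word bigrams as phrases, the reading of an "exclusionary" index as phrases that occur in a cluster but in no document outside it, and the round-robin rule for balanced sub-cluster representation are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical node in a document cluster hierarchy (not from the patent)."""
    name: str
    docs: list = field(default_factory=list)      # documents held directly
    children: list = field(default_factory=list)  # sub-clusters

    def all_docs(self):
        """Documents of this cluster and of every descendant."""
        out = list(self.docs)
        for child in self.children:
            out.extend(child.all_docs())
        return out


def phrases(doc, n=2):
    """Word n-grams, used here as a stand-in for real phrase extraction."""
    words = doc.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def exclusionary_index(cluster, corpus):
    """Assumed reading: phrases found in the cluster and in no outside document."""
    inside = cluster.all_docs()
    outside = [d for d in corpus if d not in inside]
    banned = set().union(*(phrases(d) for d in outside))
    counts = Counter(p for d in inside for p in phrases(d))
    return Counter({p: c for p, c in counts.items() if p not in banned})


def label(cluster, corpus, k=2, used=frozenset(), labels=None):
    """Choose up to k representative phrases per cluster, top-down."""
    labels = {} if labels is None else labels
    index = exclusionary_index(cluster, corpus)
    ranked = sorted(index.items(), key=lambda kv: (-kv[1], kv[0]))
    # Pathwise uniqueness: skip phrases already used by an ancestor.
    candidates = [p for p, _ in ranked if p not in used]
    if cluster.children:
        # Balanced representation: take phrases from each sub-cluster in turn.
        queues = [[p for p in candidates
                   if any(p in phrases(d) for d in child.all_docs())]
                  for child in cluster.children]
        chosen, turn = [], 0
        while len(chosen) < k and any(queues):
            queue = queues[turn % len(queues)]
            while queue and queue[0] in chosen:
                queue.pop(0)
            if queue:
                chosen.append(queue.pop(0))
            turn += 1
    else:
        chosen = candidates[:k]
    labels[cluster.name] = chosen
    for child in cluster.children:
        label(child, corpus, k, used | set(chosen), labels)
    return labels


# Toy hierarchy: one parent cluster with two leaf sub-clusters.
solar = Cluster("solar", docs=["solar panel efficiency", "solar panel cost report"])
wind = Cluster("wind", docs=["wind turbine noise", "wind turbine cost report"])
root = Cluster("energy", children=[solar, wind])
labels = label(root, root.all_docs())
```

In this toy hierarchy the phrase "cost report" appears in both sub-clusters, so it is excluded from each child's index and can only label the parent, while each child is labeled from phrases exclusive to it.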
PCT/US2006/044367, priority date 2005-11-15, filing date 2006-11-15: Information exploration systems and methods, WO2007059225A3 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US11/274,435 2005-11-15
US11274435 US7676463B2 (en) 2005-11-15 2005-11-15 Information exploration systems and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0810333A GB0810333D0 (en) 2005-11-15 2006-11-15 Information exploration systems and methods
CA 2629999 CA2629999C (en) 2005-11-15 2006-11-15 Information exploration systems and methods

Publications (2)

Publication Number Publication Date
WO2007059225A2 (en) 2007-05-24
WO2007059225A3 (en) 2009-05-07

Family

ID=38042113

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/044367 WO2007059225A3 (en) 2005-11-15 2006-11-15 Information exploration systems and methods

Country Status (4)

Country Link
US (1) US7676463B2 (en)
CA (1) CA2629999C (en)
GB (1) GB0810333D0 (en)
WO (1) WO2007059225A3 (en)

Families Citing this family (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8645137B2 (en) 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
US7603351B2 (en) * 2006-04-19 2009-10-13 Apple Inc. Semantic reconstruction
US8131722B2 (en) * 2006-11-20 2012-03-06 Ebay Inc. Search clustering
US20080208847A1 (en) * 2007-02-26 2008-08-28 Fabian Moerchen Relevance ranking for document retrieval
US8166045B1 (en) 2007-03-30 2012-04-24 Google Inc. Phrase extraction using subphrase scoring
US8166021B1 (en) 2007-03-30 2012-04-24 Google Inc. Query phrasification
US7693813B1 (en) 2007-03-30 2010-04-06 Google Inc. Index server architecture using tiered and sharded phrase posting lists
US8744891B1 (en) * 2007-07-26 2014-06-03 United Services Automobile Association (Usaa) Systems and methods for dynamic business decision making
US8510312B1 (en) * 2007-09-28 2013-08-13 Google Inc. Automatic metadata identification
US7814108B2 (en) * 2007-12-21 2010-10-12 Microsoft Corporation Search engine platform
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US20090240498A1 (en) * 2008-03-19 2009-09-24 Microsoft Corporation Similiarity measures for short segments of text
US8996376B2 (en) 2008-04-05 2015-03-31 Apple Inc. Intelligent text-to-speech conversion
US8676815B2 (en) * 2008-05-07 2014-03-18 City University Of Hong Kong Suffix tree similarity measure for document clustering
US20100030549A1 (en) 2008-07-31 2010-02-04 Lee Michael M Mobile device having human language translation capability with positional feedback
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US20110078144A1 (en) * 2009-09-28 2011-03-31 Oracle International Corporation Hierarchical sequential clustering
US20110074789A1 (en) * 2009-09-28 2011-03-31 Oracle International Corporation Interactive dendrogram controls
US20110078194A1 (en) * 2009-09-28 2011-03-31 Oracle International Corporation Sequential information retrieval
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US8682667B2 (en) 2010-02-25 2014-03-25 Apple Inc. User profiling for selecting user specific voice input processing information
US8713021B2 (en) 2010-07-07 2014-04-29 Apple Inc. Unsupervised document clustering using latent semantic density analysis
US8775444B2 (en) * 2010-10-29 2014-07-08 Xerox Corporation Generating a subset aggregate document from an existing aggregate document
US8751496B2 (en) 2010-11-16 2014-06-10 International Business Machines Corporation Systems and methods for phrase clustering
JP5617674B2 (en) * 2011-02-14 2014-09-26 NEC Corporation Inter-document similarity calculation device, inter-document similarity calculation method, and inter-document similarity calculation program
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US8994660B2 (en) 2011-08-29 2015-03-31 Apple Inc. Text correction processing
JP5389130B2 (en) * 2011-09-15 2014-01-15 Toshiba Corporation Document classification apparatus, method and program
JP5639562B2 (en) * 2011-09-30 2014-12-10 Toshiba Corporation Service execution device, service execution method, and service execution program
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US8700583B1 (en) 2012-07-24 2014-04-15 Google Inc. Dynamic tiermaps for large online databases
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
WO2014074917A1 (en) * 2012-11-08 2014-05-15 Cooper & Co Ltd Edwin System and method for divisive textual clustering by label selection using variant-weighted tfidf
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9116974B2 (en) * 2013-03-15 2015-08-25 Robert Bosch Gmbh System and method for clustering data in input and output spaces
WO2014144579A1 (en) 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
US9122681B2 (en) 2013-03-15 2015-09-01 Gordon Villy Cormack Systems and methods for classifying electronic information using advanced active learning techniques
US9501506B1 (en) 2013-03-15 2016-11-22 Google Inc. Indexing system
US9483568B1 (en) 2013-06-05 2016-11-01 Google Inc. Indexing system
WO2014197336A1 (en) 2013-06-07 2014-12-11 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
WO2014197334A3 (en) 2013-06-07 2015-01-29 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
JP2016521948A (en) 2013-06-13 2016-07-25 Apple Inc. System and method for emergency calls initiated by voice command
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US9606986B2 (en) 2014-09-29 2017-03-28 Apple Inc. Integrated word N-gram and class M-gram language models
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks

Citations (3)

Publication number Priority date Publication date Assignee Title
US5999927A (en) * 1996-01-11 1999-12-07 Xerox Corporation Method and apparatus for information access employing overlapping clusters
US6847966B1 (en) * 2002-04-24 2005-01-25 Engenium Corporation Method and system for optimally searching a document database using a representative semantic space
US20050044487A1 (en) * 2003-08-21 2005-02-24 Apple Computer, Inc. Method and apparatus for automatic file clustering into a data-driven, user-specific taxonomy

Family Cites Families (46)

Publication number Priority date Publication date Assignee Title
JP3220885B2 (en) 1993-06-18 2001-10-22 Hitachi, Ltd. Keyword assignment system
US7251637B1 (en) * 1993-09-20 2007-07-31 Fair Isaac Corporation Context vector generation and retrieval
EP0856175A4 (en) * 1995-08-16 2000-05-24 Univ Syracuse Multilingual document retrieval system and method using semantic vector matching
US5819258A (en) * 1997-03-07 1998-10-06 Digital Equipment Corporation Method and apparatus for automatically generating hierarchical categories from large document collections
US6137911A (en) * 1997-06-16 2000-10-24 The Dialog Corporation Plc Test classification system and method
US6134532A (en) * 1997-11-14 2000-10-17 Aptex Software, Inc. System and method for optimal adaptive matching of users to most relevant entity and information in real-time
US6216134B1 (en) * 1998-06-25 2001-04-10 Microsoft Corporation Method and system for visualization of clusters and classifications
US6446061B1 (en) * 1998-07-31 2002-09-03 International Business Machines Corporation Taxonomy generation for document collections
US6360227B1 (en) * 1999-01-29 2002-03-19 International Business Machines Corporation System and method for generating taxonomies with applications to content-based recommendations
WO2000046701A1 (en) * 1999-02-08 2000-08-10 Huntsman Ici Chemicals Llc Method for retrieving semantically distant analogies
US6374217B1 (en) * 1999-03-12 2002-04-16 Apple Computer, Inc. Fast update implementation for efficient latent semantic language modeling
US6408295B1 (en) * 1999-06-16 2002-06-18 International Business Machines Corporation System and method of using clustering to find personalized associations
US6438539B1 (en) * 2000-02-25 2002-08-20 Agents-4All.Com, Inc. Method for retrieving data from an information network through linking search criteria to search strategy
US6658406B1 (en) * 2000-03-29 2003-12-02 Microsoft Corporation Method for selecting terms from vocabularies in a category-based system
JP3672234B2 (en) * 2000-06-12 2005-07-20 International Business Machines Corporation Method for retrieving and ranking documents from a database, computer system, and recording medium
KR100426382B1 (en) * 2000-08-23 2004-04-08 학교법인 김포대학 Method for re-adjusting ranking document based cluster depending on entropy information and Bayesian SOM(Self Organizing feature Map)
US6895406B2 (en) * 2000-08-25 2005-05-17 Seaseer R&D, Llc Dynamic personalization method of creating personalized user profiles for searching a database of information
US7039638B2 (en) * 2001-04-27 2006-05-02 Hewlett-Packard Development Company, L.P. Distributed data clustering system and method
US6742003B2 (en) * 2001-04-30 2004-05-25 Microsoft Corporation Apparatus and accompanying methods for visualizing clusters of data and hierarchical cluster classifications
US7024400B2 (en) * 2001-05-08 2006-04-04 Sunflare Co., Ltd. Differential LSI space-based probabilistic document classifier
JP3845553B2 (en) * 2001-05-25 2006-11-15 International Business Machines Corporation Computer system and program for retrieving and ranking documents in a database
JP3870043B2 (en) * 2001-07-05 2007-01-17 International Business Machines Corporation System, computer program, and server for searching, detecting, and identifying major and outlier clusters in a large-scale database
US20030093411A1 (en) * 2001-11-09 2003-05-15 Minor James M. System and method for dynamic data clustering
US20030154181A1 (en) * 2002-01-25 2003-08-14 Nec Usa, Inc. Document clustering with cluster refinement and model selection capabilities
US7480628B2 (en) * 2002-01-29 2009-01-20 Netcomponents, Inc. Smart multi-search method and system
JP3860046B2 (en) * 2002-02-15 2006-12-20 International Business Machines Corporation Program, system, and recording medium for information processing using a random-sample hierarchical structure
US7177863B2 (en) * 2002-04-26 2007-02-13 International Business Machines Corporation System and method for determining internal parameters of a data clustering program
JP3773888B2 (en) * 2002-10-04 2006-05-10 International Business Machines Corporation Data search system, data search method, program for causing a computer to execute a data search, computer-readable storage medium storing the program, graphical user interface system for displaying retrieved documents, computer-executable program for realizing the graphical user interface, and storage medium storing the program
WO2004042493A3 (en) * 2002-10-24 2006-03-02 Agency Science Tech & Res Method and system for discovering knowledge from text documents
US7280957B2 (en) * 2002-12-16 2007-10-09 Palo Alto Research Center, Incorporated Method and apparatus for generating overview information for hierarchically related information
US7225184B2 (en) * 2003-07-18 2007-05-29 Overture Services, Inc. Disambiguation of search phrases using interpretation clusters
US7346629B2 (en) * 2003-10-09 2008-03-18 Yahoo! Inc. Systems and methods for search processing using superunits
US7191175B2 (en) * 2004-02-13 2007-03-13 Attenex Corporation System and method for arranging concept clusters in thematic neighborhood relationships in a two-dimensional visual display space
US20050222987A1 (en) * 2004-04-02 2005-10-06 Vadon Eric R Automated detection of associations between search criteria and item categories based on collective analysis of user activity data
US20050267872A1 (en) * 2004-06-01 2005-12-01 Yaron Galai System and method for automated mapping of items to documents
US7567959B2 (en) * 2004-07-26 2009-07-28 Google Inc. Multiple index based information retrieval system
US7580929B2 (en) * 2004-07-26 2009-08-25 Google Inc. Phrase-based personalization of searches in an information retrieval system
US7711679B2 (en) * 2004-07-26 2010-05-04 Google Inc. Phrase-based detection of duplicate documents in an information retrieval system
CN100462961C (en) * 2004-11-09 2009-02-18 国际商业机器公司 Method for organizing multi-file and equipment for displaying multi-file
US7356777B2 (en) * 2005-01-26 2008-04-08 Attenex Corporation System and method for providing a dynamic user interface for a dense three-dimensional scene
US7451124B2 (en) * 2005-05-12 2008-11-11 Xerox Corporation Method of analyzing documents
US8010480B2 (en) * 2005-09-30 2011-08-30 Google Inc. Selecting high quality text within identified reviews for display in review snippets
US20070078669A1 (en) * 2005-09-30 2007-04-05 Dave Kushal B Selecting representative reviews for display
US20070078670A1 (en) * 2005-09-30 2007-04-05 Dave Kushal B Selecting high quality reviews for display
US7558769B2 (en) * 2005-09-30 2009-07-07 Google Inc. Identifying clusters of similar reviews and displaying representative reviews from multiple clusters
US7599945B2 (en) * 2006-11-30 2009-10-06 Yahoo! Inc. Dynamic cluster visualization

Non-Patent Citations (4)

Title
CHUANG S.-L. ET AL.: "A Practical Web-Based Approach to Generating Topic Hierarchy for Text Segments", CIKM'04, WASHINGTON, DC, 8 November 2004 (2004-11-08) - 13 November 2004 (2004-11-13), pages 127 - 136 *
KUMAR S. ET AL.: "Personalized Profile Based Search Interface With Ranked and Clustered Display", TECHNICAL REPORT 01-023, UNIV. OF MINNESOTA, DEPT. OF COMPUTER SCIENCE AND ENGINEERING, MINNEAPOLIS, MN, 1 June 2001 (2001-06-01), pages I - III, 1 - 18 *
ZAMIR O. ET AL.: "Clustering Web Documents: A Phrase-Based Method for Grouping Search Engine Results", DOCTOR OF PHILOSOPHY THESIS, UNIV. OF WASHINGTON, DEPT. OF COMPUTER SCIENCE AND ENGINEERING, 1999, pages INTRO-1 - INTRO-6, I - X, 1 - 192, XP055044488 *
ZAMIR O. ET AL.: "Grouper: A Dynamic Clustering Interface to Web Search Results", COMPUTER NETWORKS, vol. 31, no. 11, May 1999 (1999-05-01), pages 1 - 15, XP004304560, DOI: 10.1016/S1389-1286(99)00054-7 *

Also Published As

Publication number Publication date Type
WO2007059225A2 (en) 2007-05-24 application
CA2629999C (en) 2014-12-23 grant
GB2452799A (en) 2009-03-18 application
US7676463B2 (en) 2010-03-09 grant
GB0810333D0 (en) 2008-07-09 grant
US20070112755A1 (en) 2007-05-17 application
CA2629999A1 (en) 2007-05-24 application

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
ENP Entry into the national phase in: Ref document number: 2629999; Country of ref document: CA
NENP Non-entry into the national phase in: Ref country code: DE
ENP Entry into the national phase in: Ref document number: 0810333; Country of ref document: GB; Kind code of ref document: A; Free format text: PCT FILING DATE = 20061115
WWE WIPO information: entry into national phase: Ref document number: 0810333.5; Country of ref document: GB
122 EP: PCT application non-entry in European phase: Ref document number: 06837688; Country of ref document: EP; Kind code of ref document: A2