WO2007062215A3 - Method, system and code for retrieving texts

Method, system and code for retrieving texts

Info

Publication number
WO2007062215A3
Authority
WO
WIPO (PCT)
Prior art keywords
word
texts
code
query
retrieving
Application number
PCT/US2006/045397
Other languages
French (fr)
Other versions
WO2007062215A2 (en)
Inventor
Peter J Dehlinger
Original Assignee
Word Data Corp
Peter J Dehlinger
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2005-11-22
Filing date
2006-11-22
Publication date
2007-12-13
Application filed by Word Data Corp, Peter J Dehlinger
Publication of WO2007062215A2
Publication of WO2007062215A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A computer-assisted method, code, and system for retrieving one or more selected texts from a collection of texts are disclosed. The method employs a word-affinity matrix to construct a search vector composed of a plurality of vector terms, each term containing a query word Wqm and a coefficient for that query word related to the inverse of the sum of all P(Wqm|Wqn) over all other query words Wqn, where P(Wm|Wn) is the conditional probability of finding word Wm in a text containing word Wn within the collection of texts.
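The coefficient rule in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract says only that the coefficient is "related to" the inverse of the summed conditional probabilities, so taking the plain reciprocal is an assumption. The function names (`build_affinity`, `search_vector`), the whitespace tokenization, the `default` fallback for a zero sum, and the four sample texts are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_affinity(texts):
    """Return p(wm, wn), an estimate of P(Wm | Wn) over the collection.

    Assumption: P(Wm | Wn) = (texts containing both) / (texts containing Wn).
    """
    doc_count = defaultdict(int)   # number of texts containing each word
    pair_count = defaultdict(int)  # number of texts containing both words
    for text in texts:
        words = set(text.lower().split())  # hypothetical tokenization
        for w in words:
            doc_count[w] += 1
        for a, b in combinations(sorted(words), 2):
            pair_count[(a, b)] += 1
            pair_count[(b, a)] += 1

    def p(wm, wn):
        n = doc_count.get(wn, 0)
        if n == 0:
            return 0.0
        return pair_count.get((wm, wn), 0) / n

    return p


def search_vector(query_words, p, default=1.0):
    """Map each query word Wqm to 1 / sum of P(Wqm | Wqn) over other Wqn.

    Assumption: the coefficient is exactly the reciprocal of the sum;
    `default` is a hypothetical fallback used when the sum is zero.
    """
    vector = {}
    for wm in query_words:
        total = sum(p(wm, wn) for wn in query_words if wn != wm)
        vector[wm] = 1.0 / total if total > 0 else default
    return vector


# Hypothetical four-text collection and three-word query.
texts = [
    "laser beam optics",
    "laser beam cutting",
    "laser surgery",
    "beam steel bridge",
]
p = build_affinity(texts)
vector = search_vector(["laser", "beam", "surgery"], p)
for word, coeff in sorted(vector.items(), key=lambda kv: -kv[1]):
    print(word, round(coeff, 3))
```

Under these assumptions, a query word that rarely co-occurs with the other query words ("surgery" here) receives a larger coefficient than one that co-occurs with them often ("laser"), which is the weighting effect the abstract describes.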

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US73927205P 2005-11-22 2005-11-22
US60/739,272 2005-11-22

Publications (2)

Publication Number Publication Date
WO2007062215A2 (en) 2007-05-31
WO2007062215A3 (en) 2007-12-13

Family

ID=38067955

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2006/045397 WO2007062215A2 (en) 2005-11-22 2006-11-22 Method, system and code for retrieving texts

Country Status (1)

Country Link
WO (1) WO2007062215A2 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615723B (en) * 2015-02-06 2018-08-07 百度在线网络技术(北京)有限公司 Method and apparatus for determining query word weight values

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050120011A1 (en) * 2003-11-26 2005-06-02 Word Data Corp. Code, method, and system for manipulating texts

Also Published As

Publication number Publication date
WO2007062215A2 (en) 2007-05-31

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application
NENP Non-entry into the national phase
    Ref country code: DE
122 EP: PCT application non-entry in European phase
    Ref document number: 06838392
    Country of ref document: EP
    Kind code of ref document: A2