WO2007062215A3 - Method, system and code for retrieving texts - Google Patents
Method, system and code for retrieving texts
- Publication number
- WO2007062215A3 (application PCT/US2006/045397)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- word
- texts
- code
- query
- retrieving
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/31—Indexing; Data structures therefor; Storage structures
- G06F16/313—Selection or weighting of terms for indexing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A computer-assisted method, code, and system for retrieving one or more selected texts from a collection of texts are disclosed. The method employs a word-affinity matrix to construct a search vector composed of a plurality of vector terms. Each term contains a query word Wqm and a coefficient for that query word, related to the inverse of the sum of all P(Wqm|Wqn) over all other query words Wqn, where P(Wm|Wn) is the conditional probability of finding word Wm in a text containing word Wn within the collection of texts.
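The coefficient rule in the abstract can be illustrated with a minimal sketch. The abstract only says the coefficient is "related to" the inverse of the summed conditional probabilities, so the plain reciprocal used here, the `eps` smoothing term, the whitespace tokenizer, and the function names `affinity_matrix` and `search_vector` are all assumptions made for illustration, not the method as claimed.

```python
from collections import defaultdict
from itertools import permutations


def tokenize(text):
    """Return the set of distinct lowercase words (a simplifying assumption)."""
    return set(text.lower().split())


def affinity_matrix(texts):
    """Estimate P(Wm | Wn) as the fraction of texts containing Wn that also contain Wm.

    The result maps (wm, wn) -> P(Wm | Wn); pairs that never co-occur are absent.
    """
    doc_count = defaultdict(int)   # number of texts containing each word
    pair_count = defaultdict(int)  # number of texts containing both words
    for text in texts:
        words = tokenize(text)
        for w in words:
            doc_count[w] += 1
        for wm, wn in permutations(words, 2):
            pair_count[(wm, wn)] += 1
    return {(wm, wn): c / doc_count[wn] for (wm, wn), c in pair_count.items()}


def search_vector(query_words, affinity, eps=1e-6):
    """Weight each query word by 1 / (sum of P(Wqm | Wqn) over the other query words).

    eps is a hypothetical smoothing constant that avoids division by zero
    when a query word never co-occurs with the other query words.
    """
    vector = {}
    for wqm in query_words:
        total = sum(
            affinity.get((wqm, wqn), 0.0) for wqn in query_words if wqn != wqm
        )
        vector[wqm] = 1.0 / (total + eps)
    return vector


texts = ["cat dog", "cat fish", "dog bird"]
affinity = affinity_matrix(texts)
vector = search_vector(["cat", "dog", "fish"], affinity)
```

Under these assumptions, a query word that is strongly predicted by the other query words (here `cat`) receives a smaller coefficient than one that is only weakly predicted by them (here `dog`), so redundant words contribute less to the search vector.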
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US73927205P | 2005-11-22 | 2005-11-22 | |
US60/739,272 | 2005-11-22 | | |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2007062215A2 (en) | 2007-05-31 |
WO2007062215A3 (en) | 2007-12-13 |
Family
ID=38067955
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2006/045397, WO2007062215A2 (en) | Method, system and code for retrieving texts | 2005-11-22 | 2006-11-22 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2007062215A2 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104615723B (en) * | 2015-02-06 | 2018-08-07 | 百度在线网络技术(北京)有限公司 (Baidu Online Network Technology (Beijing) Co., Ltd.) | The determination method and apparatus of query word weighted value |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050120011A1 (en) * | 2003-11-26 | 2005-06-02 | Word Data Corp. | Code, method, and system for manipulating texts |
- 2006-11-22: WO application PCT/US2006/045397 filed; patent WO2007062215A2 (en); status: active (Application Filing)
Also Published As
Publication number | Publication date |
---|---|
WO2007062215A2 (en) | 2007-05-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Solé et al. | Diversity, competition, extinction: the ecophysics of language change | |
WO2004066062A3 (en) | A system and method for providing content warehouse | |
RU2004129675A (en) | SYSTEM FOR IDENTIFICATION OF REFRACTION USING MACHINE TRANSLATION TECHNOLOGY | |
WO2004084099A3 (en) | Corpus clustering, confidence refinement, and ranking for geographic text search and information retrieval | |
WO2007148128A3 (en) | A data entry system and method of entering data | |
WO2008070877A3 (en) | Online computer-aided translation | |
CN108717410B (en) | Named entity identification method and system | |
ATE401609T1 (en) | LEXICON WITH DESCRIBED DATA AND PROCEDURES FOR THEIR CONSTRUCTION AND USE | |
WO2004086192A3 (en) | Systems and methods for interactive search query refinement | |
WO2005017765A3 (en) | Parallel processing array | |
WO2005050370A3 (en) | System and method of searching for image data in a storage medium | |
GB2463221A (en) | Biological database index and query searching | |
WO2007062215A3 (en) | Method, system and code for retrieving texts | |
WO2005106700A3 (en) | Set based data store | |
WO2008114086A3 (en) | Combined data entry systems | |
WO2005031602A3 (en) | Method for organising a database | |
Chen et al. | A two-stage approach to Chinese part-of-speech tagging | |
Rankine et al. | Intentional use of Te reo Maori in New Zealand newspapers in 2007 | |
Santoro et al. | Italian Sign Language (LIS) Corpus | |
Chowdhury | A simple yet effective approach for named entity recognition from transcribed broadcast news | |
Day | CPD scheme launched. | |
Young | Hume, Patrick, first earl of Marchmont (1641–1724) | |
Ahlava | Interview with Antti Ahlava | |
WO2008022307A3 (en) | Systems and methods for implementing a double precision arithmetic memory architecture | |
CN100501741C (en) | Full text enquiring method and device thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | EP: PCT application non-entry in European phase | Ref document number: 06838392; Country of ref document: EP; Kind code of ref document: A2 |