UST900006I4 - Information storage and retrieval system and method - Google Patents
Information storage and retrieval system and method
- Publication number
- UST900006I4
- Authority
- US
- United States
- Prior art keywords
- documents
- search
- stored
- occurrence
- data base
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06K—GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
- G06K17/00—Methods or arrangements for effecting co-operative working between equipments covered by two or more of main groups G06K1/00 - G06K15/00, e.g. automatic card files incorporating conveying and reading operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/901—Indexing; Data structures therefor; Storage structures
- G06F16/9017—Indexing; Data structures therefor; Storage structures using directory or table look-up
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10S—TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10S707/00—Data processing: database and file management or data structures
- Y10S707/99931—Database or file accessing
- Y10S707/99933—Query processing, i.e. searching
- Y10S707/99935—Query augmenting and refining, e.g. inexact access
Definitions
- A probability calculator determines the likelihood that a search symbol occurs at least once in a given stored document in the system's data base.
- The system and method cause the query text to be scanned to determine which search symbols it contains.
- The term "overlap" designates a search symbol that appears both in the query and in a given stored document.
- The particular document in the system's data base having the smallest joint probability of occurrence of overlap search symbols is designated as having the highest relevance potential within the data base to a given query.
- The stored document having the next larger joint probability of occurrence of overlap search symbols has the next highest relevance potential. In this manner, any selected number of relevant stored documents may be output by the system and method, in order of relevance, either as identification numbers of potentially pertinent documents or as the documents per se in full text form.
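The ranking described in the definitions above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the publication: the probability of a symbol occurring at least once is estimated as the fraction of stored documents containing it, the joint probability is taken as a plain product (which assumes the symbols occur independently), and documents with no overlap symbols are left out of the ranking. The function names and the sample data are hypothetical.

```python
from math import prod


def symbol_probabilities(documents):
    """Estimate the probability that each search symbol occurs at least once
    in a stored document. ASSUMPTION: document frequency divided by the
    number of stored documents; the source does not give the estimator."""
    n = len(documents)
    counts = {}
    for symbols in documents.values():
        for s in symbols:
            counts[s] = counts.get(s, 0) + 1
    return {s: c / n for s, c in counts.items()}


def scan_query(query_text, known_symbols):
    """Scan the query text and return the known search symbols it contains."""
    words = {w.strip(".,;:?!").lower() for w in query_text.split()}
    return words & known_symbols


def rank_documents(query_text, documents, top_k=None):
    """Return document ids ordered by ascending joint probability of their
    overlap symbols (smallest joint probability = highest relevance)."""
    probs = symbol_probabilities(documents)
    query_symbols = scan_query(query_text, set(probs))
    scored = []
    for doc_id, symbols in documents.items():
        overlap = query_symbols & symbols
        if not overlap:
            continue  # ASSUMPTION: documents without overlap are not ranked
        joint = prod(probs[s] for s in overlap)  # ASSUMPTION: independence
        scored.append((joint, doc_id))
    scored.sort()
    return [doc_id for _, doc_id in scored[:top_k]]


# Hypothetical data base: document id -> set of search symbols it contains.
docs = {
    "DOC1": {"retrieval", "probability"},
    "DOC2": {"retrieval"},
    "DOC3": {"storage", "retrieval", "probability", "query"},
    "DOC4": {"storage"},
}
print(rank_documents("Probability of retrieval for a query", docs))
# -> ['DOC3', 'DOC1', 'DOC2']
```

In this toy data base DOC3 shares three symbols with the query, including the rare symbol "query", so its joint probability (0.75 × 0.5 × 0.25) is the smallest and it is ranked first.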
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
AN INFORMATION STORAGE AND RETRIEVAL SYSTEM AND METHOD CAPABLE OF HANDLING QUERIES IN SENTENCE FORM AND PRESENTING RESPONSES EITHER AS IDENTIFICATION NUMBERS OF POTENTIALLY PERTINENT DOCUMENTS OR AS DOCUMENTS IN FULL TEXT FORM. STORED DOCUMENTS CONTAINING SO-CALLED SEARCH SYMBOLS ARE KNOWN TO THE SYSTEM. A PROBABILITY CALCULATOR DETERMINES THE LIKELIHOOD OF THE OCCURRENCE OF A SEARCH SYMBOL AT LEAST ONCE IN A GIVEN STORED DOCUMENT IN THE SYSTEM'S DATA BASE. THE SYSTEM AND METHOD CAUSE THE QUERY TEXT TO BE SCANNED SO AS TO DETERMINE WHICH SEARCH SYMBOLS ARE CONTAINED THEREIN. THE TERM OVERLAP IS USED TO DESIGNATE A SEARCH SYMBOL IN THE QUERY AND ALSO IN A GIVEN STORED DOCUMENT. THE PARTICULAR DOCUMENT IN THE SYSTEM'S DATA BASE HAVING THE SMALLEST JOINT PROBABILITY OF OCCURRENCE OF OVERLAP SEARCH SYMBOLS IS DESIGNATED AS HAVING THE HIGHEST RELEVANCE POTENTIAL WITHIN THE DATA BASE TO A GIVEN QUERY. THE STORED DOCUMENT HAVING THE NEXT LARGER JOINT PROBABILITY OF OCCURRENCE OF OVERLAP SEARCH SYMBOLS HAS THE NEXT HIGHEST RELEVANCE POTENTIAL. IN THIS MANNER, ANY SELECTED NUMBER OF RELEVANT STORED DOCUMENTS MAY BE OUTPUTTED BY THE SYSTEM AND METHOD, IN THE ORDER OF RELEVANCE, AS EITHER IDENTIFICATION NUMBERS OF POTENTIALLY PERTINENT DOCUMENTS OR AS THE DOCUMENTS PER SE IN FULL TEXT FORM.
Description
DEFENSIVE PUBLICATION UNITED STATES PATENT OFFICE Published at the request of the applicant or owner in accordance with the Notice of Dec. 16, 1969, 869 O.G. 687. The abstracts of Defensive Publication applications are identified by distinctly numbered series and are arranged chronologically. The heading of each abstract indicates the number of pages of specification, including claims and sheets of drawings contained in the application as originally filed. The files of these applications are available to the public for inspection, and reproductions may be purchased for 30 cents a sheet.
Defensive Publication applications have not been examined as to the merits of alleged invention. The Patent Office makes no assertion as to the novelty of the disclosed subject matter.
PUBLISHED JULY 18, 1972 T900,006 INFORMATION STORAGE AND RETRIEVAL SYSTEM AND METHOD Matthews P. Perriens, Rockville, Md., and John H. Williams, Jr., Annandale, Va., assignors to International Business Machines Corporation, Armonk, N.Y. Continuation of application Ser. No. 736,837, June 13, 1968. This application Apr. 19, 1971, Ser. No. 135,467. Int. Cl. G06f 1/00, 7/00, 15/00. U.S. Cl. 340-172.5. 3 Sheets Drawing. 21 Pages Specification.
[Figure: block diagram. Legible block labels: INPUT, CONCORDANCE GENERATOR, SUBSET LIST GENERATOR, JOINT PROBABILITY LIST GENERATOR, PROBABILITY CALCULATOR, LIST INVERTER, LIST REARRANGER, DOCUMENT SELECTOR.]
An information storage and retrieval system and method capable of handling queries in sentence form and presenting responses either as identification numbers of potentially pertinent documents or as documents in full text form. Stored documents containing so-called search symbols are known to the system. A probability calculator determines the likelihood of the occurrence of a search symbol at least once in a given stored document in the system's data base. The system and method cause the query text to be scanned so as to determine which search symbols are contained therein. The term overlap is used to designate a search symbol in the query and also in a given stored document. The particular document in the system's data base having the smallest joint probability of occurrence of overlap search symbols is designated as having the highest relevance potential within the data base to a given query. The stored document having the next larger joint probability of occurrence of overlap search symbols has the next highest relevance potential. In this manner, any selected number of relevant stored documents may be outputted by the system and method, in the order of relevance, as either identification numbers of potentially pertinent documents or as the documents per se in full text form.
July 18, 1972. M. P. PERRIENS ET AL. T900,006. INFORMATION STORAGE AND RETRIEVAL SYSTEM AND METHOD. Original Filed June 13, 1968. 3 Sheets-Sheet 1.
[FIG. 1: block diagram. Legible block labels: INPUT, CONCORDANCE GENERATOR, SUBSET LIST GENERATOR, JOINT PROBABILITY LIST GENERATOR, PROBABILITY CALCULATOR, LIST INVERTER, LIST REARRANGER, SCANNER, DOCUMENT SELECTOR, OUTPUT.]
3 Sheets-Sheet 2.
[FIG. 2: block diagram. Legible block labels: INPUT, OUTPUT, INPUT OUTPUT CHANNEL, CORE STORAGE, ARITHMETIC UNIT, CONTROL UNIT.]
[FIG. 3: CONCORDANCE table with columns SEARCH SYMBOLS and DOCUMENT NUMBERS; entries not legible.]
[FIG. 4: DOCUMENTS and LIST I tables of search symbols and document numbers; entries not legible.]
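The drawing labels mention a concordance that pairs search symbols with document numbers. As a rough illustration only, the Python sketch below builds such a concordance from per-document symbol lists and then inverts it back into a document-to-symbols list. The function names and sample data are hypothetical, and whether the "list inverter" block in FIG. 1 performs exactly this inversion is an assumption; the legible text does not say.

```python
from collections import defaultdict


def build_concordance(documents):
    """Map each search symbol to the ordered list of document numbers
    in which it occurs at least once."""
    concordance = defaultdict(list)
    for doc_number in sorted(documents):
        for symbol in sorted(set(documents[doc_number])):
            concordance[symbol].append(doc_number)
    return dict(concordance)


def invert(concordance):
    """Invert symbol -> document numbers into document number -> symbols.
    ASSUMPTION: illustrates one possible reading of a list inversion step."""
    by_document = defaultdict(list)
    for symbol in sorted(concordance):
        for doc_number in concordance[symbol]:
            by_document[doc_number].append(symbol)
    return dict(by_document)


# Hypothetical stored documents: document number -> search symbols found in it.
stored = {1: ["SS1", "SS3"], 2: ["SS2", "SS3"], 3: ["SS1"]}

concordance = build_concordance(stored)
print(concordance)          # {'SS1': [1, 3], 'SS3': [1, 2], 'SS2': [2]}
print(invert(concordance))  # {1: ['SS1', 'SS3'], 3: ['SS1'], 2: ['SS2', 'SS3']}
```

Keeping both directions available lets a query scan look up candidate documents by symbol, and lets a per-document step list the symbols a candidate contains.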
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13546771A | 1971-04-19 | 1971-04-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
UST900006I4 (en) | 1972-07-18 |
Family
ID=22468234
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US900006D Pending UST900006I4 (en) | 1971-04-19 | 1971-04-19 | Information storage and retrieval system and method |
Country Status (1)
Country | Link |
---|---|
US (1) | UST900006I4 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5072367A (en) * | 1987-10-01 | 1991-12-10 | International Business Machines Corporation | System using two passes searching to locate record having only parameters and corresponding values of an input record |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US4807182A (en) | Apparatus and method for comparing data groups | |
Franklin | On an improved algorithm for decentralized extrema finding in circular configurations of processors | |
US6523030B1 (en) | Sort system for merging database entries | |
US11321336B2 (en) | Systems and methods for enterprise data search and analysis | |
Negri et al. | Meaning in use | |
JPH0636168B2 (en) | Machine translation processor | |
UST900006I4 (en) | Information storage and retrieval system and method | |
Hutchinson | Protecting privacy in the archives: Supervised machine learning and born-digital records | |
Terasawa et al. | Locality sensitive pseudo-code for document images | |
US3293615A (en) | Current addressing system | |
Freeman | AUDACIOUS: An Experiment with an On-line, Interactive Reference Retrieval System Using the Universal Decimal Classification as the Index Language in the Field of Nuclear Science | |
Jaster et al. | The state of the art of coordinate indexing | |
Black | The keyword: its use in abstracting, indexing and retrieving information | |
Freeman | Computers and classification systems | |
Farradane et al. | A test of relational indexing integrity by conversion to a permuted alphabetical index | |
Oettinger et al. | Linguistic and machine methods for compiling and updating the Harvard automatic dictionary | |
Bowman et al. | A chemically oriented information storage and retrieval system. I. storage and verification of structural information | |
Josselson | Research in machine translation | |
Scheponik et al. | LDA Topic Analysis of a Cybersecurity Textbook | |
Costello Jr | Computer requirements for inverted coordinate indexes | |
Magnuson Jr et al. | PL/M: a high level language for the INTEL MCS-8 8008 CPU | |
Macdonald et al. | The photoscopic language translator | |
Zaki et al. | A formal design of an Arabic Text Formatter for microcomputers | |
Yamaguchi | An approach to data compatibility: a generalized data access method. | |
Stone | Standards for computer-aided content analysis: The Pisa conventions and recommendations |