WO1998047081A1 - Procede de selection de donnes numeriques - Google Patents

Digital data selection method (Procede de selection de donnes numeriques)

Info

Publication number
WO1998047081A1
Authority
WO
WIPO (PCT)
Prior art keywords
search
data
vector
vectors
database
Prior art date
Application number
PCT/EP1998/000932
Other languages
German (de)
English (en)
Inventor
Michael Buchstaller
Michael Mohr
Original Assignee
Dci Datenbank Für Wirtschaftsinformation Gmbh
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dci Datenbank Für Wirtschaftsinformation Gmbh
Priority to AU62971/98A
Publication of WO1998047081A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • G06F16/24558Binary matching operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22Indexing; Data structures therefor; Storage structures
    • G06F16/2228Indexing structures
    • G06F16/2237Vectors, bitmaps or matrices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24553Query execution of query operations
    • G06F16/24561Intermediate data storage techniques for performance improvement

Definitions

  • The invention relates to a selection method for execution on a digital computer to optimize the selection and access speed of digital data stored on a storage medium, in particular address data.
  • The current epoch is often called the "Information Age". This is related to the possibility, provided by electronic data processing, of processing large amounts of information within a comparatively short time.
  • Symptomatic of this is the constantly increasing number of databases, which are meant to enable users to obtain the information they need or want.
  • Address databases are particularly noteworthy. They are becoming increasingly important, especially in business life, when it comes to establishing contacts with potential suppliers, customers, sales and cooperation partners or the like that have a very specific requirement profile, i.e. meet specific criteria.
  • A high level of flexibility in the database system is particularly important in the area of address databases.
  • The reason for this is that, especially with extensive address databases, changes to the entries have to be made practically continuously.
  • A registered company can expand its product range, merge with another company (registered or unregistered), outsource the part of the company relating to a certain product group, relocate, change its company name, go bankrupt, and much more.
  • Other important criteria are the number of data records entered, the specificity of the information or selection criteria provided for each data record, and the selection and access speed.
  • The majority of the databases used today are based on the hierarchical data model, the network data model or the relational data model. While the first two data models have a tree or network structure, the relational data model is based on tables. All information in a database, i.e. both the objects and their relationships, is represented in the same way by tables. Relationships between objects exist when a value occurs in several relations. Existing relationships are only activated when the database is queried. The execution of queries is relatively complex because the relations do not support a fast search algorithm. Either the individual tuples have to be searched sequentially for a feature, or the algorithm itself has to generate efficiency-enhancing auxiliary data structures.
  • The invention provides a selection method, suitable for execution on a digital computer, to optimize the selection and access speed of digital data stored on a storage medium. The data is stored in a structured manner in a data field, and search vectors relating to individual search criteria are logically linked to one another before the resulting result vector is linked to the data field in order to select the data records that cumulatively meet the desired search criteria.
  • Search vectors, each of which is assigned to a specific search criterion or search term and contains the information as to which data records fulfill this criterion, are logically linked to one another, in particular via the AND, OR and NOT functions.
  • The amount of data to be moved is initially minimal, in particular if only those search vectors are moved that are actually relevant to the query or selection in question.
  • This requires only a minimal transmission time due to the comparatively small amount of data.
  • Only at the end of the linking operation between the search vectors is the resulting vector linked to the data field, and the corresponding data records, each of which is stored only once, are read from the stored data field.
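The sequence described above (combine the search vectors first, touch the data field only at the end) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record contents, the vector names and the helper functions `vec_and`, `vec_or`, `vec_not` and `select` are all assumptions made for the example.

```python
# Hypothetical data field: one row per data record (contents are illustrative).
data_field = [
    {"name": "Firm A", "country": "DE"},
    {"name": "Firm B", "country": "TW"},
    {"name": "Firm C", "country": "DE"},
    {"name": "Firm D", "country": "FR"},
]

# One search vector per criterion: a 1 at position i means record i meets it.
in_germany = [1, 0, 1, 0]
is_supplier = [1, 1, 0, 0]
is_association = [0, 0, 1, 0]


def vec_and(a, b):
    return [x & y for x, y in zip(a, b)]


def vec_or(a, b):
    return [x | y for x, y in zip(a, b)]


def vec_not(a):
    return [1 - x for x in a]


def select(result_vector, field):
    """Link the result vector to the data field: read only the flagged rows."""
    return [row for bit, row in zip(result_vector, field) if bit]


# (in Germany AND supplier) OR association: only vectors are combined here.
result = vec_or(vec_and(in_germany, is_supplier), is_association)

# The data field is read once, at the very end.
hits = select(result, data_field)
```

In this sketch the records themselves are never inspected while the criteria are combined, which mirrors the point that only the comparatively small vectors have to be moved.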
  • The data field can have a row-based structure, with each row containing all the desired information about an entry. It is advantageous for the data fields to be static, i.e. for the maximum number of data records that can be entered to be specified in advance. For a data field that has not yet reached its final state, the remaining free rows are marked with zeros.
  • Search vectors are created by grouping data, for example by country affiliation. The search vectors can, but do not have to, be created in advance (possibly automatically) and are then already available when a query is made.
  • A preferred development of the invention is characterized by regular automatic creation of search vectors. All data records are read automatically, primarily at night, and all those data records that meet a specific search criterion are noted in the search vector corresponding to this criterion.
  • The search vector "country affiliation Taiwan" is given a one at all the places where the assigned data record gives some indication of Taiwan, for example the dialing prefix ++886 in the telephone or fax number, the country code ROC or TW in the address, or other characteristics.
  • The search vector "association" is given a one at all those places where the word "association" is contained in the description in the corresponding data record.
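The automatic creation of search vectors can be pictured as one pass over all records per criterion. The sketch below follows the Taiwan and "association" examples from the text; the record layout (`phone`, `country_code`, `description`) and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical records; field names and contents are illustrative only.
records = [
    {"phone": "++886 2 555 0100", "country_code": "",
     "description": "Electronics trade association"},
    {"phone": "++49 89 555 0101", "country_code": "DE",
     "description": "Machine tools"},
    {"phone": "", "country_code": "TW",
     "description": "Bicycle parts"},
    {"phone": "++49 30 555 0102", "country_code": "ROC",
     "description": "Export association"},
]


def indicates_taiwan(rec):
    """Any hint of Taiwan: dialing prefix ++886 or country code ROC/TW."""
    phone = rec["phone"].replace(" ", "")
    return phone.startswith("++886") or rec["country_code"] in ("ROC", "TW")


def mentions_association(rec):
    return "association" in rec["description"].lower()


def build_search_vector(recs, criterion):
    """Read every record once and note a 1 where the criterion holds."""
    return [1 if criterion(r) else 0 for r in recs]


taiwan = build_search_vector(records, indicates_taiwan)
association = build_search_vector(records, mentions_association)
```

A batch job of this kind could run at night, as the text suggests, and leave the finished vectors ready for the next day's queries.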
  • The advantage of the described method is that a conventional digital computer can be used as a fast online database. No "supercomputers" are required, since the access speed is very high even with complex queries and the memory requirement is very small compared to conventional databases. This is due, among other things, to the fact that all data appear only once, which is why the amount of data moved is small.
  • Standard hardware can be used, for example, to operate an address database which, with 3 million addresses entered and 1000 search criteria available, of which ten, say, are combined in a query, delivers the query result within seconds. Comparable performance is not possible with known database systems, even if the hardware expenditure is many times greater.
  • An address database working according to the method according to the invention can also respond extremely flexibly to any change; it is not necessary to change the programming even when search vectors are added. This even allows the user of an address database to create their own, self-defined search vectors that can be taken into account in their queries. For example, a search vector "black sheep", in which a one is assigned to all entries with which bad experiences have been made in the past, can be linked to the other search vectors via the NOT function when a query is made, so that the "black sheep" are excluded from the outset. The result vectors of previous queries can also be saved as user-specific search vectors for future queries. All of this demonstrates the extraordinary flexibility of the method according to the invention. Another advantage is that search vectors can be created without blocking the database.
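The "black sheep" idea and the reuse of result vectors can be sketched with a simple dictionary of named vectors. The dictionary layout, the vector names and the `query` helper are assumptions for illustration; the point is only that adding a vector is a data change and needs no change to the query code.

```python
# Named search vectors; adding a key needs no change to the query code.
vectors = {
    "supplier": [1, 1, 1, 0, 1],
    "in_germany": [1, 0, 1, 1, 1],
}

# A user-defined vector: 1 marks entries the user had bad experiences with.
vectors["black_sheep"] = [0, 0, 1, 0, 0]


def query(include, exclude=()):
    """AND together the `include` vectors, then AND NOT each `exclude` vector."""
    n = len(next(iter(vectors.values())))
    result = [1] * n
    for name in include:
        result = [r & v for r, v in zip(result, vectors[name])]
    for name in exclude:
        result = [r & (1 - v) for r, v in zip(result, vectors[name])]
    return result


result = query(["supplier", "in_germany"], exclude=["black_sheep"])

# Save the result vector as a user-specific search vector for later queries.
vectors["trusted_german_suppliers"] = result
```

A later query can then refer to `"trusted_german_suppliers"` like any other vector, for example `query(["trusted_german_suppliers"])`.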
  • The method is therefore ideally suited for operation in a multiuser network database.
  • The method according to the invention can be used to carry out queries that are not possible in a conventional database that works with key fields. For example, all exhibitors at a trade fair can be identified in a very short time using the corresponding search vector, without this information necessarily being available in the data field itself.
  • An advantageous development of the method according to the invention provides that in each case a search vector consisting of zeros and ones is assigned to a specific search criterion or search term.
  • The search vector "headquarters in Germany" has ones where the corresponding entries are present in the data field, whereas other country entries lead to a zero in the corresponding search vector.
  • b queries can be queried with a single CPU instruction.
  • the data sets of the search vectors are namely on average a factor of nx 10 ° bit smaller than those data sets for sequential queries of complex terms in conventional databases.
  • The logical combination of zeros and ones is much closer to the machine than the use of complex search terms.
  • The very good ability to coordinate the search vectors is advantageous for network queries.
  • The use of such search vectors increases the computing and transmission speed and reduces the memory requirement.
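Why zero/one vectors are "close to the machine" can be illustrated by packing a vector into an integer, so that one bitwise operation combines many positions at once. This is a sketch under assumptions: Python integers are arbitrary-precision, so the single `&` below stands in for what would be one instruction per machine word in a compiled language, and the names `pack` and `unpack` are hypothetical helpers, not part of the described method.

```python
def pack(bits):
    """Pack a list of 0/1 values into one integer; bit i stands for record i."""
    value = 0
    for i, bit in enumerate(bits):
        if bit:
            value |= 1 << i
    return value


def unpack(value, n):
    """Return the record positions (0-based) whose bit is set."""
    return [i for i in range(n) if (value >> i) & 1]


n = 8
headquarters_in_germany = pack([1, 0, 1, 1, 0, 0, 1, 0])
is_manufacturer = pack([1, 1, 0, 1, 0, 1, 1, 0])

# One bitwise AND combines all n positions at once.
both = headquarters_in_germany & is_manufacturer
positions = unpack(both, n)
```

Stored this way, a criterion costs one bit per record, which is consistent with the small memory requirement the text attributes to search vectors.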
  • An advantageous embodiment provides that, when the search vectors are logically linked, a validation vector is placed first. This means that the validation vector has ones only at positions where there are valid entries, and zeros otherwise. This is necessary, for example, if old entries are to be invalidated: the validation vector then has a zero at the position of the data record concerned.
  • By changing the validation vector accordingly, a data record can also be made only temporarily invalid, for example if, in the case of a supplier database, the company concerned has temporary delivery difficulties.
  • This makes the particularly high flexibility of the method according to the invention clear.
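The role of the validation vector can be sketched as follows: every query starts from it, so clearing one bit hides a record without touching the record or any search vector. The vector contents and the `run_query` helper are assumptions for illustration.

```python
validation = [1, 1, 1, 1, 0, 0]  # last two rows of the data field are still free
supplier = [1, 0, 1, 1, 0, 0]


def run_query(valid, *search_vectors):
    """Start from the validation vector, then AND in each search vector."""
    result = list(valid)
    for vec in search_vectors:
        result = [r & v for r, v in zip(result, vec)]
    return result


before = run_query(validation, supplier)

# Record 2 has temporary delivery difficulties: clear its validation bit.
validation[2] = 0
during = run_query(validation, supplier)

# The difficulties are over: set the bit again; the record itself never changed.
validation[2] = 1
after = run_query(validation, supplier)
```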
  • An advantageous development of the method according to the invention provides that the search vectors are automatically adapted when new entries are made in the database. This means that a check routine is run with each new entry, which queries the search criteria or search terms corresponding to the search vectors for this one entry and assigns a one or a zero to the respective search vector at the position corresponding to this entry, depending on the input.
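A check routine of this kind might look like the sketch below. It assumes a static data field of fixed capacity, as described earlier; the capacity, the two criteria and the function name `add_entry` are hypothetical and serve only to show the mechanism.

```python
MAX_RECORDS = 5  # static data field: capacity fixed in advance (illustrative)

data_field = [None] * MAX_RECORDS
validation = [0] * MAX_RECORDS

# Hypothetical criteria, each paired with its own search vector.
criteria = {
    "in_germany": lambda rec: rec["country"] == "DE",
    "dentist": lambda rec: "dentist" in rec["description"].lower(),
}
search_vectors = {name: [0] * MAX_RECORDS for name in criteria}


def add_entry(record):
    """Store the record in the first free row and run the check routine."""
    pos = data_field.index(None)  # raises ValueError if the field is full
    data_field[pos] = record
    validation[pos] = 1
    # Check routine: evaluate every criterion for this one entry only.
    for name, test in criteria.items():
        search_vectors[name][pos] = 1 if test(record) else 0
    return pos


first = add_entry({"country": "DE", "description": "Dentist, Munich"})
second = add_entry({"country": "FR", "description": "Dentist, Lyon"})
```

Only one position per vector is written for each new entry, so the existing vectors do not have to be rebuilt.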
  • The result vectors that resulted from a combination of search vectors in an earlier query, e.g. "Mailing list '97", can also be saved as new search vectors.
  • An advantageous embodiment of the method according to the invention provides that the digital data are stored on a read-only memory as a non-volatile storage medium.
  • This can be, for example, a hard disk, a data tape, CD-ROM, EPROM or floppy disk.
  • The use of such a method on a digital computer (server) in a network for online querying of a network database is advantageously provided, since high access and query speeds are required especially in the case of online queries in order to avoid unnecessary costs due to computing time or hardware use. This also ensures efficient multi-user operation.
  • The search vectors can be created without intervention in the data field, which means that downtimes are low.
  • The address field is fully populated with n entries.
  • The validation vector is placed first and logically linked by "AND" to a first search vector that indicates the subscribers listed in the address field.
  • A search vector that indicates the participants in the direct debit procedure listed in the address field is also linked by "AND".
  • The resulting vector is then linked to the address field, and the result is the desired data records.
  • The address field prepared for n entries is not fully occupied.
  • The validation vector has the entry 0 at the vacant positions.
  • The example illustrates a query by a dental laboratory to an address database to prepare a mailing that is to be used to establish new business relationships. The mailing is intended for all dentists whose practice is in the German postcode area 8, unless the dentist has an attached laboratory; furthermore, the laboratory's existing customers should not receive the mailing.
  • The search vector "dentist" and the search vector "German postcode area 8" are linked to one another via the AND function and via the NOT function to the search vector "dental laboratory" and the search vector "own customer".
  • The result vector is then linked to the address field, and the result is the desired data records.
  • The mailing is sent to A.B. and to P.Q.
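The dental laboratory query can be worked through in a few lines. The figure with the actual vectors is not reproduced in this text, so the five entries and all vector contents below are invented for illustration; they are merely chosen so that the outcome matches the recipients named above.

```python
# Hypothetical address field; only A.B. and P.Q. are named in the text.
address_field = ["A.B.", "C.D.", "E.F.", "G.H.", "P.Q."]

dentist      = [1, 1, 1, 0, 1]
postcode_8   = [1, 1, 0, 1, 1]  # German postcode area 8
dental_lab   = [0, 1, 0, 1, 0]  # entries with an attached laboratory
own_customer = [0, 0, 1, 0, 0]

# dentist AND postcode area 8 AND NOT dental laboratory AND NOT own customer
result = [
    d & p & (1 - lab) & (1 - own)
    for d, p, lab, own in zip(dentist, postcode_8, dental_lab, own_customer)
]

# Only now is the result vector linked to the address field.
recipients = [name for bit, name in zip(result, address_field) if bit]
```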

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method suitable for execution on a digital computer for optimizing the speed of access to digital data stored on a storage medium. The data are stored in a structured manner in a data field, and search vectors are first logically linked to one another and then to the data field.
PCT/EP1998/000932 1997-04-15 1998-02-18 Procede de selection de donnes numeriques WO1998047081A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU62971/98A AU6297198A (en) 1997-04-15 1998-02-18 Digital data selection method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE19715723.8 1997-04-15
DE19715723A DE19715723A1 (de) 1997-04-15 1997-04-15 Array-Verfahren

Publications (1)

Publication Number Publication Date
WO1998047081A1 (fr) 1998-10-22

Family

ID=7826580

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1998/000932 WO1998047081A1 (fr) 1997-04-15 1998-02-18 Procede de selection de donnes numeriques

Country Status (3)

Country Link
AU (1) AU6297198A (fr)
DE (1) DE19715723A1 (fr)
WO (1) WO1998047081A1 (fr)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002015045A2 (fr) * 2000-08-18 2002-02-21 Ser Technology Deutschland Gmbh Memoire associative
EP1323122A2 (fr) * 1999-09-03 2003-07-02 Whamtech, L.P. Processeur de rapports d'indices
US8015198B2 (en) 2001-08-27 2011-09-06 Bdgb Enterprise Software S.A.R.L. Method for automatically indexing documents
US8276067B2 (en) 1999-04-28 2012-09-25 Bdgb Enterprise Software S.A.R.L. Classification method and apparatus
US8321357B2 (en) 2009-09-30 2012-11-27 Lapir Gennady Method and system for extraction
US9152883B2 (en) 2009-11-02 2015-10-06 Harry Urbschat System and method for increasing the accuracy of optical character recognition (OCR)
US9159584B2 (en) 2000-08-18 2015-10-13 Gannady Lapir Methods and systems of retrieving documents
US9158833B2 (en) 2009-11-02 2015-10-13 Harry Urbschat System and method for obtaining document information
US9213756B2 (en) 2009-11-02 2015-12-15 Harry Urbschat System and method of using dynamic variance networks
WO2018106511A1 (fr) * 2016-12-06 2018-06-14 Amazon Technologies, Inc. Procédé de vente et de recherche d'articles
US10671944B2 (en) 2016-12-06 2020-06-02 Amazon Technologies, Inc. Method for selling and searching for items
US10824964B2 (en) 2016-12-06 2020-11-03 Amazon Technologies, Inc. Method for selling and searching for items

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1989004013A1 (fr) * 1987-10-09 1989-05-05 Nucleus International Corporation Representation de base de donnees relationnelles capable d'operations sur une base de donnees relationnelles
EP0572807A2 (fr) * 1992-06-04 1993-12-08 International Business Machines Corporation Système de recherche de base de données utilisant une matrice bidimensionnelle de points

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3425844C2 (de) * 1984-07-13 1987-05-14 Telefonbau Und Normalzeit Gmbh, 6000 Frankfurt Schaltungsanordnung zum schnellen Auffinden von bestimmten, von einer Fernmeldevermittlungsanlage oder Datenverarbeitungsanlage produzierten Datensätzen in einem Datenspeicher

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8276067B2 (en) 1999-04-28 2012-09-25 Bdgb Enterprise Software S.A.R.L. Classification method and apparatus
EP1323122A4 (fr) * 1999-09-03 2007-01-17 Whamtech L P Processeur de rapports d'indices
EP1323122A2 (fr) * 1999-09-03 2003-07-02 Whamtech, L.P. Processeur de rapports d'indices
US8209481B2 (en) 2000-08-18 2012-06-26 Bdgb Enterprise Software S.A.R.L Associative memory
AU2007221879C1 (en) * 2000-08-18 2011-07-14 Kofax International Switzerland Sàrl Associative memory
WO2002015045A3 (fr) * 2000-08-18 2004-02-26 Ser Technology Deutschland Gmb Memoire associative
AU2001282106B2 (en) * 2000-08-18 2008-04-24 Kofax International Switzerland Sàrl Associative memory
AU2007221879B2 (en) * 2000-08-18 2010-02-18 Kofax International Switzerland Sàrl Associative memory
US7908430B2 (en) 2000-08-18 2011-03-15 Bdgb Enterprise Software S.A.R.L. Associative memory
AU2001282106C1 (en) * 2000-08-18 2011-07-14 Kofax International Switzerland Sàrl Associative memory
US9159584B2 (en) 2000-08-18 2015-10-13 Gannady Lapir Methods and systems of retrieving documents
AU2010202012B2 (en) * 2000-08-18 2013-02-28 Kofax International Switzerland Sàrl Associative memory
WO2002015045A2 (fr) * 2000-08-18 2002-02-21 Ser Technology Deutschland Gmbh Memoire associative
EP1182577A1 (fr) * 2000-08-18 2002-02-27 SER Systeme AG Produkte und Anwendungen der Datenverarbeitung Mémiore associative
US6983345B2 (en) 2000-08-18 2006-01-03 Ser Solutions, Inc. Associative memory
US8015198B2 (en) 2001-08-27 2011-09-06 Bdgb Enterprise Software S.A.R.L. Method for automatically indexing documents
US9141691B2 (en) 2001-08-27 2015-09-22 Alexander GOERKE Method for automatically indexing documents
US8321357B2 (en) 2009-09-30 2012-11-27 Lapir Gennady Method and system for extraction
US9152883B2 (en) 2009-11-02 2015-10-06 Harry Urbschat System and method for increasing the accuracy of optical character recognition (OCR)
US9158833B2 (en) 2009-11-02 2015-10-13 Harry Urbschat System and method for obtaining document information
US9213756B2 (en) 2009-11-02 2015-12-15 Harry Urbschat System and method of using dynamic variance networks
WO2018106511A1 (fr) * 2016-12-06 2018-06-14 Amazon Technologies, Inc. Procédé de vente et de recherche d'articles
US10671944B2 (en) 2016-12-06 2020-06-02 Amazon Technologies, Inc. Method for selling and searching for items
US10824964B2 (en) 2016-12-06 2020-11-03 Amazon Technologies, Inc. Method for selling and searching for items

Also Published As

Publication number Publication date
DE19715723A1 (de) 1998-11-12
AU6297198A (en) 1998-11-11

Similar Documents

Publication Publication Date Title
DE69232425T2 (de) Sortierverfahren in einer verteilten Datenbank und Zugangsverfahren dazu
EP0855062B1 (fr) Systeme d'informations et procede de memorisation de donnees dans un systeme d'informations
DE69533193T2 (de) Paralleles verarbeitungssystem zum durchlaufen einer datenbank
DE68924525T2 (de) Gemeinschaftsobjektszustandsanzeige.
DE68929162T2 (de) Verteilte auf Objekte basierende Systeme
EP0910829B1 (fr) Systeme de banque de donnees
DE69708654T2 (de) Chipkarte mit gemäss einer baumartigen struktur gespeicherten dateien
DE69112694T2 (de) Verfahren zum Betrieb eines Datenverarbeitungssystems zur Ausführung von Datenbanktransaktionen.
DE69333960T2 (de) Namenauflösung in einem Mehrsystem-Netz
DE3689664T2 (de) Verfahren und Gerät zur Verwaltung von veralteten Datenobjekten.
DE3856055T2 (de) Verfahren und Einrichtung, um gleichzeitigen Zugriff zu indizierten sequentiellen Dateien zu ermöglichen
DE69400207T2 (de) Sprachabhängiges textvergleichssystem
DE69729926T2 (de) Netzwerkbrowser
DE69407287T2 (de) Datenwiederauffindungssystem
DE10031716B4 (de) Abonnement und Benachrichtigung bei Datenbanktechnik
WO1998047081A1 (fr) Procede de selection de donnes numeriques
DE69718085T2 (de) Kompression von strukturierten Daten
DE3782349T3 (de) Netzwerk-Ereignisseidentifikationseinrichtungen.
EP0791884A2 (fr) Procédé pour l'ajustement informatisé de plusieurs copies de fichiers d'un fichier stocké, stockées dans au moins un ordinateur
DE69522426T2 (de) Wort-Wiederauffindungsapparat für ein Wörterbuch
DE69726795T2 (de) Kodierungsverfahren des Verzeichnisstatus in einem gemeinsamen, verteilten Speichersystem mit koherenten Pufferspeichern
DE69932147T2 (de) Kommunikationseinheit und Kommunikationsverfahren mit Profilverwaltung
EP1276056B1 (fr) Méthode pour administrer une base de données
DE69725899T2 (de) Verfahren zur Instandhaltung eines Netzwerks von teilweise replizierten Datenbanken
EP0174556B1 (fr) Procédé de recherche d'adresses mémoire et circuit comparateur d'adresses

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AL AU BA BB BG BR CA CN CU CZ EE GE GW HU ID IL IS JP KP KR LC LK LR LT LV MG MK MN MX NO NZ PL RO SG SI SK SL TR TT UA US UZ VN YU

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW SD SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: CA

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

Ref document number: 1998543398

Format of ref document f/p: F