EP1034490A1 - Data retrieval and storage using unique identifiers - Google Patents

Data retrieval and storage using unique identifiers

Info

Publication number
EP1034490A1
EP1034490A1 EP98955217A EP98955217A EP1034490A1 EP 1034490 A1 EP1034490 A1 EP 1034490A1 EP 98955217 A EP98955217 A EP 98955217A EP 98955217 A EP98955217 A EP 98955217A EP 1034490 A1 EP1034490 A1 EP 1034490A1
Authority
EP
European Patent Office
Prior art keywords
identification number
group
document
unique
directories
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP98955217A
Other languages
German (de)
English (en)
Other versions
EP1034490A4 (fr)
Inventor
Marius Van Tonder
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HANDELMAN Joseph H
Top Info Outsourcing Services Pty Ltd
Original Assignee
HANDELMAN Joseph H
Top Info Outsourcing Services Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HANDELMAN Joseph H, Top Info Outsourcing Services Pty Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices

Definitions

  • THIS INVENTION relates to data storage and retrieval. It relates in particular to a method of storing a plurality of documents in a database and to a method of retrieving data from a database. Further, it relates to an arrangement of data in a database.
  • a typical application of the storage of documents in the form of digital images is in the medical field e.g. the storage of a claim document submitted by a doctor to a medical aid fund.
  • a digital image of the original claim document is stored in a storage medium e.g. a CD ROM or the like.
  • Selected information is also manually read from the claim document and entered into an independent storage medium thereby creating an abridgement of the original document. Retrieval of the information from the independent storage medium is generally fairly rapid. However, in the event of full details on the claim being required, the digital image of the original document is usually required.
  • an arrangement of data in a database including a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in said group of file locations divided by NFG files.
  • the path to a selected document may be derived from the unique primary identification number.
  • the number of file locations may be a preselected number of file locations which corresponds to a number of documents which are capable of being stored in at least a particular section of the database.
  • the file locations may include digital images of the documents which are captured by means of a conventional scanner.
  • the file locations may be defined by a preselected number of base directories, each group of the file locations including NFG base directories and each base directory being designated by a primary identification number.
  • the number of base directories in each group of file locations may be less than about 1000, preferably less than about 250.
  • Each secondary identification number may be associated with a directory used in a conventional computer system, the directory being designated by the secondary identification number and each base directory being defined by a sub-directory of said directory.
  • the primary identification number is typically the document number and the documents are preferably sequentially numbered.
  • each unique secondary identification number may be associated with, typically being the name or label of, a directory of a conventional directory/sub-directory arrangement used in conventional computer systems and each unique primary identification number may be associated with, typically being the name or label of, a sub-directory of said directory.
  • each file location may be a sub-directory in which a digital image of the document is stored and which is labelled or named with the unique primary identification number associated with the document.
  • the unique primary identification number is typically the document number.
  • the database is typically arranged in a hierarchical or so-called "root" structure of directories, in which the file locations are each defined by a sub-directory at a base level L_B in the hierarchical structure.
  • Each group of directories at one level above the base level (level L_{B+1}) in the structure may include NFG_{L_{B+1}} sub-directories, each of which has a secondary identification number designated by the absolute value of the document number of the first document in the group divided by NFG_{L_{B+1}}.
  • Each level L_{B+n} may include a plurality of groups of directories, each group of directories including NFG_{L_{B+n}} directories at an immediately lower level.
  • Each group of directories at level L_{B+n} may include a unique secondary identification number which is defined by the absolute value of the unique secondary identification number of the first sub-directory in the group of directories at level L_{B+n-1} divided by NFG_{L_{B+n}}.
  • the number of groups of directories at level L_{B+n} is typically between about 2 and about 10 times the number of groups of directories at level L_{B+n-1}.
  • an even number of groups of directories NFG_{L_{B+n}} is provided at each level L_{B+n}.
  • a method of storing a plurality of documents in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of each document to be stored in the database; determining the secondary identification number by taking the absolute value of the primary identification number of the document to be stored and dividing it by NFG; and storing the document in a file location in the form of a directory which is identified from the unique primary and secondary identification numbers.
  • the database may be arranged in a hierarchical directory structure.
  • the method may include iteratively identifying an associated directory of a group of directories at one level higher (level L_{B+n+1}) by dividing the unique secondary identification number of the directory at level L_{B+n} by the number of directories NFG_{L_{B+n+1}} in the group.
  • the method may include scanning an original copy of the document to obtain a digital image thereof, and storing the digital image of the document in the file location.
  • a method of identifying a path to one of a plurality of file locations in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database; and dividing the unique primary identification number by the number of file locations NFG in the group and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document.
  • the database may be arranged in a hierarchical directory structure.
  • the method may include iteratively identifying an associated directory of a group of directories at one level higher (level L_{B+n+1}) by dividing the unique secondary identification number of the directory at level L_{B+n} by the number of directories NFG_{L_{B+n+1}} in the group.
  • a method of retrieving data from a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number; and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files, the method including identifying the unique primary identification number of the document to be retrieved from the database; dividing the unique primary identification number by the number of file locations NFG in the group, and taking the absolute value of the result to obtain the unique secondary identification number of the group in which the document lies thereby to identify the path to the selected document; and reading the data stored in the directory via the path.
  • the unique primary identification number may be associated with a name of a legal entity e.g. a natural person, a business, or the like.
  • the method may include searching for the name of the legal entity in a conventional manner and retrieving the unique primary identification number thereby to identify a name and path of the directory in which the document has been stored.
  • a data management installation which includes reading means for reading data from data storage means which includes a digital image of a plurality of documents arranged in an hierarchical structure in a database which includes a selected number of file locations, each file location including a document which includes a unique primary identification number and the file location being identified from the unique primary identification number, and a plurality of groups of the file locations, each group including NFG files in the group and including a unique secondary identification number which is defined by the absolute value of the first unique primary identification number of a file location in a particular group of file locations divided by NFG files; input means for receiving the primary identification number of a document to be retrieved from the storage means; and processing means arranged to identify a path in the hierarchical structure to the file location in which the document has been stored, the path being derived from a unique primary identification number of the document.
  • the installation may include interface means for interfacing the installation to a conventional data management installation which selectively accesses abridgements of documents e.g. abridgements of medical aid claims or the like which have been stored in the form of a digital image.
  • the installation may be arranged to receive a document number from the abridgement, the document number being translated into a unique primary identification number thereby to permit a digital image of the entire original document to be retrieved.
  • the storage means may be a plurality of CD ROMs which define the database, the reading means being a so-called "CD jukebox".
  • Figure 1 shows a schematic diagram of a data management installation in accordance with the invention;
  • Figure 2 shows a schematic diagram of an arrangement of data in a database, also in accordance with the invention, of the installation of Figure 1; and
  • Figure 3 shows a sub-section of a directory structure which is arranged in a similar fashion to that of Figure 2.
  • reference numeral 10 generally indicates a data management installation in accordance with the invention.
  • the installation 10 includes a data capturing sub-section 12, a data storage sub-section 14, and a data retrieval sub-section 16.
  • the installation 10 is configured or arranged to store a digital image of each of a substantial number of documents for subsequent retrieval from a database arrangement as described in more detail below.
  • the data capturing sub-section 12 includes a conventional digital scanner 18 which scans a substantial number of documents 20 and feeds a digital image of each document 20 into storage means 22 as shown by arrow 24. Once the documents 20 have been scanned, they are physically stored in a warehouse or discarded as indicated by arrow 26.
  • the documents 20 are numbered sequentially and each number defines a unique primary identification number which is associated with a particular document 20.
  • the installation 10 includes interface means, as indicated by arrow 28, for interfacing the installation 10 to a conventional computer system.
  • the documents 20 are typically claim forms received by a medical aid company from various medical practitioners.
  • the data management installation 10 stores each document 20 in a unique fashion in an arrangement of data in a database in the storage means 22.
  • the database is arranged in a hierarchical or so-called "root" structure 32 (see Figure 2).
  • the structure 32 of the arrangement includes 100 directories or file locations, only a few of which are referenced in the drawings by reference numeral 34.
  • the file locations or directories 34 are divided into a number of groups of directories 36.1, 36.2, 36.3 and so on, each comprising 10 directories 34, this being the number of files in the group (NFG).
  • Each directory 34 has as its title or name the unique primary identification number of a document 20 intended to be stored therein. Thus, each document 20 is stored in a specific location to facilitate subsequent retrieval thereof.
  • the directories 34 are located at a base level L_B in the hierarchical structure 32 as indicated by arrow 38.
  • the group of directories 36.1 is associated with a directory 40 at a level L_{B+1}, which is one level higher in the hierarchical structure 32.
  • further directories 42 to 58 are provided at level L_{B+1}, each of which is associated with 10 file locations (NFG) or directories 34 at level L_B, each file location or directory 34 bearing the name or label of the unique identification number of the document 20 to be stored therein.
  • the directories 40 to 58 at level L_{B+1} are grouped into two groups of directories 60, 62 at a level L_{B+2}, each group having 5 (NFG_{L_{B+2}}) sub-directories.
  • the groups of directories 60, 62 are grouped under, or branch out from, a further directory 64 which bears a label "1-100" and which is thus representative of the range of documents 20 having unique primary identification numbers between 1 and 100 which are associated with the directory.
  • the directory 64 has 2 (NFG_{L_{B+3}}) directories in its group.
  • the various names of the directories 64, 60, 62, 40 to 58, and 34 are in the form of reference numerals which are allocated in a specific fashion.
  • the name of the directory 40 defines a unique secondary identification number which is defined by the absolute value of the first unique primary identification number 34.1 in the group of directories 36.1 divided by the total number of directories or file locations NFG in the group of directories 36.1.
  • the name of the directory 40 is thus defined by the absolute value of 0 divided by 10, which is 0, as shown in Figure 2.
  • for the directory 42, the name or label is defined by the absolute value of the first unique primary identification number 34.2 in a second group of directories 36.2 divided by the number of files or directories NFG in the particular group, i.e. the absolute value of 10 divided by 10, which is equal to 1.
  • the unique secondary identification numbers which define the names of the directories 44 to 58 are determined in the same way.
  • the label or name of the group of directories 60 is defined by the absolute value of the unique secondary identification number "0", which is the name of the first group of directories 40 at an immediately lower level L_{B+n-1}, divided by the number of groups of directories at an immediately lower level, i.e. 5, thus providing a result of 0 as shown in Figure 2.
  • the name of the group of directories 62 is derived from the first unique secondary identification number, which is the name of the group of directories 50, i.e. 5, divided by 5 (NFG_{L_{B+2}}), which equals 1.
  • the hierarchical structure may comprise a plurality of different levels.
  • the number of different levels depends upon the number of documents which are to be stored in the hierarchy. Further, the fewer the number of levels, i.e. the flatter the hierarchical structure is, the simpler the path is to the particular directory in which the document is stored, and thus retrieval times may be reduced in comparison to a very pointed hierarchical structure in which a number of levels are included.
  • a plurality of hierarchical structures which are independent of each other, one of which is shown in Figure 3, may be used.
  • the number of file locations or directories 34 at the base level L_B in the hierarchical structure 38 is typically less than about 1000 and, more preferably, less than about 250.
  • the hierarchical structure 38 may thus include a plurality of levels extending above base level L_B, each level including a group of directories at a level L_{B+n} having NFG_{L_{B+n}} directories in the group.
  • each directory in a group of directories branches out or extends into NFG_{L_{B+n}} groups of directories at an immediately lower level L_{B+n-1}.
  • the name or secondary identification number of each group of sub-directories is then determined in a similar fashion as described above.
  • the number of groups of directories at level L_{B+n} is typically between about 2 and about 10 times the number of groups of directories at level L_{B+n-1}.
  • the number of levels L_{B+n} is dependent upon the number of documents at the base level L_B in the hierarchical structure.
  • a digital image of each document 20 is stored on a plurality of compact discs 70 as shown in Figure 1.
  • the compact discs 70 may form part of a library of information on various transactions or claims which have been submitted to the medical aid via the various doctors.
  • Certain of the compact discs may be loaded in a CD jukebox 72 to provide a near line facility and other compact discs may be loaded in a CD tower 74 to provide an on-line facility as shown by arrows 76, 78 respectively.
  • the database is stored on magnetic media 80.
  • the installation 10 includes computing means 82 (see Figure 1) which is arranged to generate a variety of user-friendly screens to assist in instructing the computing means 82 to perform various retrieval functions.
  • the computing means 82 is programmed so that an indexed field window 84 prompts a user to enter a client name 86 via a keyboard (not shown).
  • the computing means 82 retrieves the unique primary identification number 88 which is associated with the client name 86.
  • the unique primary identification number is then fed to a unique key of documents screen 90 which has a search prompt 92 which may be activated with a mouse to initiate retrieval of a selected document from the database.
  • the path to the particular file location or directory 34 in which the document has been stored is derived directly from the unique primary identification number which defines the name of the file location or directory 34 in which the document 20 has been stored.
  • the relevant directories in the groups of directories at the various levels L_{B+n} in the hierarchical structure 32 must be determined.
  • the name of the actual directory 34 in which the document 20 has been stored is determined as indicated above and the particular directory in the group of directories is then determined by taking the absolute value of the unique primary identification number or document number divided by the number of file locations or directories 34 (NFG) at base level L_B.
  • the path to the relevant document may be reconstructed and thus retrieval time may be reduced.
  • the unique primary identification number, i.e. "10", is divided by the number of directories NFG in the group at the base level L_B, i.e. "10", and the absolute value thereof is taken, i.e. directory 42 labelled "1" is identified at level L_{B+1}.
  • the absolute value is taken of the unique primary identification number or document number 15 divided by the number of files in the group of directories 36, i.e. the result is the absolute value of 1.5, which is 1 (the fractional part is discarded).
  • a particular sub-directory at level L_{B+n} is determined and thus the path to the document may be determined.
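
The path derivation and the storing and retrieving steps described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the applicant's implementation: the function names, the `image.bin` file name and the use of the label "1-100" as the top directory are hypothetical. The group sizes `[10, 5]` are taken from the Figure 2 example (NFG = 10 at the base level, 5 at level L_{B+2}), and integer division stands in for what the text calls the "absolute value" of the quotient, assuming non-negative document numbers.

```python
import tempfile
from pathlib import Path, PurePosixPath


def secondary_ids(primary_id: int, group_sizes: list[int]) -> list[int]:
    """Return the directory names from level L_{B+1} upward for one document."""
    if primary_id < 0:
        raise ValueError("document numbers are assumed to be non-negative")
    ids = []
    current = primary_id
    for size in group_sizes:
        # Integer division discards the fraction, e.g. 15 // 10 == 1.
        current = current // size
        ids.append(current)
    return ids


def derive_path(primary_id: int, group_sizes: list[int],
                root: str = "1-100") -> PurePosixPath:
    """Build root/<highest level>/.../<level L_{B+1}>/<primary id>."""
    upper = secondary_ids(primary_id, group_sizes)
    parts = [str(n) for n in reversed(upper)] + [str(primary_id)]
    return PurePosixPath(root, *parts)


def store_document(base: Path, primary_id: int, image: bytes,
                   group_sizes: list[int]) -> Path:
    """Write the scanned image into the directory named after the document."""
    directory = base.joinpath(*derive_path(primary_id, group_sizes).parts)
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / "image.bin"  # hypothetical file name
    target.write_bytes(image)
    return target


def retrieve_document(base: Path, primary_id: int,
                      group_sizes: list[int]) -> bytes:
    """Recompute the path from the document number alone and read the image."""
    directory = base.joinpath(*derive_path(primary_id, group_sizes).parts)
    return (directory / "image.bin").read_bytes()


if __name__ == "__main__":
    sizes = [10, 5]  # group sizes assumed from the Figure 2 example
    print(derive_path(15, sizes).as_posix())  # prints 1-100/0/1/15
    with tempfile.TemporaryDirectory() as tmp:
        store_document(Path(tmp), 15, b"scanned claim form", sizes)
        print(retrieve_document(Path(tmp), 15, sizes))
```

Because every directory name is computed from the document number, no index has to be searched to find the file location; only the list of group sizes needs to be known at retrieval time.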

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention relates to an arrangement of data in a database. The arrangement includes a selected number of file locations (34), each file location (34) including a document which has a unique primary identification number (34.1). The arrangement further includes a plurality of groups (36) of file locations, each group (36) including a specific number of files NFG and a unique secondary identification number (40) defined by the absolute value of the first unique primary identification number of a file location in said group divided by NFG files. The number of file locations is a preselected number which corresponds to a number of documents which can be stored in at least a particular section of the database.
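
As a rough illustration of the grouping summarised in the abstract, the sketch below partitions a set of file locations into groups of NFG and labels each group with its secondary identification number. The function name is hypothetical, the numbering 0 to 99 is an assumption based on the worked example in which the first group is named 0, and integer division is used for the "absolute value" of the quotient.

```python
def group_file_locations(primary_ids, nfg: int) -> dict[int, list[int]]:
    """Map each secondary identification number to the primary IDs in its group."""
    if nfg <= 0:
        raise ValueError("NFG must be a positive integer")
    groups: dict[int, list[int]] = {}
    for pid in sorted(primary_ids):
        # Every member of a group yields the same quotient as the group's
        # first member, so the quotient can serve as the group's name.
        groups.setdefault(pid // nfg, []).append(pid)
    return groups


if __name__ == "__main__":
    groups = group_file_locations(range(100), 10)  # 100 locations, NFG = 10
    print(len(groups))   # prints 10
    print(groups[1])     # prints [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
```

Each key equals the first primary identification number of its group divided by NFG, which is the definition of the secondary identification number given in the abstract.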
EP98955217A 1997-11-03 1998-10-30 Data retrieval and storage using unique identifiers Withdrawn EP1034490A4 (fr)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
ZA979873 1997-11-03
ZA9709873 1997-11-03
PCT/US1998/023125 WO1999023583A1 (fr) 1997-11-03 1998-10-30 Data retrieval and storage using unique identifiers

Publications (2)

Publication Number Publication Date
EP1034490A1 EP1034490A1 (fr) 2000-09-13
EP1034490A4 EP1034490A4 (fr) 2001-02-07

Family

ID=25586686

Family Applications (1)

Application Number Title Priority Date Filing Date
EP98955217A Withdrawn EP1034490A4 (fr) 1997-11-03 1998-10-30 Data retrieval and storage using unique identifiers

Country Status (6)

Country Link
EP (1) EP1034490A4 (fr)
AP (1) AP2000001798A0 (fr)
AU (1) AU1207499A (fr)
CA (1) CA2307226A1 (fr)
WO (1) WO1999023583A1 (fr)
ZA (1) ZA989947B (fr)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS56162162A (en) * 1980-05-16 1981-12-12 Toshiba Corp Data storing device having variable data structure

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5058162A (en) * 1990-08-09 1991-10-15 Hewlett-Packard Company Method of distributing computer data files
US5204958A (en) * 1991-06-27 1993-04-20 Digital Equipment Corporation System and method for efficiently indexing and storing a large database with high data insertion frequency

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Location of Serial-Numbered Files in a Hierarchical Store" IBM TECHNICAL DISCLOSURE BULLETIN, vol. 37, no. 5, 1 May 1994 (1994-05-01), pages 129-130, XP002154100 New York, US *
PATENT ABSTRACTS OF JAPAN vol. 006, no. 046 (P-107), 24 March 1982 (1982-03-24) & JP 56 162162 A (TOSHIBA CORP), 12 December 1981 (1981-12-12) *
See also references of WO9923583A1 *

Also Published As

Publication number Publication date
EP1034490A4 (fr) 2001-02-07
AP2000001798A0 (en) 2000-06-30
ZA989947B (en) 1999-05-13
WO1999023583A1 (fr) 1999-05-14
AU1207499A (en) 1999-05-24
CA2307226A1 (fr) 1999-05-14

Similar Documents

Publication Publication Date Title
US6477528B1 (en) File management system, electronic filing system, hierarchical structure display method of file, computer readable recording medium recording program in which function thereof is executable
US7272610B2 (en) Knowledge management system
US9087101B2 (en) Document management techniques to account for user-specific patterns in document metadata
US7765191B2 (en) Methods and apparatus for managing the replication of content
US7246170B2 (en) Scheme for systematically registering meta-data with respect to various types of data
US5740445A (en) Information processing apparatus for generating directory information to manage a file using directories
EP0846298B1 (fr) Electronic document and data storage and retrieval system
US7392235B2 (en) Methods and apparatus for retrieval of content units in a time-based directory structure
US6549913B1 (en) Method for compiling an image database, an image database system, and an image data storage medium
RU2378685C2 (ru) File management apparatus, method of controlling the apparatus, computer program and storage medium
US7401078B2 (en) Information processing apparatus, document search method, program, and storage medium
US10114821B2 (en) Method and system to access to electronic business documents
US20020049731A1 (en) Information processing method and apparatus
JP2001510673A (ja) Apparatus and method for optimizing retrieval and storage of keyframes and blobs
WO1997007468A9 (fr) Electronic document and data storage and retrieval system
US7333992B2 (en) System and method for identifying and storing changes made to a table
US20070124272A1 (en) System and Method for Collecting and Compiling Data in a Computer Network
US5884321A (en) Document image and query management system for application databases
Arms Getting the picture: Observations from the library of congress on providing online access to pictorial images
KR100296574B1 (ko) Method for creating an archive on a removable mass storage medium, and archive server
US20030101199A1 (en) Electronic document processing system
US20060235893A1 (en) Methods and apparatus for managing the storage of content
EP1116137B1 (fr) Database, and methods of data storage and retrieval
EP1034490A1 (fr) Data retrieval and storage using unique identifiers
JP2001075954A (ja) Electronic filing system and data registration method

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20000518

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

A4 Supplementary search report drawn up and despatched

Effective date: 20001227

AK Designated contracting states

Kind code of ref document: A4

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20010309