UST912008I4 - Method for reordering the records of a file - Google Patents

Method for reordering the records of a file

Info

Publication number
UST912008I4
Authority
US
United States
Prior art keywords
block
string
records
file
strings
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/22 Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc
    • G06F7/24 Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers sorting methods in general

Definitions

  • the method first entails the performing of an internal sort by replacement selection on the file to provide a plurality of strings of records, the records constituting the respective strings being in ordered sequences, the strings being consecutively numbered as they are created in the internal sort, viz., first to mth strings. If desired, the records in the strings can be blocked, i.e., each block containing 1, 2, . . . , k records as desired.
  • the records are then written onto a direct access storage device such as a disk or a drum in a manner whereby the first block of the first string is placed on the first sector of the first track of the storage, the second block of the first string is placed on the second sector of the first track, etc., the first block of the second string is placed on the second sector of the second track, the first block of the third string is placed on the third sector of the third track, et seq.
  • Bij represents the jth block of the ith string
  • the blocks are written onto the storage device as follows:

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

THERE IS DESCRIBED HEREIN A METHOD FOR THE REORDERING OF A FILE WHEREBY, AFTER SUCH REORDERING, THE RECORDS CONSTITUTING THE FILE ARE ARRANGED SUCH THAT THE ORDER OF THEIR KEY VALUES IS BIASED IN A SEQUENTIAL DIRECTION. THE FILE IS THUS ADVANTAGEOUSLY PRECONDITIONED FOR BEING SORTED, PARTICULARLY BY A DISTRIBUTION SORT TECHNIQUE. THE METHOD FIRST ENTAILS THE PERFORMING OF AN INTERNAL SORT BY REPLACEMENT SELECTION ON THE FILE TO PROVIDE A PLURALITY OF STRINGS OF RECORDS, THE RECORDS CONSTITUTING THE RESPECTIVE STRINGS BEING IN ORDERED SEQUENCES, THE STRINGS BEING CONSECUTIVELY NUMBERED AS THEY ARE CREATED IN THE INTERNAL SORT, VIZ., FIRST TO MTH STRINGS. IF DESIRED, THE RECORDS IN THE STRINGS CAN BE BLOCKED, I.E., EACH BLOCK CONTAINING 1, 2, . . . , K RECORDS AS DESIRED. THE RECORDS ARE THEN WRITTEN ONTO A "DIRECT ACCESS" STORAGE DEVICE SUCH AS A DISK OR A DRUM IN A MANNER WHEREBY THE FIRST BLOCK OF THE FIRST STRING IS PLACED ON THE FIRST SECTOR OF THE FIRST TRACK OF THE STORAGE, THE SECOND BLOCK OF THE FIRST STRING IS PLACED ON THE SECOND SECTOR OF THE FIRST TRACK, ETC., THE FIRST BLOCK OF THE SECOND STRING IS PLACED ON THE SECOND SECTOR OF THE SECOND TRACK, THE FIRST BLOCK OF THE THIRD STRING IS PLACED ON THE THIRD SECTOR OF THE THIRD TRACK, ET SEQ. THUS, IF BIJ REPRESENTS THE JTH BLOCK OF THE ITH STRING, THEN THE BLOCKS ARE WRITTEN ONTO THE STORAGE DEVICE AS FOLLOWS:

TRACK     SECTOR 1   SECTOR 2   SECTOR 3   . . .   SECTOR N
1         B11        B12        B13        . . .   B1N
2                    B21        B22        . . .   B2,N-1
3                               B31        . . .   B3,N-2
. . .
M                                          BM1 . . . BM,N-M+1

WITH THE RECORDS SO ARRANGED ON THE DIRECT ACCESS STORAGE DEVICE, THEY ARE READ DIAGONALLY THEREFROM, I.E., THE FIRST BLOCK OF THE FIRST STRING FOLLOWED BY THE FIRST BLOCK OF THE SECOND STRING ET SEQ., THEN THE SECOND BLOCK OF THE FIRST STRING FOLLOWED BY THE SECOND BLOCK OF THE SECOND STRING, ET SEQ., TO PRODUCE A FILE HAVING THE ORDER B11, B21, . . . , BM1, B12, B22, . . . , BM2, . . . , BM3, . . . , BMN. WITH THE PLACING OF THE BLOCKS ON THE "DIRECT ACCESS" STORAGE DEVICE IN THIS MANNER, THE ARRANGEMENT IS EFFECTED WHEREBY THE BEGINNING OF THE NEXT BLOCK TO BE READ/WRITTEN IS ROTATIONALLY ADJACENT TO THE END OF THE LAST BLOCK WHICH WAS READ/WRITTEN WHEREBY LATENCY AND SEEK TIME ARE SIGNIFICANTLY MINIMIZED. THE REORDERING OF THE FILE PRODUCED FROM THE READING OUT OF THE BLOCKS IN THE DIAGONAL MANNER AS DESCRIBED HEREINABOVE CAN NOW BE ADVANTAGEOUSLY UTILIZED IN A DISTRIBUTION SORT, SUCH AS OF THE TAG OR RECORD TYPES.
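The staggered placement and diagonal readout described in the abstract can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `diagonal_read`, the 0-based indexing, the rule that block j of string i sits on track i at sector i + j, and the sample keys are all hypothetical and are not taken from the publication.

```python
from typing import List, Tuple

Block = Tuple[int, ...]  # a block is a small tuple of record keys


def diagonal_read(strings: List[List[Block]]) -> List[Tuple[int, int, Block]]:
    """Return (track, sector, block) triples in diagonal read order.

    Assumed placement (0-based): block j of string i sits on track i,
    sector i + j, mirroring the staggered table in the abstract.
    """
    longest = max((len(s) for s in strings), default=0)
    order = []
    for j in range(longest):                  # j-th block of every string
        for i, string in enumerate(strings):  # strings in creation order
            if j < len(string):
                order.append((i, i + j, string[j]))
    return order


# Three sorted strings, each blocked two records at a time (made-up keys).
strings = [
    [(2, 7), (15, 22), (30, 41)],
    [(1, 9), (18, 27)],
    [(4, 5)],
]
reordered = [block for _, _, block in diagonal_read(strings)]
first_pass_sectors = [sector for _, sector, _ in diagonal_read(strings)[:3]]
```

In this sketch the first blocks of the three strings are read from sectors 0, 1 and 2 of successive tracks, which is the sense in which each block begins rotationally next to the end of the previous one, and the keys in `reordered` drift upward as the read proceeds.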

Description

DEFENSIVE PUBLICATION UNITED STATES PATENT OFFICE Published at the request of the applicant or owner in accordance with the Notice of Dec. 16, 1969, 869 O.G. 687. The abstracts of Defensive Publication applications are identified by distinctly numbered series and are arranged chronologically. The heading of each abstract indicates the number of pages of specification, including claims and sheets of drawings contained in the application as originally filed. The files of these applications are available to the public for inspection and reproduction may be purchased for 30 cents a sheet.
Defensive Publication applications have not been examined as to the merits of alleged invention. The Patent Office makes no assertion as to the novelty of the disclosed subject matter.
PUBLISHED JULY 10, 1973 T912,008 METHOD FOR REORDERING THE RECORDS OF A FILE Brian T. Bennett, Mohegan Lake, and Archie C. McKellar, Mount Kisco, N.Y., assignors to International Business Machines Corporation, Armonk, N.Y.
Filed Dec. 26, 1972, Ser. No. 318,335. Int. Cl. G06f 7/24. U.S. Cl. 444-1. 6 Sheets Drawing. 40 Pages Specification.
There is described herein a method for the reordering of a file whereby, after such reordering, the records constituting the file are arranged such that the order of their key values is biased in a sequential direction. The file is thus advantageously preconditioned for being sorted, particularly by a distribution sort technique.
The method first entails the performing of an internal sort by replacement selection on the file to provide a plurality of strings of records, the records constituting the respective strings being in ordered sequences, the strings being consecutively numbered as they are created in the internal sort, viz., first to mth strings. If desired, the records in the strings can be blocked, i.e., each block containing 1, 2, . . . , k records as desired.
The records are then written onto a direct access storage device such as a disk or a drum in a manner whereby the first block of the first string is placed on the first sector of the first track of the storage, the second block of the first string is placed on the second sector of the first track, etc., the first block of the second string is placed on the second sector of the second track, the first block of the third string is placed on the third sector of the third track, et seq. Thus, if Bij represents the jth block of the ith string, then the blocks are written onto the storage device as follows:
Track     Sector 1   Sector 2   Sector 3   . . .   Sector n
1         B11        B12        B13        . . .   B1n
2                    B21        B22        . . .   B2,n-1
3                               B31        . . .   B3,n-2
. . .
m                                          Bm1 . . . Bm,n-m+1

With the records so arranged on the direct access storage device, they are read diagonally therefrom, i.e., the first block of the first string followed by the first block of the second string et seq., then the second block of the first string followed by the second block of the second string, et seq., to produce a file having the order B11, B21, . . . , Bm1, B12, B22, . . . , Bm2, . . . , Bm3, . . . , Bmn.
With the placing of the blocks on the direct access storage device in this manner, the arrangement is effected whereby the beginning of the next block to be read/written is rotationally adjacent to the end of the last block which was read/written whereby latency and seek time are significantly minimized. The reordering of the file produced from the reading out of the blocks in the diagonal manner as described hereinabove can now be advantageously utilized in a distribution sort, such as of the tag or record types.
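The first step of the method, the internal sort by replacement selection that produces the numbered strings, can be sketched in Python as follows. This is a textbook heap-based form offered as an assumption, not the procedure of the specification: the function name `replacement_selection`, the `capacity` parameter (the number of records held in main store) and the integer keys are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, List


def replacement_selection(items: Iterable[int], capacity: int) -> Iterator[List[int]]:
    """Yield sorted strings (runs) using a main store of `capacity` keys."""
    source = iter(items)
    # Heap entries are (string_number, key). A key deferred to the next
    # string carries a higher string number, which plays the role of a "mark".
    heap = [(0, key) for _, key in zip(range(capacity), source)]
    heapq.heapify(heap)
    current, run = 0, []
    while heap:
        string_no, key = heapq.heappop(heap)
        if string_no != current:      # only marked keys remain: close the string
            yield run
            current, run = string_no, []
        run.append(key)               # output the smallest unmarked key
        nxt = next(source, None)
        if nxt is not None:
            # A key smaller than the one just written cannot extend the
            # current string, so it is held back for the next one.
            target = current + 1 if nxt < key else current
            heapq.heappush(heap, (target, nxt))
    if run:
        yield run


runs = list(replacement_selection([5, 1, 9, 3, 8, 2, 7, 4], capacity=3))
```

With a main store of three keys, the eight sample keys come out as two strings, the first of which is longer than the store itself; strings that outgrow main store are the usual reason for choosing replacement selection as the run-forming step.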
[Drawings: 6 sheets. B. T. Bennett et al., T912,008, Method for Reordering the Records of a File, filed Dec. 26, 1972, published July 10, 1973. The figures are not reproduced; the legible labels are listed below.]
Sheet 1 (FIGS. 1 to 4): input file; strings; cylinder number; track number.
Sheet 2 (FIGS. 5 to 7): key value plotted for the strings; key value plotted for the reordered file; block layout B11, B12, . . . , B21, B22, . . .
Sheet 3 (FIGS. 9 and 10): prior-art flowchart of replacement selection. Load main store area with G items from the input sequence; no items are marked. Are all items marked? If yes, end the string, unmark the marked items, and start the next string. If no, test for end of input sequence. If the input is not exhausted, compare the item from the input sequence with the smallest unmarked item in main store. Is the input item larger? If no, mark the input item. Append the smallest unmarked item in main store to the current output string and replace it in main store by the input item. At the end of the input sequence, append the unmarked items in main store to the current string in order, then sort the marked items in main store and output them as the final string. End.
Sheet 4 (FIG. 11 and a prior-art flowchart): cylinder number; track number; address. Prior-art merge: input the first item from each string to be merged. Select the smallest item, output it, and replace it if possible with the next occurring item on the same string. Any items left to be merged? If yes, repeat; if no, end.
Sheet 5: records' keys; addresses; string 1; string 2; bucket 1; bucket 2.
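The prior-art merge shown among the drawings (input the first item from each string, repeatedly output the smallest item and replace it with the next item from the same string) can be sketched in Python with the standard `heapq` module. The function name `merge_strings` and the sample strings are hypothetical and serve only as an illustration.

```python
import heapq
from typing import List


def merge_strings(strings: List[List[int]]) -> List[int]:
    """Merge already sorted strings into one sorted list."""
    # Input the first item from each string, tagged with its string index
    # and its position within that string.
    heap = [(s[0], i, 0) for i, s in enumerate(strings) if s]
    heapq.heapify(heap)
    merged = []
    while heap:                                # any items left to be merged?
        key, i, pos = heapq.heappop(heap)      # select the smallest item
        merged.append(key)                     # output it
        if pos + 1 < len(strings[i]):          # replace it, if possible, with
            nxt = strings[i][pos + 1]          # the next item on the same string
            heapq.heappush(heap, (nxt, i, pos + 1))
    return merged


merged = merge_strings([[1, 3, 5, 8, 9], [2, 4, 7]])
```

The sketch is included for contrast: the publication preconditions the file for a distribution sort instead of merging the strings.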
US912008D 1972-12-26 1972-12-26 Method for reordering the records of a file Pending UST912008I4 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US31833572A 1972-12-26 1972-12-26

Publications (1)

Publication Number Publication Date
UST912008I4 (en) 1973-07-10

Family

ID=23237745

Family Applications (1)

Application Number Title Priority Date Filing Date
US912008D Pending UST912008I4 (en) 1972-12-26 1972-12-26 Method for reordering the records of a file

Country Status (1)

Country Link
US (1) UST912008I4 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4536857A (en) 1982-03-15 1985-08-20 U.S. Philips Corporation Device for the serial merging of two ordered lists in order to form a single ordered list
US5349684A (en) * 1989-06-30 1994-09-20 Digital Equipment Corporation Sort and merge system using tags associated with the current records being sorted to lookahead and determine the next record to be sorted
