US5142687A - Sort accelerator with rebound sorter repeatedly merging sorted strings - Google Patents


Info

Publication number
US5142687A
Authority
US
United States
Prior art keywords
records
record
input
strings
sorted
Prior art date
Legal status
Expired - Lifetime
Application number
US07/374,433
Other languages
English (en)
Inventor
Richard F. Lary
Current Assignee
Hewlett Packard Development Co LP
Original Assignee
Digital Equipment Corp
Priority date
Filing date
Publication date
Application filed by Digital Equipment Corp filed Critical Digital Equipment Corp
Assigned to DIGITAL EQUIPMENT CORPORATION, 146 MAIN STREET, MAYNARD, MA A CORP. OF MA reassignment DIGITAL EQUIPMENT CORPORATION, 146 MAIN STREET, MAYNARD, MA A CORP. OF MA ASSIGNMENT OF ASSIGNORS INTEREST. Assignors: LARY, RICHARD F.
Priority to US07/374,433 priority Critical patent/US5142687A/en
Priority to CA002017900A priority patent/CA2017900A1/en
Priority to DE69032828T priority patent/DE69032828T2/de
Priority to AT90305934T priority patent/ATE174696T1/de
Priority to EP90305934A priority patent/EP0405759B1/de
Priority to JP2172514A priority patent/JPH0776910B2/ja
Publication of US5142687A publication Critical patent/US5142687A/en
Application granted
Assigned to COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P. reassignment COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: COMPAQ COMPUTER CORPORATION, DIGITAL EQUIPMENT CORPORATION
Assigned to HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. reassignment HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P. CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: COMPAQ INFORMATION TECHNOLOGIES GROUP, LP

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 7/00: Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F 7/22: Arrangements for sorting or merging computer data on continuous record carriers, e.g. tape, drum, disc
    • G06F 7/36: Combined merging and sorting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38: Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F 9/3877: Concurrent instruction execution, e.g. pipeline or look ahead using a slave processor, e.g. coprocessor
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S 707/00: Data processing: database and file management or data structures
    • Y10S 707/99931: Database or file accessing
    • Y10S 707/99937: Sorting

Definitions

  • This invention relates to a sort accelerator and more particularly to a sort accelerator in which a rebound sorter is used as a merger and with which large numbers of records can be rapidly and efficiently sorted.
  • the accelerator of the invention operates independently of a host computer and is highly reliable, while using a minimum number of components and being otherwise economically manufacturable.
  • a rebound sorter developed by IBM includes a set of processing elements which can be visualized as forming a U-shaped pipeline with data flowing down along a left side and up along a right side.
  • the processing elements operate concurrently to speed up the sorting process.
  • Each element effects a comparison of data from two records and, depending upon the results of the comparison, effects either vertical movements down along the left side and up along the right side or horizontal exchanges such that "heavier” records are moved to the left and downwardly while “lighter” records move to the right and upwardly to eventually exit the upper end of the right side in sorted order.
  • Rebound sorters of this type have advantages in speeding up sorting operations, but the cost of the required hardware has precluded extensive use of such sorters, particularly for sorting large numbers of records.
  • Sorters in which processing operations are performed concurrently by a number of elements are advantageous in speeding up sorting operations, but are limited in that they only allow sorting of a limited number of records.
  • The IBM sorter, for example, allows sorting of a group containing at most one record more than the total number of sorting elements. Thus fifteen processing elements sort a group which contains a maximum of sixteen records. For large sorts, the amount and cost of hardware in such a system would be prohibitive.
  • a rebound sorter in which a limited number of processing elements are operated concurrently to sort a relatively small number of records into a group.
  • the same rebound sorter is then also used to perform merges of sorted groups into a larger group containing all of the records of the smaller groups in sorted order.
  • the same hardware that sorts sixteen data items, for example, is also used to perform a sixteen-way merge. As the number of records to be sorted grows, additional processing elements are not required to accommodate a large sort.
  • a sort accelerator which incorporates the invention may typically be connected to a host computer and a host memory through an address and data bus.
  • the sort accelerator operates independently of the host processor to sort the records at a high speed and to store the sorted records in the host memory.
  • the independent operation of the sort accelerator enables the host computer to freely perform other operations as the sorting and merging operations take place.
  • the sort accelerator includes an input section, an output section, a sort control section, and a sorting section.
  • the sort control section operates to control the input section to effect sequential feeding of groups of records to the sorting section and control the output section to effect storage of a plurality of sorted groups, either in a local working memory or in a working portion of the memory of a host computer.
  • the input, sorting and output sections are controlled from the sort control section to receive groups of records in unsorted order and to then effect sorting and temporary storage of such groups of records in sorted order. It then operates to merge such groups of sorted records into one larger sorted group or string.
  • the sorting section of the sort accelerator can in one time period sort a maximum of a certain number of records in either descending or ascending order, the certain number being 16 in an illustrated embodiment. Once sorted groups of 16 have been created, those groups can be merged through a series of operations into one large sorted string. Once 16 large sorted strings have been created, they are then merged into a longer string. This process is repeated, producing strings 16 times larger on each pass until all records have been sorted. With enhancements, this method sorts in a time proportional to N log_w N, where "N" is the number of records and "w" is one more than the number of processing elements.
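The 16-fold growth of string length per pass can be sketched numerically. The following minimal Python model (the function name and grouping logic are illustrative assumptions, not from the patent) counts the w-way merge passes required after the initial sort, which is why total work is on the order of N log_w N:

```python
import math

def merge_passes(n_records, w=16):
    """Count the w-way merge passes needed after records have been
    sorted into initial groups of w records each."""
    strings = math.ceil(n_records / w)    # sorted strings after the initial sort
    passes = 0
    while strings > 1:
        strings = math.ceil(strings / w)  # each pass merges w strings into one
        passes += 1
    return passes
```

For example, 256 records need one merge pass after the initial sort and 4096 need two, matching the 16, 256, 4096 string sizes described above.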
  • the sorting section or rebound sorter need contain only 15 processing elements.
  • a processing element compares key bytes of a record to determine order.
  • Processing elements are connected together via two record storage elements.
  • the arrangement of processing elements and record storage elements can be viewed as a vertical column where new input records enter the top left-hand side of the column and sorted records exit from the top right-hand side of the column.
  • Records are sorted into groups of 16 and stored in a storage section or workspace memory. Once a sufficient number of groups have been sorted, the merge operation begins. To merge, each record is tagged to indicate which sorted group it belongs to. For ascending sorts, the smallest record of each group is placed in the rebound sorter until the smallest record is pushed out of the rebound sorter. This record is the smallest record among all of the smallest records of each individual sorted group. The next record fed into the rebound sorter is chosen sequentially from the group which previously contained the smallest of all records. This new record must be either equal to or greater than the first smallest record, since it comes from a sorted group. This procedure continues so that a new record is chosen from the group from which the last output record came. Once all the strings are exhausted, the merge is complete, yielding one large group of sorted records.
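The merge discipline above can be modelled in software. In this behavioral sketch a simple min() scan stands in for the rebound-sorter hardware; the point is how each output record's tag determines the group from which the next input record is drawn:

```python
def tagged_merge(groups):
    """Software model of the merge step: each record carries a tag naming
    its sorted group; after a record exits, the next record fed in comes
    from that same group (min() scan stands in for the hardware)."""
    iters = [iter(g) for g in groups]
    candidates = {}  # tag -> current leading record from that group
    # prime the sorter with the smallest (first) record of every group
    for tag, it in enumerate(iters):
        rec = next(it, None)
        if rec is not None:
            candidates[tag] = rec
    out = []
    while candidates:
        # smallest record among the group leaders exits the sorter
        tag = min(candidates, key=candidates.get)
        out.append(candidates.pop(tag))
        # refill from the group that produced the last output record
        rec = next(iters[tag], None)
        if rec is not None:
            candidates[tag] = rec
    return out
```

tagged_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]) yields the single merged string [1, 2, 3, 4, 5, 6, 7, 8, 9].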
  • Two methods of detecting the integrity of data as it passes through the system are used.
  • byte parity is used to detect errors in data transmission
  • Eight bits of data are represented by a 9 bit value.
  • the extra bit carries a parity value calculated at the data source.
  • Record integrity is checked by re-calculating the parity value from the transmitted data and by comparing the newly calculated value with the transmitted parity bit. An inequality indicates a transmission error.
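A minimal sketch of the byte-parity scheme, assuming even parity (the passage does not state the polarity):

```python
def add_parity(byte):
    """Extend an 8-bit value to 9 bits by appending a parity bit
    calculated at the data source (even parity assumed)."""
    parity = bin(byte).count("1") & 1
    return (byte << 1) | parity

def check_parity(nine_bits):
    """Re-calculate the parity from the transmitted data bits and
    compare it with the transmitted parity bit; False indicates a
    transmission error."""
    data, parity = nine_bits >> 1, nine_bits & 1
    return (bin(data).count("1") & 1) == parity
```

Any single-bit error in the 9-bit value flips the comparison and is detected.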
  • the second method used to insure data integrity involves using a two level checksum scheme.
  • the checksum is calculated over multiple bytes as each record enters the rebound sorter.
  • the checksum is later re-calculated as records exit the enhanced rebound sorter and compared to the previously calculated checksum. This comparison checks for two types of errors which can occur in the rebound sorter. The first type of error results from a single bit failure in the data path (including storage element failure) and the second results from the improperly controlled swapping of bytes between records (processing element control failure).
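The two-level checksum itself is not specified in this passage; as a stand-in sketch, a modular byte sum taken as a record enters and re-taken as it exits illustrates the comparison (a mismatch flags a data-path or byte-swap failure):

```python
def record_checksum(record_bytes):
    """Illustrative checksum: modular sum of a record's bytes.
    (A simple stand-in for the patent's two-level scheme.)"""
    return sum(record_bytes) & 0xFF

def passed_through_ok(entry_record, exit_record):
    """Compare the checksum taken as the record entered the sorter
    with the one re-calculated as it exits."""
    return record_checksum(entry_record) == record_checksum(exit_record)
```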
  • the sort order checker examines records as they exit the enhanced rebound sorter for proper sorting.
  • a special storage element acting as a shift register holds one record and in cooperation with comparison logic determines whether the sorted records have been properly ordered.
  • the output of the comparison logic indicates whether the rebound sorter has improperly sorted the records.
  • the sort order checker also provides a "tie" bit indicating that keys of adjacent records are equal.
  • User software can utilize the "tie” bit to perform sorts on key fields which exceed the capacity of the sort accelerator or to assist the user in performing post-sort elimination of duplicate records.
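A software model of the sort order checker, including the "tie" flag for equal adjacent keys (function and parameter names are illustrative):

```python
def check_order(records, ascending=True):
    """Model of the sort order checker: verify each exiting record
    against the previous one and emit a "tie" flag whenever adjacent
    keys are equal."""
    ok, ties = True, []
    for prev, cur in zip(records, records[1:]):
        if (cur < prev) if ascending else (cur > prev):
            ok = False            # sorter output is out of order
        ties.append(cur == prev)  # "tie" bit for equal adjacent keys
    return ok, ties
```

The ties list is what user software would consult for duplicate elimination or multi-pass sorts on oversized keys.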
  • a sort accelerator constructed in accordance with the invention further includes in a preferred embodiment, features which achieve fast and stable sorting of large numbers of records while minimizing hardware requirements.
  • One particular feature is the pipeline control for performing multiple sort/merge operations in a sort accelerator.
  • the pipeline control allows groups of records to be sorted without mixing records from different groups and without having to flush the rebound sorter with dummy records before a new group of records can be sorted by the rebound sorter.
  • the pipeline control allows the sort accelerator to begin pushing a new group of records into the rebound sorter immediately after the last record of the previous group has been examined by the first processing element.
  • Pipeline control is accomplished by a series of shift registers in parallel with the rebound sorter. Whenever a new group of records enters the rebound sorter, a boundary value is set in the first pipeline shift register indicating to the processing elements that a new set of records has entered the rebound sorter. The boundary value is shifted through the series of shift registers along with the first record as it passes through the processing element. The value indicates to the processing elements that records following the boundary value must not be compared with records from other groups. In this way, the rebound sorter can remain fully loaded with records, allowing the overlapping of loading and unloading of records from any number of different groups of records.
  • a stable sort is one where records with equal keys exit the sort accelerator in the same relative order as they entered. Stable sorting is required by applications where the order of records with equal keys has already been determined. Additional hardware has been added and algorithms have been modified to incorporate stable sorting in the rebound sorter.
  • Stable sorting is implemented by setting bits which identify a record by the group from which it came. This group designation forces records from different groups having equal keys to exit the rebound sorter in the order in which they entered. An "order" bit causes records from the same group with equal keys to exit the rebound sorter in the order in which they entered.
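The effect of the group designation and "order" bit can be sketched by extending each key, as below. The entry_sequence value stands in for the order bit (an assumption for illustration; the hardware uses a single bit, not a full sequence number):

```python
def stable_merge_key(record, group_number, entry_sequence):
    """Illustrative composite key: appending the group number and an
    entry-order value to the key forces records with equal keys to
    leave the sorter in arrival order."""
    return (record, group_number, entry_sequence)

# "b" appears twice, once in group 0 and once in group 1
recs = ["b", "a", "b"]
keyed = [stable_merge_key(r, g, s)
         for s, (r, g) in enumerate(zip(recs, [0, 0, 1]))]
ordered = sorted(keyed)  # equal keys break ties on (group, entry order)
```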
  • Still another feature of the sort accelerator relates to a merge lookahead/memory management arrangement through which the required amount of workspace memory need only be approximately 10% greater than the amount of memory required to store the data. Not only does this feature provide for efficient use of memory, but it also provides linear expansion of workspace memory, thereby simplifying host system management of the workspace. Optimal merging reduces the number of times data is processed, and consequently reduces the time it takes to perform the merge/sort algorithm.
  • Allocation of workspace storage is managed by the sort control section.
  • the sort control section determines whether enough storage has been allocated for the sorted data.
  • an extended merge/sort algorithm controls the sequencing of sorts and merges through the use of an input and an output phase.
  • the input phase, consisting of sorts and merges, continues until all of the unsorted records that make up the input string have been processed, either by sorting a group of records or by merging groups of records. Once all the input records have been either sorted or merged, the output phase begins.
  • the output phase consists of a series of merges followed by a final merge to complete the sorting of the records.
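The two phases can be sketched as follows. Here sorted() stands in for the 16-record rebound sorter and a flatten-and-sort models each 16-way merge; both are behavioral stand-ins, not the patent's implementation:

```python
def extended_sort(records, w=16):
    """Sketch of the input/output phases: the input phase sorts
    w-record groups, merging whenever w strings accumulate; the output
    phase merges remaining strings down to w or fewer, then performs
    the final merge (written to host memory in the real device)."""
    # --- input phase: sort groups of w, merging when w strings exist
    strings = []
    for i in range(0, len(records), w):
        strings.append(sorted(records[i:i + w]))
        if len(strings) == w:
            strings = [sorted(sum(strings, []))]  # modelled w-way merge
    # --- output phase: merge until at most w strings remain
    while len(strings) > w:
        merged, strings = sorted(sum(strings[:w], [])), strings[w:]
        strings.append(merged)
    return sorted(sum(strings, []))  # final merge
```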
  • the sort accelerator utilizes a string number or tag extraction lookahead feature.
  • This feature of the present invention utilizes the information produced by the sort control section which indicates which input string a given record is from. This information is necessary to allow the sort control section to deliver the next record to the rebound sorter. As previously described, a proper merge requires that the next input record must come from the same group as the last output record. If the string information is delivered to the sort control section too late, the rebound sorter will stall while waiting for the next input record. In most cases, the tag lookahead provides the tag information to the sort control section before the next sorted record exits the rebound sorter. This allows the next input record to be waiting at the input to the rebound sorter so that no delay occurs in the merge operation.
  • Tag lookahead logic determines the group of the smallest record (for ascending sorts) as early as practical before it is actually output from the sorter. This allows the sort accelerator to begin accessing the next input record before the next record has been output. Using this technique, the inter-record delay of merges can be totally eliminated except for very short records or in cases where the sort decision is made very late in the record.
  • One additional feature of the sort accelerator is the use of circulating RAM indexing to implement a variable length shift register used as the record storage element.
  • a RAM is used to hold record data as it passes between processing elements.
  • the length of a one bit variable length shift register is programmed to allow for varying record sizes. As the bit is shifted through the variable length shift register, it enables each row of the RAM in sequential order, and the RAM either reads or writes data. Consequently, this feature implements a shift register in a way that allows the shift register to vary in length while using a minimal amount of logic. This method also avoids the use of shift registers for the data which are slower and require more power to operate.
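A minimal model of the circulating-index technique: an index cycles through the first `length` rows of a RAM, and on each clock the addressed row is read (yielding the delayed byte) and then overwritten with the incoming byte, giving a shift register whose length is programmable without any shift-register logic. Class and method names are illustrative:

```python
class CirculatingRamShiftRegister:
    """Variable-length shift register built from a RAM plus a
    circulating row index, as described above."""
    def __init__(self, ram_rows, length):
        assert 1 <= length <= ram_rows
        self.ram = [0] * ram_rows
        self.length = length   # programmed delay in bytes
        self.index = 0         # the circulating row enable
    def shift(self, byte_in):
        byte_out = self.ram[self.index]  # read the enabled row...
        self.ram[self.index] = byte_in   # ...then write the new byte
        self.index = (self.index + 1) % self.length
        return byte_out
```

With length 3, each input byte re-emerges exactly three shifts later, and re-programming length changes the delay without touching the data path.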
  • FIG. 1 is a schematic block diagram of a sort accelerator constructed in accordance with the invention, shown connected to a host computer and host memory;
  • FIG. 2 shows the format of a record which is developed from an input record, for transmission to an enhanced rebound sorter of the accelerator
  • FIG. 3 is a schematic diagram showing a rebound sorter of the accelerator of the invention in a simplified and generalized form and portions of associated storage and pipeline circuits;
  • FIGS. 4A and 4B each provide an illustrative simplified example of the flow of data through a sorter such as depicted in FIG. 3;
  • FIG. 5 is a schematic diagram providing a more detailed showing of the construction of the enhanced rebound sorter and an associated storage circuit used in the sort accelerator of FIG. 1;
  • FIG. 6 shows logic circuitry of a control circuit of a processing element of the rebound sorter shown in FIG. 5;
  • FIG. 7 shows one form of RAM index circuit usable with the storage circuitry shown in FIGS. 1 and 5;
  • FIG. 8 shows another form of RAM index circuit usable with the storage circuitry shown in FIGS. 1 and 5;
  • FIG. 9 shows a portion of a record control circuit of the accelerator of FIG. 1, operable for supplying record timing signals;
  • FIG. 10 shows another portion of the record control circuit of the accelerator of FIG. 1, operable for supplying tag timing signals;
  • FIG. 11 shows details of a pipeline control circuit shown generally in FIG. 1 and FIG. 5;
  • FIG. 12 shows tag selection circuitry of a string selection logic circuit of a sort sequencer of the accelerator of FIG. 1;
  • FIGS. 13 through 17 in combination show various portions of tag lookahead logic circuitry of the accelerator of FIG. 1;
  • FIG. 18 provides an illustrative example of an organization of intermediate storage which is used by the sort accelerator of FIG. 1;
  • FIGS. 19 to 22 together illustrate string number assignments used for obtaining optimal merging operations;
  • FIG. 23 shows checksum logic circuitry of an interface of the enhanced rebound sorter of FIG. 1;
  • FIGS. 24 through 27 in combination show portions of logic circuitry of a sort order checker of the interface of the enhanced rebound sorter of FIG. 1;
  • FIG. 28 shows an additional portion of logic circuitry of the sort order checker, used for generating a signal indicating that two consecutive records have the same keys.
  • Reference numeral 10 generally designates a sort accelerator which is constructed in accordance with the principles of the invention.
  • the sort accelerator 10 is shown connected to a host computer 11, a host memory 12, and a workspace memory 16, through an address and data bus 13 and is arranged to receive command data from the host computer 11 to sort records which are stored in the host memory 12 and to store the sorted records in the memory 12.
  • the command data may include, for example, the start address in memory 12 of one group of records to be sorted or the start addresses of records to be merged, the length of the records in each group, the length of a key field in each record, the desired mode of sorting, the number of records in the group, and the start address in memory 12 in which sorted records are to be stored.
  • Upon receipt of such command data, the sort accelerator 10 operates independently of the host processor to sort or merge the designated records at high speed and to store the sorted records in the host memory 12. The host computer is free to perform other operations after applying a command signal to the sort accelerator 10.
  • the illustrated sort accelerator 10 includes a local address and data bus 14 which is connected through a system interface 15 to the host bus 13.
  • a local memory 16 which provides workspace is shown connected to the local bus 14.
  • the local memory 16 is optional, but working memory is required by the illustrated system; in place of, or in addition to, the illustrated local memory 16, a portion of the host memory 12 may be used to provide working memory.
  • the local bus 14 is connected to a rebound sorter 18 through an interface 20 which connects to input section 21, an output section 22 and a sort sequencer 34.
  • input section 21 includes a parity checker 23, a buffer 24, a LW (long word) to byte unpacker 25, a tag inserter 26 and a check parity/checksum calculator 27.
  • the output section 22 is connected to the sorter 18 through an order checker 29 and includes a checksum calculator and parity generator 30, a byte to long word packer 31, long word buffers 32 and a parity check/generate circuit 33.
  • Sort sequencer 34 receives command data sent from the host processor 11 and monitors and controls input, sorting, storing, merging and output operations as hereinafter described.
  • Sort sequencer 34 includes a register array 35 and an ALU 36 which is connected to an instruction register 37 to operate from a microprogram stored in a ROM 38.
  • the ALU 36 operates to effect supervisory control of input, sorting, merging and output operations.
  • Sort sequencer 34 also includes a sort sequence, tag lookahead and miscellaneous control circuitry 39 which operates to select addresses sent to the input section 21 for rapid fetching of records from memory during sorting and merging operations and to perform other functions.
  • the enhanced rebound sorter 18 is designed to effect simultaneous processing operations on a group of 16 records through examination of the keys of such records. It includes fifteen sections 40-54 in which such processing operations are performed, the number of sections being one less than the number of records of the group being processed. Each of the sections 40-54 includes a processing element and associated record storage elements.
  • Features of the rebound sorter 18 relate to the implementation of record storage elements through the use of a storage circuit 56 which includes a RAM and read and write circuits and which operates under control of a RAM index circuit 58.
  • a record control circuit 62 applies timing signals to the processing elements of the sections 40-54 and to the pipeline control circuit 60.
  • Additional features relate to the use of the enhanced rebound sorter 18 for both sorting and merging operations in a manner such as to achieve fast sorting of large numbers of records while minimizing hardware requirements and achieving other advantages.
  • a specific feature relates to identifying the individual records during processing operations in a manner such as to facilitate stable and reliable sorting of records and merging of sorted records as hereinafter described.
  • command data is initially sent through the interface section to the sort sequencer 34.
  • the sort sequencer 34 develops and stores corresponding control data in the register array 35, including the starting address of a first record of each group of records to be input, the length of the records of the group, the length of a key field of each record, the number of records and the starting address and other data with respect to where sorted records are to be stored.
  • an operation is then initiated in which the input section 21 operates using the address circuit 23 to fetch record data from the host memory 12, followed by another operation by the input section 21 which then uses the byte inserter 25 to develop modified records.
  • a continuous stream of modified records may be sent to the rebound sorter 18, each modified record consisting of a certain even number of serial bytes.
  • each modified record may have a form as shown in FIG. 2, including one or more leading key bytes K which enter the sorter first and which are followed by a tag byte T (added by the tag inserter 26) and one or more data bytes D.
  • a trailing pad byte P is added by the byte inserter 25 to each record if it is necessary to obtain an even number of bytes in the record.
  • each record includes an "order" bit which is used to insure a highly stable sorting operation as also discussed hereinafter, the "order" bit being preferably included in the tag byte.
  • a sorting operation is performed in which a group of 16 records are shifted into the illustrated rebound sorter 18. Immediately upon shifting of the last record of a group into the rebound sorter 18, the records of the group of 16 start to exit the rebound sorter serially in sorted order. A sorted string of 16 records is thus formed which is sent by the output section 22 into a section of the workspace memory 16.
  • pipeline control circuit 60 to enhance the speed of operation. As each sorted group of 16 records starts to exit the rebound sorter 18, another sorting operation is initiated in which another group of 16 records is in effect simultaneously shifted into the rebound sorter 18.
  • the pipeline control circuit 60 is designed to permit this operation without mixing the records of the two groups.
  • a third group of 16 records is shifted into the rebound sorter 18, followed by a fourth and subsequent groups of 16 records until 16 strings of 16 sorted records each are stored in the workspace memory 16.
  • an "up-merge" operation is performed, as hereinafter discussed in detail, in which records from the 16 sorted strings of 16 records each are fetched in a certain order and shifted into the rebound sorter 18 to produce a single string of 256 sorted records which is stored in the workspace memory.
  • a merge can be started immediately following the last record of a sort or another merge.
  • records may be sorted into a number of strings all containing 4096 records, except the final string which may contain a lesser number depending upon the length of the input string. The number of such strings which may be produced is limited only by the available memory.
  • an output phase is initiated in which merges are performed as may be necessary to reduce the number of strings to 16 or less, followed by a final merge in which the sorted records are written directly to the host memory 12 starting at the addresses designated in the initial command from the host computer 11.
  • Still other features relate to management of the storage in working or intermediate memory of strings to be merged and to the manner of supplying stored strings from intermediate memory to the rebound sorter 18 for merging.
  • the sort accelerator 10 also includes features which relate to checking operations, including parity and checksum checks which are so performed as to allow optimum sorting and merging operations while detecting processing errors and insuring integrity of the sorted records which are output from the sort accelerator 10.
  • FIGS. 4A and 4B provide an illustrative simplified example of the flow of data through the sorter. A more detailed showing of the construction of the rebound sorter 18 is provided by FIG. 5, individual circuits of the sorter 18 being shown in other figures of the drawings.
  • the sorter 18 operates to arrange records in an order according to the numeric values assigned to keys of the records which may typically be bytes providing ASCII representations of names or numbers.
  • the sorter may operate to sort the records in either ascending or descending order, as desired.
  • the terms "larger” and “smaller” are used when referring to comparisons of record keys. It will be understood that the actual key values associated with “larger” and “smaller” keys will depend on whether the sort being performed is ascending or descending and the use herein of the terms “larger” and “smaller” should not be construed as a limitation on the invention.
  • the sorter 18 is an N-1 element rebound sorter which is capable of effectively comparing N record keys.
  • N is 16 in the illustrated embodiment, in which the N-1 processing elements are indicated as PE0 through PEN-2 and are included in the sections 40-54 shown in FIG. 1.
  • Key comparisons are performed one byte at a time, with all of the N-1 processing elements (PEs) operating in parallel. Comparison of key bytes continues until bytes from the keys being compared are unequal. At that point, a decision is made as to which key is the larger, based on the unsigned binary value of the bytes. For ascending sorts, the larger key is the one with the larger unsigned binary byte value. For descending sorts, the larger key is the one with the smaller unsigned binary byte value.
  • the N-1 element rebound sorter can be viewed as a black box with the following characteristics. After being loaded with N-1 records, whenever a new record is put into the box, the smallest of the new record and N-1 records already in the box emerges from the box.
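That black-box behavior can be modelled directly. Here a heap stands in for the N-1 pipelined elements; this is a behavioral sketch only, and the class name is an assumption:

```python
import heapq

class ReboundSorterModel:
    """Black-box model of an (N-1)-element rebound sorter: once N-1
    records are loaded, each push of a new record pops the smallest of
    the N records then present."""
    def __init__(self, n=16):
        self.capacity = n - 1
        self.heap = []
    def push(self, record):
        if len(self.heap) < self.capacity:
            heapq.heappush(self.heap, record)
            return None                        # still filling the box
        return heapq.heappushpop(self.heap, record)
    def drain(self):
        """Emit the remaining records in sorted order."""
        while self.heap:
            yield heapq.heappop(self.heap)
```

Pushing 5, 1, 4, 2 into a 3-element model fills the box on the first three pushes, then the fourth push pops 1, the smallest of the four records present.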
  • a rebound sorter consists of the aforementioned N-1 processing elements (PEs) connected via inter-element record storage sections, each designated as an IERS, two IERS's being associated with each PE.
  • Each PE accepts 2 input bytes and produces 2 output bytes every time period.
  • Each IERS accepts 1 byte and produces 1 byte every time period.
  • Each IERS holds half of a record and operates as a shift register.
  • a byte input to the IERS is produced at the output R/2 time periods later, where R is the number of bytes in a record.
  • the arrangement of processing elements and inter-element record storage can be viewed as a vertical column where new records enter and exit the column at the top left and right of FIG. 3.
  • the IERS elements are connected to the PEs in the following fashion:
    • The output of IERSm is the input to PEm+1 going down, except for IERSN-2, which is the input to IERS2N-3.
  • the input to the rebound sorter is the input to PE0.
  • Each PE begins processing when the first byte of a record coming down and a record coming up the column are presented to it. If the record coming down is larger than the record coming up, it continues down the column while the record coming up continues up the column. If the record coming down is smaller than the record coming up, it is sent back up the column while the record coming up is sent back down the column. Since each IERS contains only half a record, the even numbered PEs are half a record out of phase with the odd numbered PEs.
  • the pipeline control of the accelerator of the invention controls the sequencing of the PEs so they know when new records are presented, when the end of the key is reached, and whether or not to compare them at all ("boundary" condition).
  • In FIGS. 4A and 4B, a simplified three element rebound sorter is shown sorting a set of four 2-byte records, it being noted that the illustrated accelerator 10 requires a minimum record length of 4 bytes, 2-byte records being used in this example for simplicity.
  • In the simplified rebound sorter as depicted, every 2 digits constitute a record:
  • the key is the entire record.
  • PE states are indicated by "DS”, “PV”, “PH”, “FV”, and "xx" symbols shown within the PE's:
  • the initial state of the PE's is "FV" to load the data into the IERS. After the 4th clock, the lowest PE element enters the "DS" (deciding) state, and the first possible comparison between two records may occur.
  • processing elements PE0, PE1 and PEN-2 referred to in the foregoing discussion are contained in the sections 40, 41 and 54.
  • the processing element of the section 40 includes two multiplexers 65 and 66, a control circuit 67 and a comparator 68.
  • a "UL" input line 69 at the upper left is connected to one input of the multiplexer 65 and one input of the multiplexer 66.
  • a second "LR" input line 70 is connected to a second input of the multiplexer 65 and a second input of the multiplexer 66.
  • the outputs of the multiplexers 65 and 66 are respectively connected to a lower left or "LL" output line 71 and an upper right or "UR" output line 72.
  • the multiplexers 65 and 66 are controlled by a control circuit 67 which is controlled from the output of the comparator 68, the inputs of comparator 68 being connected to the "UL" line 69 and the "LR" line 70.
  • the multiplexers 65 and 66 are so controlled that the bytes pass from the "UL" line 69 downwardly to the "LL” line 71 while bytes pass upwardly from the "LR” line 70 to the "UR” line 72.
  • the bytes input on the "UL” line 69 are transmitted through multiplexer 66 to the "UR” line 72 while the bytes input on the "LR” line 70 are transmitted to the left through multiplexer 65 to the "LL” line 71.
  • the lines 69-72 provide plural signal paths so as to transmit all bits of a byte in parallel, the bytes being transmitted one by one in byte-serial fashion.
  • Each processing element receives the following information during each time period:
  • Each processing element maintains the following internal state information:
  • Based on the information received and the current internal state information, each processing element produces the following information during each time period:
  • If the processing element is not being forced to pass vertically, then the following applies. If a decision has already been made, bytes continue to be passed vertically or horizontally based on the decision made previously. Otherwise, the current input bytes are compared. If they are unequal, a decision is made at this time. If the byte coming up is larger than the byte coming down, then the byte coming up is passed down while the byte coming down is passed up (pass horizontal). Otherwise, the byte coming up is passed up while the byte coming down is passed down (pass vertical). When a decision is made, the processing element remembers whether it passed vertically or horizontally, in order to pass the remaining bytes of each record in the same direction.
  • the state of the ascending/descending input is used to determine the sense of larger and smaller in the comparison.
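The decision rule of the preceding two paragraphs, including the ascending/descending sense of "larger," can be sketched as a byte-serial comparison. The function name and return values below are illustrative assumptions, not part of the disclosure.

```python
def pe_compare(rec_down, rec_up, ascending=True):
    """Byte-serial PE decision: compare records byte by byte until a
    pair of bytes is unequal. Returns 'horizontal' when the record
    coming down is the smaller (in the sort sense) and must be sent
    back up, 'vertical' otherwise, per the rule in the text."""
    for bd, bu in zip(rec_down, rec_up):
        if bd != bu:
            # For ascending sorts the smaller key has the smaller
            # unsigned byte value; for descending sorts the sense flips.
            smaller_down = (bd < bu) if ascending else (bd > bu)
            return 'horizontal' if smaller_down else 'vertical'
    return 'vertical'   # equal keys: keep passing vertically
```

For example, comparing 0x0105 (down) with 0x0109 (up) in an ascending sort decides on the second byte and passes horizontally, sending the smaller record back up.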
  • Each inter-element record storage section or IERS of FIG. 3 is a variable length shift register containing a half record of bytes. For a record length of R, a byte input to an IERS at time period T will be produced at its output at time period T+R/2. This means that the shift register must be of length R/2. As aforementioned, the user record has at least one byte added (the tag byte) and if the resultant length is odd, a pad byte is also added.
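The R/2 delay can be modeled as a simple shift register. This sketch assumes a zero-filled register at startup; the names are illustrative.

```python
from collections import deque

def iers_delay(byte_stream, r):
    """IERS as an R/2-stage shift register: a byte input at time T
    appears at the output at time T + R/2 (initial contents shift out
    first, modeled here as zeros)."""
    reg = deque([0] * (r // 2))
    out = []
    for b in byte_stream:
        reg.append(b)          # byte enters one end
        out.append(reg.popleft())  # oldest byte exits the other end
    return out
```

With a 4-byte record (R/2 = 2 stages), the first input byte reaches the output on the third time period.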
  • each shift register is implemented by using RAM sections of the storage circuit 56 in combination with latches which provide a stable interface to comparator logic.
  • the first shift register IERS0 of FIG. 3 is implemented by a RAM section 74 and a read-write circuit 76 which has a write input connected to the "LL" output line 71 and which has a read output connected through an output latch circuit 78 to the "UL" input line 69' of the next section 41.
  • the output latch circuit 78 forms an extra row of the shift register and insures stable conditions for comparison of key bytes. Since the latch for the extra row and at least 1 RAM row must be in the data path, the minimum size of the shift register is 2 rows. This sets the minimum record length for the rebound sorter at 4 bytes.
  • the RAM has a number of rows equal to half the maximum record size minus 1, and a number of columns equal to 2 times the number of processing elements times 8. In the current implementation, the number of processing elements is 15, making each row 2 ⁇ 15 ⁇ 8, or 240 bits wide. Since the maximum record size is 40, there are (40/2)-1, or 19 rows. This makes a total RAM size of 4560 bits.
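The sizing arithmetic above can be checked directly; the function below simply restates the stated dimensions (15 processing elements, 40-byte maximum record) and is not part of the disclosure.

```python
def sorter_ram_bits(num_pes=15, max_record=40):
    """RAM for the IERS shift registers, per the dimensions in the text:
    rows = max_record/2 - 1 (one further row lives in the output
    latches), width = 2 bytes per PE x num_pes x 8 bits."""
    rows = max_record // 2 - 1   # (40/2)-1 = 19 rows
    width = 2 * num_pes * 8      # 2 x 15 x 8 = 240 bits per row
    return rows * width          # 4560 bits total
```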
  • the shift register IERSN-1 of FIG. 3 is implemented by a RAM section 80 and a read-write circuit 82 which has a write input connected to the "UR" output line of the section 41 and which has a read output connected through an output latch circuit 84 connected to the line "LR" line 70 of the processing element of section 40.
  • the output latch circuit 84 like the latch circuit 78, forms an extra row of the shift register.
  • the processing elements and shift registers of the sections 41 and 54 illustrated in FIG. 5 are substantially identical to those of the section 40, corresponding components being indicated by primed and double-primed numbers.
  • the section 41 is an exception, in that it is shown as including a tag lookahead circuit 100, operative to avoid processing interruptions in a manner as hereinafter described in connection with FIGS. 13-17.
  • the row enables of the RAM sections 74 and 80 and those for the other sections 41-54 of the sorter 18 are driven from a RAM index circuit, two circuits 58 and 58' of different forms being shown in FIGS. 7 and 8, as described hereinafter.
  • the sections 41-54 of the sorter 18 are connected to the pipeline control circuit 60 which includes elements arranged to store boundary indications, such indications being shifted from one element to another in synchronism with the shifting of records in the sorter 18. Certain of the elements apply "FORCEV" signals to the control circuits of the processing elements to cause the processing elements to shift only in the vertical direction during certain conditions and to avoid mixing of records of one group with records of another group.
  • the pipeline elements which apply the "FORCEV" signals to the sections 40, 41 and 54 are indicated by reference numerals 91, 92 and 93 in FIG. 5, such elements being also operative for shifting of boundary condition signals. Additional pipeline elements are used solely for shifting of boundary condition signals, the pipeline elements for the sections 54, 41 and 40 being respectively indicated by reference numerals 94, 95 and 96. The operation of the pipeline elements is described in detail hereinafter in connection with FIG. 11.
  • FIG. 6 shows logic circuitry of the control circuit 67 of the section 40, similar logic circuitry being included in the control circuits 67' and 67" and the control circuits of the other sections of the sorter 18.
  • the aforementioned "FORCEV" signal is applied through a line 101 to one input of an OR gate 102 which develops an output on a "NEW PASS V" line 103.
  • a second input of the OR gate 102 is connected to the output of a multiplexer 104 which is controlled by a "DECIDING" signal on a line 105 connected to the output of a latch 106.
  • the latch 106 is controlled through an OR gate 108 either from an "E/OREC" signal on a line 109 or from a signal applied from the output of an AND gate 110 which has three inputs.
  • One input of "AND” gate 110 is connected to the "DECIDING" line 105.
  • the third input of the AND gate is connected to a "-E/OTAG" line 112.
  • the multiplexer 104 has one input connected to a "PASS V" line 113 which is connected to the output of a latch 114 to which the "NEW PASS V" signal on line 103 is applied.
  • One input of the multiplexer 118 is connected through a line 119 to a "UL>LR" signal line from the comparator 68.
  • a second input is connected to a "UL ⁇ LR" signal applied on line 120 from the comparator circuit 68.
  • the multiplexer 118 is controlled from the output of an OR gate 122 which has input lines 123 and 124 to which "ASCENDING" and "E/OTAG" signals are applied.
  • the row enables of the RAM sections 74 and 80 and those for the other sections 41-54 of the sorter 18 are driven from a RAM index circuit, two circuits 58 and 58' of different forms being shown in FIGS. 7 and 8.
  • the RAM index circuit 58 of FIG. 7 comprises a single decoder 131 and a counter 132.
  • the counter 132 is loaded through a count input line 133 with a count of U/2 and counts down to 1, after which the operation is repeated.
  • the output of the counter 132 is fed through the decoder 131 and through a multiconductor output line 134 to the RAM sections 74 and 84 for section 40 and to the corresponding RAM sections of the other sections 41-54.
  • the outputs of the decoder 131 then drive the RAM row enables, first for the read and then the write before advancing to the next row.
  • the initial value at startup, and after the counter reaches 1, is U/2; it comes from all but the low order bit of the user specified record size (i.e., the size divided by 2). It should be remembered that the user record has at least one byte added (the tag byte) and, if the resultant length is odd, a pad byte is also added. This means that if the user record size is U, then U/2
  • the RAM index circuit 58' of FIG. 8 reduces the minimum cycle time of the addressing logic at the expense of additional logic.
  • the circuit 58' drives the RAM row enables of the storage circuit 56 from a 1 bit variable length shift register 136 which has a number of stages corresponding to the maximum length of a half record with outputs connected through a multi-conductor line 134' to the row enables of the storage circuit 56.
  • the stages of shift register 136 are configured in a ring.
  • the length of the shift register is configured to be half the record length minus 1 through the application of initial control data applied through a multi-conductor line 133' to a decoder 138, such control data being applied after the record length has been determined from control data supplied from the host processor.
  • a single bit circulates around the shift register ring providing row enable for first the read and then the write before advancing to the next row.
  • the input of the decoder 138 is all but the low order bit of the user record length, which is half the record length input to the sorter 18 minus one.
  • the outputs of the decoder are connected to the shift register, such that the exit end of the shift register is connected to the decoder output that is asserted when a 0 is input to the decoder.
  • the next bit of the shift register (the one that shifts into bit 0) is connected to the decoder output that is asserted when a 1 is input to the decoder, and so on.
  • decoder 138 loads a single bit into the shift register that will shift out of the shift register after N shifts, where N is half the record length minus one. This is done at initialization and whenever the bit exits the shift register, thus forming the ring. Between shifts, the enabled row is first read and then written.
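The ring behavior of circuit 58' can be modeled as a one-hot row enable that advances after each read/write pair; the names below are illustrative.

```python
def row_enable_sequence(num_rows, num_shifts):
    """One-hot ring model: a single bit provides the row enable, first
    for a read and then a write, before advancing to the next row; it
    wraps to row 0 after num_rows positions (half the record length
    minus one, per the text)."""
    pos, ops = 0, []
    for _ in range(num_shifts):
        ops.append((pos, "read"))    # enabled row is first read
        ops.append((pos, "write"))   # ... and then written
        pos = (pos + 1) % num_rows   # bit advances, forming the ring
    return ops
```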
  • FIGS. 9 and 10 show portions of the record control circuit 62.
  • a portion 62A shown in FIG. 9 is used primarily for record timing while a portion 62B shown in FIG. 10 is used primarily for tag timing. These portions produce timing information for the processing elements and the pipeline control.
  • the record control circuits of FIGS. 9 and 10 receive the following information during initialization:
  • the record control circuits of FIGS. 9 and 10 include components and maintain internal state conditions and information, as follows:
  • a record byte counter, "RCTR" 143 in FIG. 9, which is initialized to U/2 and decremented every time period by a load/decrement signal derived from the counter value (load at initialization or when RCTR = 1, decrement otherwise). If decremented past zero, counter 143 is reset to U/2.
  • the counter 146 is decremented every time period by the signal on line 144. If counter 146 is 0 when decremented, it is set to U/2 through a signal applied from line 141 through multiplexer 148.
  • the value "RCNT" indicates which half record is in progress in each PE. It is initialized to false, indicating that the first half of records are entering the even PEs and the second half of records are entering the odd PEs. It toggles whenever the record byte counter is decremented past zero: if it is false it becomes true, and if it is true it becomes false.
  • a boolean value "TCNT" indicating whether the odd or even PE's will see a tag byte next, produced on a line 156 in FIG. 10 at the output of a latch circuit 157 which is connected to the output of an exclusive OR circuit 158 having one input connected to a "TCTR 0" line at the output of counter 146 and having a second input connected to the line 156.
  • Based on the internal state information, the record control produces the following outputs:
  • a boolean value "EREC” indicating that the last byte of a record is being presented to the even processing elements, produced on a line 160 which is connected to a latch circuit 161 at the output of an AND circuit 162 having inputs connected to the "RCNT” line 150 and to the "RCTR 0"60 output of the "RCTR” counter 143.
  • a boolean value "OREC” indicating that the last byte of a record is being presented to the odd processing elements, produced on a line 164 which is connected to a latch circuit 165 at the output of an AND circuit 166 having inputs connected to the "RCTR 0" output of counter 143 and to a "-RCNT” line 167 which is connected through an inverter 168 to the "RCNT" line 150.
  • a boolean value "ETAG” indicating that the first byte following the record key, i.e. the "tag” byte, is being presented to the even processing elements, produced on a line 174 which is connected to a latch circuit 175 at the output of an AND circuit 176 having inputs connected to the "TCNT" line 156 and to the "TCTR 0" output of the "TCTR” counter 146.
  • a boolean value "OTAG” indicating that the first byte following the record key is being presented to the odd processing elements, produced on a line 178 which is connected to a latch circuit 179 at the output of an ASND circuit 180 having inputs connected to the "TCTR 0" output of counter 146 and to a "-TCNT” line 181 which is connected through an inverter 182 to the "TCNT" line 150.
  • Very important features relate to the pipeline control circuit 60 which allows the rebound sorter to be loaded and emptied, and prevents separate groups of records from becoming mixed.
  • the pipeline element 91 which is associated with the processing element of the first section 40 of the sorter 18, includes an input line 186 to which a signal is applied in synchronized relation to the input of a new record to the sorter.
  • the input line 186 is connected to one input of a multiplexer 188 having an output connected to a latch 189, the output of the latch being connected to a line 190 which is connected to a second input of the multiplexer 188.
  • a "FORCEVO" signal is developed on a line 192 at the output of a latch 193 which is connected to the output of a multiplexer 194, one input of the multiplexer 194 being connected to the line 192 and a second input thereof being connected to the output of an OR gate 196.
  • Inputs of the OR gate 196 are connected to the lines 186 and 190 and to a "FORCEV1" line 192' of the next-following stage.
  • the multiplexers 188 and 194 are controlled by the "PCADV" signal applied through line 170 from the record control circuit 62A of FIG. 9.
  • the circuits of all other pipeline elements associated with the sorter processing elements are substantially identical to that of the element 91, except that line 186 comes from "BNDm-1" and, in element 93, there is no "FORCEVm+1" input.
  • the circuits of the pipeline elements 94 and 96 are also shown in FIG. 11.
  • the element 94 includes a multiplexer 198 and latch 199 which correspond to the multiplexer 188 and latch 189 of the element 91.
  • the element 96 includes a multiplexer 198' and a latch 199', a final output signal "NEWSTREAMOUT" being developed on a line 200.
  • the pipeline control circuit 60 receives the following information during each time period:
  • the pipeline control circuit 60 maintains the following internal state information:
  • Based on the information received and the current internal state, the pipeline control produces the following information during each time period:
  • the pipeline control updates its internal state information every time the first byte of a record is presented to either the even or odd processing elements ("PCADV",every half record).
  • the "boundary" value enters the rebound sorter at the top, moves down the left side, across the bottom, and up the right side, in parallel with the records.
  • This operation causes all PEs at or above a boundary to "force vertically”. It prevents mixing of records on different sides of a boundary, but allows records on the same side of a boundary to be sorted.
  • the sorter 18 with no internal boundaries can be thought of as a "magic sorting box", such that when a record is pushed into the box, the smallest of all the records in the box (including the record pushed in) pops out.
  • the sorting and merging operations of the accelerator 10 will be best understood by recognizing and analyzing the problem of having N sorted strings of data to be merged into one large sorted string, using an N-1 element rebound sorter such as the sorter 18.
  • the output of the sorter 18 when operated under this algorithm will be the merge of the N input strings; this can be proven by noting that:
  • Each record output from the sorter 18 is the smallest of all records currently in the sorter.
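The merge discipline, feeding the next record from the string named by the tag of the record that just emerged, can be sketched with a heap standing in for the sorter. The names and the fallback to the lowest-numbered non-exhausted string follow the string selection logic described in the text; the code is a behavioral model only.

```python
import heapq

def merge_strings(strings):
    """Merge N pre-sorted strings through a model of an (N-1)-element
    rebound sorter: after priming with N-1 records, each output
    record's tag selects the string that supplies the next input."""
    n = len(strings)
    nxt = [0] * n                      # next-record index per string

    def take(tag):
        # Take from string `tag` if possible, else fall back to the
        # lowest-numbered string that is not exhausted.
        for s in [tag] + list(range(n)):
            if nxt[s] < len(strings[s]):
                rec = strings[s][nxt[s]]
                nxt[s] += 1
                return (rec, s)        # record paired with its tag
        return None                    # all strings exhausted

    sorter, out = [], []
    for tag in range(n - 1):           # prime the sorter with N-1 records
        item = take(tag)
        if item is not None:
            heapq.heappush(sorter, item)
    tag = n - 1                        # first new record from string N-1
    while True:
        item = take(tag)
        if item is not None:
            heapq.heappush(sorter, item)
        if not sorter:
            break
        rec, tag = heapq.heappop(sorter)  # smallest emerges; its tag
        out.append(rec)                   # chooses the next input string
    return out
```

The output is the sorted merge of the inputs because the sorter always holds the minimal unconsumed record of each non-exhausted string.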
  • the illustrated sort accelerator keeps track of which input string a given record in the sorter came from by using the byte inserter 25 of the input section 21 to insert a tag byte into each record as it is being fed into the sorter.
  • the tag byte is preferably inserted in a position immediately following the last byte of the key field and preceding the first byte of the data field.
  • the tag byte of a record contains an index that describes which input string of this merge operation contained the record.
  • An 8-bit tag byte would allow up to 256 strings to be input to a merge; actually, one bit of this byte is used to help implement and insure a stable sorting operation as hereinafter described, so that a maximum of 128 strings may be merged.
  • the illustrated embodiment is limited to merging 16 strings at a time by the number of processing elements in the sorter (15), requiring only 4 bits for the string identification.
  • the data packer 31 of the output section 22 removes the tags previously inserted by the input section 21 from the records as they are output from the sorter, so that the tags do not occupy any space in storage.
  • the user may optionally specify that tags should be left in the records when the records are output to the user's buffer by the final merge pass.
  • the tag byte is examined to determine the input string from which the record came.
  • This string number is fed into the string selection logic circuit 39 of the sort sequencer 34, and the sort sequencer 34 is controlled by microcode in ROM 38 to use the output of the string selection logic circuit 39 to determine the address of the next record to input to the sorter; that record address, along with the string number produced by the string selection logic circuit 39, is given to the input section 21 to start reading the next record.
  • the string selected to provide the next input record is the same as the string whose index was in the tag of the current output record. The exception is when the prospective string is exhausted.
  • the string selection logic circuit 39 operates using a file of valid bits containing one bit for each input string and using circuitry to perform functions as now described. A portion of such circuitry is shown in FIG. 12 which is described hereinafter in connection with the operation of tag lookahead logic shown in FIGS. 13-17.
  • the file of valid bits is set to all zeros at the beginning of a merge operation to show that none of the input strings has been exhausted.
  • the tag from the output record indexes into the file of valid bits to see if the designated string is exhausted (i.e. the bit is a one); if not, that string number is output to the sort sequencer.
  • a priority encode is performed on the file of valid bits to find the lowest-numbered string that is not exhausted, and the number of this string is sent to the sort sequencer. If all input strings are exhausted a flag is set which informs the sort sequencer of this fact.
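The valid-bit lookup and priority encode can be sketched as follows; the names are illustrative, and a true bit marks an exhausted string, per the text.

```python
def select_next_string(tag, exhausted_bits):
    """Model of the string selection logic: use the output record's tag
    unless that string is exhausted; otherwise priority-encode to the
    lowest-numbered string still valid. Returns None when every input
    string is exhausted (the flag to the sort sequencer)."""
    if not exhausted_bits[tag]:
        return tag                 # designated string still has records
    for s, done in enumerate(exhausted_bits):
        if not done:
            return s               # lowest-numbered non-exhausted string
    return None                    # all strings exhausted: set the flag
```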
  • the sort sequencer 34 contains two files of registers in the register array 35 to maintain the current state of a merge operation; one file contains a next record address for each string, and the other file contains the final record address in each string.
  • Upon obtaining a string index from the string selection logic, the sort sequencer fetches the next record address for that string from the first file and sends it to the input section 21; it also updates the next record address for that string.
  • the sort sequencer sets the valid bit corresponding to that string in the string selection logic's bit file to indicate that the string is now exhausted.
  • the string selection logic circuit 39 is also used when the number of strings to be merged is less than 16.
  • the algorithm presented above only works when the number of strings is one greater than the number of elements in the sorter.
  • a number of "null" strings (containing no records) must be added to the merge to bring the number of strings to the correct value.
  • the string selection logic circuit 39 is used to insure that an attempt to initially load the sorter with a record from one of the "null" strings (as described in the first step of the algorithm described above) will actually cause a record from one of the "real” strings to be loaded. This is done by initializing the valid bit of the "null" strings to a one instead of a zero in the string selection logic circuit 39 at the beginning of a merge operation.
  • the sort accelerator implements tag lookahead logic in order to determine the tag of the smallest record before it is actually output from the sorter.
  • the tag lookahead logic operates in conjunction with string selector logic which is shown in FIG. 12 and which is operative to select from among three possible values for the tag of the next record to be applied to the sorter 18. These three values are a "PE1TAG" developed on a line 202 by logic circuitry of FIG. 13 in a manner as hereinafter described, the tag "ITAG" of an input record, applied on a line 203, and the tag of the lowest numbered stream with tags remaining, developed on a line 204 by a priority encoder 206 to which status data are applied through a line 207 (described hereinabove).
  • a multiplexer 208 is operated by a "PV0" control signal applied on a line 209 to develop a "WINTAG" signal on a line 210 from the "ITAG" and "PE1TAG" signals. If "PV0" is true, "PE1TAG" is selected; otherwise, "ITAG" is selected. Then the status of the "WINTAG" stream on line 210 is examined by a stream status lookup section 212 which develops an output signal on a line 213, applied to a multiplexer 214 to which the "WINTAG" and "ALTTAG" signals are applied. If the "WINTAG" stream is not empty, then "WINTAG" is used as "NEXTTAG", developed on an output line 216 of the multiplexer 214. Otherwise, the lowest numbered stream that is not empty ("ALTTAG") is used as "NEXTTAG" on line 216.
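The selection among the PE1 tag, the input tag, and the alternate tag reduces to two multiplexer stages, which can be sketched as follows (illustrative names, behavioral only).

```python
def next_tag(pv0, pe1tag, itag, stream_empty, alttag):
    """Model of the FIG. 12 selection: WINTAG is PE1TAG when PE0 passed
    vertically (the input record goes down, so the PE1 record will
    exit), else ITAG; if the WINTAG stream is empty, ALTTAG (the lowest
    numbered non-empty stream) is used as NEXTTAG instead."""
    wintag = pe1tag if pv0 else itag       # first multiplexer stage
    if stream_empty[wintag]:               # stream status lookup
        return alttag                      # second multiplexer stage
    return wintag
```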
  • the tag lookahead logic circuit 100 includes circuitry shown in FIG. 13 which is used in developing the "PE1TAG" signal on line 202 and also a backup tag "BTAG" on a line 218; logic circuitry shown in FIG. 14 which is used in asserting a "DECISION" signal on a line 220 when PE0 has decided on the current record; logic circuitry shown in FIG. 15 for developing a "VALTAG" signal on a line 222 to validate a record exiting PE0; logic circuitry shown in FIG. 16 for developing an "ADVTAG" signal on a line 224 to advance a tag from the "BTAG" line 218 of the circuit of FIG. 13 to the "PE1TAG" line 202; and circuitry shown in FIG. 17 which develops a "TAGSEEN" signal on a line 226 when a tag has been found in the last half of a current record exiting PE0.
  • the record to be output from the sorter always comes from PE0.
  • the inputs of PE0 are the record being input to the sorter and the output of IERSN-1, which is fed by the upper output of PE1.
  • the tag of the input record is known; the tag of the record in PE1 is provided to the string selection logic 39 by special logic in PE1 ("PE1TAG" on line 202 in FIGS. 12 and 13).
  • the tag lookahead logic monitors the internal state of PE0, watching for a decision to be made ("DECISION" on line 220 in FIG. 14); when it is made, the "Pass Vertical" state bit for PE0 ("PV0" in FIG. 12) indicates whether the record exiting PE0 is the input record or the record from PE1, and hence which tag to select.
  • the logic circuitry of FIG. 13 includes a multiplexer 228 which is operated by an inversion of the "ADVTAG” signal on line 224 from the circuit of FIG. 16 and which has one input connected to a latch circuit 229 which is connected to the output line 202.
  • a second input of the multiplexer 228 is connected to the output of a multiplexer 230 which is operated by the "OTAG" signal on line 178 from the record control circuit 62B of FIG. 10.
  • One input of the multiplexer 230 is connected to a "UR1" line 231 while a second input thereof is connected to the "BTAG" line 218.
  • Line 218 is connected to the output of a latch 232 which is driven from a multiplexer 234, controlled from the line 178 and having inputs connected to the lines 218 and 231.
  • the tag for the record exiting PE1 ("PE1TAG" on line 202 of FIG. 13) is extracted as follows: If not advancing tag ("-ADVTAG"), then the previous value of "PE1TAG" is latched. If advancing tag ("ADVTAG" on line 224) is true and a tag is exiting PE1 ("OTAG" on line 178), then that tag ("UR1" on line 231) is selected. If a tag is advancing and a tag is not exiting PE1, then the backup tag ("BTAG" on line 218) is selected, the backup tag being latched by circuit 232 whenever a tag exits PE1 ("OTAG").
  • the "DECISION" signal on line 220 is developed by an AND gate 236 which has one input connected to a "DECENB" line 237 and a second input connected to the output of an AND circuit 238 which has inputs connected to a "-DECIDING0" line 239 and to the "VALTAG" line 222 from the circuit of FIG. 15.
  • the "VALTAG" signal on line 222 is developed at the output of a latch 240 coupled to the output of an OR circuit 242 having inputs connected to three AND gates 243, 244 and 245.
  • the inputs of gates 243, 244 and 245 are connected to lines 220 and 226, to a "-EREC” line 246 derived from an inversion of the "EREC” line 160 of FIG. 9 and to other signals derived from the record control circuits of FIGS. 9 and 10, as described hereinabove.
  • the output line 224 is connected to a latch 250 which is connected to the output of an OR gate 252 having inputs connected to two AND gates, AND gate 253 having inputs connected to lines 160 and 226 and AND gate 254 having inputs connected to lines 146, 156 and 167 from the circuits of FIGS. 9 and 10.
  • FIG. 17 shows logic which keeps track of whether the tag has been seen during the last half of the current record exiting PE0. It includes a latch 256 which is connected to the output of an AND gate 258 having one input connected to the "-EREC" line 246. The second input of gate 258 is connected to the output of an OR gate 259 having one input connected to the line 226 and having a second input connected to an AND gate 260 which has inputs connected to the lines 150 and 156 from the record control circuits of FIGS. 9 and 10.
  • "TAGSEEN” on line 226 becomes deasserted whenever the last byte of a record exits PE0 ("-EREC”).
  • Sort sequencer 34 (FIGS. 1 & 18-22)
  • the sort sequencer calculates the external memory addresses and provides overall control of the sort accelerator.
  • the illustrated preferred embodiment of a sort accelerator uses the 16-way enhanced rebound sorter 18 as shown, but the following sections have been generalized to the use of an N-way sorter having functional characteristics equivalent to those of the sorter 18 as disclosed.
  • the sort sequencer 34 has a control processor which includes the register array 36 and the flag register 37.
  • the register array 36 may preferably be an array of 64 32-bit registers in four banks and the flag register may be a 32 bit register.
  • the control processor also includes the ALU 37 which is preferably a 32-bit ALU. In a practical embodiment, this architecture is capable of reading one register, performing an ALU operation, and writing the same register in one 133ns clock.
  • the register array serves two purposes.
  • the host processor communicates control information to the sort accelerator by initializing these registers, and the control processor uses them as the variables of its microcoded program. Addresses, counts, and status information are maintained in the array.
  • the array is organized as four banks of 16 registers each. Each bank may be indexed with a variable, allowing the tag from merge operations to be quickly translated from a string index into an address pointing to the next record to be fetched.
  • a microprogram for the control processor is stored in the ROM 38 in an array which may, for example, consist of 512 microwords of 60 bits, each microword being divided up into fields controlling the source of ALU operands, the ALU operation to be performed, the destination of ALU operations, and the location of the next microword to be executed. Additional fields in the microword may enable the writing of data to various destinations and control the flow of microprogram execution and microcode subroutines may be supported. It will be understood that the invention is not limited to the use of any particular form of control processor for performing the sorting operations as hereinafter described in detail and as depicted in part in FIGS. 18-22.
  • FIG. 18 illustrates the organization of intermediate storage which is important to an understanding of the operation of the sort sequencer and FIGS. 19-22 illustrate sort sequencer modifications which are important in achieving sort stability.
  • the algorithm used by the sort accelerator 10 requires intermediate storage which is organized into regions of different sizes.
  • the smallest region is located at the base of the intermediate storage area and is used to hold up to N strings of N records.
  • the second region is located at the end of the first region and is used to hold up to N strings of N×N records.
  • Each successive region is larger than its predecessor by a factor of N.
  • FIG. 18 provides an illustrative example of the organization which would apply if N were 4, rather than 16 as in the illustrated embodiment.
  • the first region, indicated by R0, stores four strings of four records
  • the second region, indicated by R1, stores four strings of sixteen records, i.e. each of its strings stores N×N records.
  • the third region, indicated by R2, with only one-fourth being shown, stores four strings of sixty-four records, i.e. each of its strings stores N×N×N records.
  • the first region R0 stores sixteen strings of sixteen records (256 records)
  • the second region R1 stores 4096 records
  • the third region R2 stores 65,536 records.
  • Subsequent regions contain 16 times the number of records in the preceding region. Any number of additional regions may be provided depending upon the available memory.
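The region sizing described above follows a simple geometric rule. A minimal sketch (the function name `region_capacities` is illustrative, not from the patent):

```python
def region_capacities(n, num_regions):
    """Total record capacity of each intermediate-storage region.

    Region r holds up to n strings of n**(r + 1) records each,
    so its capacity is n**(r + 2) records.
    """
    return [n ** (r + 2) for r in range(num_regions)]

# N = 4 example of FIG. 18: regions hold 16, 64, 256 records
print(region_capacities(4, 3))    # [16, 64, 256]
# N = 16 embodiment: 256, 4096, 65536 records
print(region_capacities(16, 3))   # [256, 4096, 65536]
```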
  • the location and size of the intermediate storage is programmed by loading the base address and end address of storage in the register array at operation initialization. If insufficient storage is allocated for an operation, the control processor will suspend the operation and interrupt the host processor. The storage can be dynamically expanded by increasing the end address.
  • an extended merge-sort algorithm is embedded in the microprogram stored in the ROM 38, but any equivalent means for implementing the algorithm may be used.
  • the algorithm utilizes the N-way merge capabilities of the enhanced rebound sorter to reorder an unlimited number of records.
  • the algorithm consists of two phases called the input phase and the output phase. As will be discussed below, the sequence of sorts and merges which make up these phases is important for sort stability, memory efficiency, and performance.
  • the input phase, consisting of sorts and merges, is in effect until all of the unsorted records that make up the input string have been processed.
  • Each sort operates on N unordered records, creating a sorted string of N records in region 0 of intermediate storage.
  • Each merge operates on the N sorted strings of a filled region, creating a single sorted string in the next region N times larger than each of the input strings.
  • the input phase begins by sorting the first N records from the input string into one string of N records in region 0. The next N records from the input string are then sorted into a second string of N records. This process continues until N strings of N records exist in region 0, at which point the sorting is suspended while they are merged into a single string of N ⁇ N records in region 1.
  • the input phase continues, performing a "merge-up" whenever possible, until the entire input string is processed.
  • the number of records in the input string is programmed into the register array when the operation is initialized.
  • the record count register is continually decremented as the input string is processed.
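The input phase described above behaves like a carry chain in a base-N counter: each region that fills with N strings is merged up into a single string of the next region. A hedged sketch, using Python lists for the regions and `heapq.merge` standing in for the rebound sorter's N-way merge:

```python
import heapq

N = 4  # merge width; 16 in the illustrated embodiment

def input_phase(records):
    """Sort N records at a time into region 0 and perform a
    "merge-up" whenever a region accumulates N sorted strings."""
    regions = [[]]  # regions[r] is the list of sorted strings in region r
    for i in range(0, len(records), N):
        regions[0].append(sorted(records[i:i + N]))
        r = 0
        # propagate merge-ups like a carry in a base-N counter
        while len(regions[r]) == N:
            merged = list(heapq.merge(*regions[r]))
            regions[r] = []
            if r + 1 == len(regions):
                regions.append([])
            regions[r + 1].append(merged)
            r += 1
    return regions

# 32 reversed records -> two strings of 16 in region 1, region 0 empty
regions = input_phase(list(range(31, -1, -1)))
```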
  • the output phase consists of a series of merges to reduce the number of remaining sorted strings to N or less, and a final merge directly to the destination address.
  • the partial string consists of the total number of records being sorted modulo N (possibly 0); it exists because the total number of records need not be a multiple of N.
  • priority is assigned from highest region to lowest. Within one region, priority is lowest string to highest. This coincides with the order the strings were created.
  • the output phase proceeds in a very efficient and reliable manner.
  • a merge is performed which includes as many regions as possible with the restriction that the total number of strings involved is less than or equal to N. This merge creates a new partial string in the first nonempty region not included in the merge.
  • the latter case represents a performance optimization. Because the sizes of the strings in the last two regions are significant with respect to the size of the entire sort operation, the manipulation of these strings is minimized by merging just enough strings from the second highest region into a partial string at the highest region to leave exactly N strings. This is called the "optimal" merge. The N strings then participate in the final merge.
  • the string count of each region is maintained in one bank of the register array to allow indexing.
  • the length of the partial string is maintained in another register in the array.
  • the lengths of the other strings are known from their region numbers.
  • the output phase includes the partial string in every merge, and each of these merges creates a larger partial string in a higher region.
  • the partial string length register is continually updated to reflect the size of the new partial string.
  • the partial string also participates in the "optimal" merge.
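A simplified sketch of the output phase above, assuming strings are taken in the stated priority order (highest region first, oldest string first within a region) and merging the lowest-priority strings until at most N remain for the final merge; the function name and the flattening of regions into one list are illustrative:

```python
import heapq

def output_phase(regions, n):
    """Merge remaining strings until at most n are left, then do the
    final n-way merge.  Strings are taken in priority order: highest
    region first and, within a region, oldest (lowest) string first."""
    strings = [s for region in reversed(regions) for s in region if s]
    while len(strings) > n:
        # merge the n lowest-priority (smallest, newest) strings into
        # one partial string, which keeps the lowest priority
        tail = strings[-n:]
        strings = strings[:-n] + [list(heapq.merge(*tail))]
    return list(heapq.merge(*strings))

# three leftover strings, merge width 2
result = output_phase([[[5, 7]], [[1, 4], [2, 3]]], 2)
```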
  • the control processor needs to determine if enough storage has been allocated for the operation. To do this with minimal performance degradation, two registers in the register array 35 are used to maintain the highest active region number and the highest active address. At the beginning of each sort or merge within the extended merge-sort algorithm, the current region is compared to the highest active region. If it is less than the highest active region minus 1, then the sort or merge continues normally; otherwise, the control processor is about to create a new string in the highest region and needs to verify that enough storage has been allocated to proceed.
  • the verification begins by using the highest active address register to determine if there is room for another full string in the next region. If there is, then the extended merge-sort proceeds normally; otherwise, the control processor determines how many strings from the current region will fit into a partial string at the next region, and uses that information to determine how many additional records could be processed if these strings were merged. If this number is less than the number of records remaining in the input string, then the control processor interrupts the host processor indicating that more storage must be allocated to finish the operation.
  • the control processor merges up just enough strings from this level such that the number of remaining strings in this level plus the number of additional strings in this level which will be formed from future input plus the number of strings in the next level is identically N.
  • a flag is set which will indicate to the output phase that an "optimal" merge has been performed, and the length of this special partial string is saved in a register in the register array. This register is not the same as the partial string length register used by the output phase.
  • the control processor merges as many strings as possible into the special partial string at the next level, sets the merge flag, saves the length of the special partial string, and continues creating strings at level 0. If the LOCK bit is not set, then the host processor is interrupted allowing it to set the END bit, set the LOCK bit, or increase the storage end address.
  • the "optimal" merge guarantees that the performance of the extended merge-sort is optimized.
  • the “compaction” merge occurs because the control processor does not know the total number of records in the operation and the host processor has given permission to lock the size of the intermediate storage. Once locked, it is illegal to increase the end of storage address. This merge optimizes the number of records that can be sorted with the given amount of storage.
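The storage-verification arithmetic can be sketched as below, under the assumptions that a string in region r holds N^(r+1) records and that addresses are byte offsets; all names here are hypothetical, not the patent's register names:

```python
def needs_more_storage(highest_active_addr, end_addr, record_bytes,
                       n, next_region, records_remaining):
    """Return True if the host must allocate more storage.

    A string in region r holds n**(r + 1) records, so a full string
    in the next region needs n**(next_region + 1) * record_bytes
    bytes beyond the highest active address."""
    string_bytes = (n ** (next_region + 1)) * record_bytes
    if highest_active_addr + string_bytes <= end_addr:
        return False  # room for a full string: proceed normally
    # otherwise, see how many additional records could still fit
    records_that_fit = (end_addr - highest_active_addr) // record_bytes
    return records_that_fit < records_remaining
```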
  • the sorting process of the input phase as described hereinbefore requires that the sort accelerator read in N unordered records from the input string, sort them, and place the results in a new string in region 0 of intermediate storage.
  • the sort control may preferably exist within a single loop in the microcode.
  • the record count register is checked to see if the input string has been exhausted. Once it is, control is passed to the output phase. If there are more input records, the source address in the register array is passed to the Input Processor. This register, which was loaded with the address of the input string when the operation was initialized, is then incremented by the number of bytes per record. Another count register in the array is incremented by one, and the sort loop continues until the count reaches N.
  • the microcode for the merge control is preferably accelerated with the string selection circuit 39, using the merge algorithm as described hereinbefore.
  • the string selection logic is also described hereinabove.
  • the sort sequencer is also responsible for supplying the tag which the Input Processor inserts into each record.
  • a “stable" sort is one in which records with equal keys exit the sort in the same relative order as that in which they entered. To keep the design simple, and reduce the additional storage and time requirements, use is made of the tag byte already required by the N-way merge algorithm, to insure sort stability. This feature keeps the total record size relatively small and does not require extensive changes to the sort sequencer or the rebound sorter.
  • a tag byte is inserted into the input record stream following the key bytes, but preceding the remaining data bytes.
  • the high order 4 bits of the tag byte are set to the string number from which the record came (for a merge), or zero (for a sort).
  • the tag byte is always compared in the processing elements in "ascending" order. Placing the tag byte after the key, but before the remaining data allows the processing elements to decide which way to pass the remaining non-key bytes before those bytes are presented.
  • the string number will cause records from different strings with equal keys to exit the rebound sorter in the order in which they entered.
  • the "order" bit will cause records from the same string with equal keys to exit the rebound sorter in the order in which they entered.
  • strings of records are selected that entered the sorting process together. That is to say, no unselected record entered the sorting process before some selected records but after others (such a record could potentially belong in this merge).
  • a unique number is selected for each string which indicates the relative order in which the records in that string entered the sorting process with regard to the other strings within the same merge.
  • Each record of a string has the string number inserted into the high order bits of the tag byte. If the order of 2 records from different strings has not been resolved when their tag bytes are compared, the string number will insure that they exit from the merge in the same order that they entered.
  • the highest region contains the strings that entered the sort first, the second highest region second, and so on.
  • the lowest row contains the string that entered the sort first, the next row second, and so on.
  • string numbers are assigned sequentially starting with 0. Unless otherwise specified, within a group of rows the string numbers are assigned sequentially from the lowest to the highest row. As shown in FIG. 19, string numbers 0 through 15, in the column under "S" are in sequential order, string 15 being at the highest address "HA” and string 0 being at the lowest address "LA”.
  • the least significant bit of the tag byte (called the "order" bit) of each record entering the sorter is set to 1.
  • the order bit is used to preserve the order of records with the same string number.
  • the order bit is manipulated as follows (remember it enters the rebound sorter set to 1):
  • the first record enters the ERS and mixes with other records until the second record enters the ERS.
  • In order to meet the second record, the first record must turn up, getting its order bit set to 1. Since the ERS is sorted, the second record will proceed straight down to meet the first one, keeping its order bit set. The two records will compare exactly equal, including the order bits, and thus will continue to pass vertically. In the absence of additional duplicate records, the duplicate records will continue up to pass larger records, or the top one will turn down (which clears its order bit) and meet the next duplicate record coming up (which still has its order bit set). This will cause them to preserve their current positions in the ERS until either they are passed by a larger record, another duplicate enters, or the ERS is flushed.
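The tag-byte scheme can be illustrated with plain tuples; `make_tag` and the record layout here are illustrative stand-ins for the hardware format (high 4 bits: string number, low bit: order bit):

```python
def make_tag(string_number, order_bit=1):
    """Tag byte: high 4 bits are the string number (0 for a sort),
    the least significant bit is the order bit, set to 1 on entry."""
    return ((string_number & 0xF) << 4) | (order_bit & 1)

def merge_key(record):
    # records are (key_bytes, tag, data); the tag byte is compared
    # in ascending order immediately after the key bytes
    return (record[0], record[1])

# equal keys from two strings: the string number in the tag makes
# the record from the earlier string exit first
s0 = [(b'abc', make_tag(0), 'first')]
s1 = [(b'abc', make_tag(1), 'second')]
merged = sorted(s0 + s1, key=merge_key)
```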
  • data passing through the sort accelerator is checked for corruption from the point where data enters the device until it exits.
  • order of output records is checked to ensure proper operation of the sort accelerator device and proper input record ordering for merges.
  • Byte parity is a scheme where 8 bits of data are represented by a 9 bit value. The extra bit carries a parity value, calculated at the source of the data. Parity checking is accomplished by re-calculating the parity value from the data and comparing the newly calculated value with the parity bit. An inequality indicates an error in the data.
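A minimal sketch of byte parity as described; whether the device uses even or odd parity is not stated, so even parity is assumed here:

```python
def parity_bit(byte):
    """Even-parity bit for an 8-bit value (the protective 9th bit)."""
    p = byte
    p ^= p >> 4
    p ^= p >> 2
    p ^= p >> 1
    return p & 1

def parity_ok(byte, stored_parity):
    """Re-calculate the parity and compare it with the stored bit;
    inequality indicates an error in the data."""
    return parity_bit(byte) == stored_parity
```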
  • the sort accelerator 10 is supplied with data and parity from the host system. As data enters, the parity is preferably checked. An error detected at this point is classified as a system parity error, and will be reported as such.
  • Parity protection of the data path from the system bus interface continues to the rebound sorter. As data enters the rebound sorter, data parity is again checked, with errors reported as internal data errors.
  • On output, the rebound sorter generates data with parity. When this data reaches the system bus interface, its parity is checked, with discrepancies reported as an internal parity error. The parity value is regenerated at the system bus interface and passed with the data to the host system, allowing the host to check data parity as desired (parity is regenerated at the system bus interface to provide parity for data from the sort sequencer).
  • Checksum protection is a method of error detection where a checksum is calculated over multiple bytes. The checksum is later checked by re-calculating the checksum and comparing it to the previously calculated checksum.
  • a checksum is calculated using the following formula:
  • a record checksum is calculated.
  • the function (f) for this checksum uses the parity of the data byte, using a "PAR" block as hereinafter described in connection with FIG. 23, to select either the checksum or a bit reversed version of the checksum to be added modulo 256 to the data byte.
  • a "REVERSE BITS" operation is performed on a byte which swaps the most and least significant bits within the byte, the next most and least significant bits and so on, so that all bits have a different position.
  • This checksumming scheme can check for two types of errors in the sorter 18: single bit failure in the data path (including storage element failure) and the improperly controlled swapping of bytes between records (PE control failure).
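The record checksum described above can be sketched as follows; the initial checksum value (0 here) and which parity value selects the bit-reversed form are assumptions not fixed by the text:

```python
def reverse_bits(b):
    """Swap the most and least significant bits, the next pair, and
    so on, so every bit ends up in a different position."""
    out = 0
    for i in range(8):
        out |= ((b >> i) & 1) << (7 - i)
    return out

def parity(b):
    b ^= b >> 4
    b ^= b >> 2
    b ^= b >> 1
    return b & 1

def record_checksum(data):
    """The parity of each data byte selects either the running
    checksum or its bit-reversed form, to which the byte is then
    added modulo 256."""
    csum = 0  # assumed initial value
    for byte in data:
        base = reverse_bits(csum) if parity(byte) else csum
        csum = (base + byte) % 256
    return csum
```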
  • the checksum operation is implemented using the logic circuitry of FIG. 23, which develops an "EDCNEQ" signal on a line 270 connected to the output of a "NEQ" comparison circuit 272 which compares the outputs of two sections.
  • An upper section as shown in FIG. 23 includes an output latch 273 which is connected to the output of an adder circuit 274, one input of the adder circuit 274 being connected to the output of the latch 273.
  • the other input of the adder circuit 274 is connected to the output of an AND circuit 275 which has one input connected to the "EREC" line 160.
  • the other input of the AND circuit 275 is connected to the output of a latch 276 which is connected to the output of an adder circuit 278.
  • One input of the adder circuit 278 is connected to the output of a multiplexer 280 which has one input connected to the output of the latch 276 and which has a second input connected to a "REVERSE BITS" block 281.
  • Block 281 performs the aforementioned bit reversal of a byte.
  • the multiplexer 280 is controlled by a "PAR" block 282, the input of which is connected to an "IBYTE" line 283 (bytes entering the rebound sorter), the "IBYTE" line being also connected to a second input of the adder 278.
  • the second lower section of the logic is substantially identical to the first, corresponding elements being indicated by primed numbers. The difference is that the line 283' is an "OBYTE” line (bytes exiting the rebound sorter) whereas the line 283 is an "IBYTE" line.
  • the comparison logic is similar to the processing elements used in the sorter with the exception of the output.
  • the output of the comparison logic is a signal which indicates that a decreasing (increasing for "descending" sorts) sequence of records has been detected (a sort order error).
  • the sort order checker is also used to provide a "tie" bit indicating that a record in the output string has the same key as the following record.
  • the "tie” bit is the least significant bit of the tag byte on output from the rebound sorter.
  • User software can utilize this "tie” bit to perform sorts on key fields which exceed the capacity of the accelerator 10, as described in a subsequent section, or to aid in post-sort processing of duplicate records.
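A software sketch of the sort order checker's behavior, comparing each output record with its follower; it operates on comparable Python values rather than keyed hardware records:

```python
def check_output(records, ascending=True):
    """Compare each output record with the following one: flag a sort
    order error on a decreasing (increasing, for descending sorts)
    pair, and a "tie" when a record equals the record that follows."""
    error = False
    ties = []
    for a, b in zip(records, records[1:]):
        if (a > b) if ascending else (a < b):
            error = True
        ties.append(a == b)
    ties.append(False)  # the last record has no follower
    return error, ties
```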
  • the sort accelerator usually strips the tag byte from records before outputting them, but it can be instructed to leave the tag bytes in the records if the user wishes.
  • the sort order checking operations are implemented by logic circuitry shown in FIGS. 24-28.
  • Circuitry as shown in FIG. 27 includes an output line 286 on which the aforementioned "SRTERR" signal is developed.
  • Line 286 is connected to the output of a latch 288 which is coupled to the output of an AND gate 289 having one input connected to a "-RESET” line 290.
  • the other input of AND gate 289 is connected to the output of an OR gate 292 which has one input connected to the line 286 and a second input connected to the output of an AND gate 294.
  • AND gate 294 has inputs connected to lines 295 and 297 which respectively receive "CHKING", and "EQUAL" signals.
  • the third input to AND gate 294 is connected to a multiplexer.
  • One input of the multiplexer is "A<B", the other is "A>B".
  • the multiplexer is controlled by an OR gate whose inputs are ASCENDING and ETAG.
  • the circuit of the sort order checker further includes logic circuitry as shown in FIGS. 24, 25, 26 and 28.
  • One input of the comparator 298 is connected directly to a "UR0" line 302 while the other input of comparator 298 is connected to the line 302 through two cascaded IERS elements 303 and 304. These form the aforementioned special storage element which holds one record block.
  • the "CHKING" signal on line 295 is developed by a latch 306 which is connected to the output of an AND gate 308 having one input to which a "-NEWSTREAMOUT” signal is applied on a line 309, the "-NEWSTREAMOUT” signal being the inverse of the "NEWSTREAMOUT” signal developed by the pipeline circuitry shown in FIG. 11.
  • a second input of the gate 308 is connected to the output of an OR gate 310 having one input connected to the "EREC" line 160 and having a second input connected to the line 295.
  • the "EQUAL" signal on line 297 is developed at the output of a latch 312 which is connected to the output of an OR gate 314 having one input connected to the "EREC” line 160 and having a second input connected to the output of an AND gate 316.
  • FIG. 28 shows the circuitry used to develop the "TIED" signal which is developed on a line 318 connected to the output of a latch circuit 319 which is connected to the output of an AND gate 320 having inputs connected to the "CHKING" line 295, the "ETAG” line 174 from the circuit of FIG. 10 and the “EQUAL” line 297.
  • the accelerator 10 may be used even when the available workspace memory is not enough to sort records in a user's file. Algorithms are provided for dealing with two very common cases. The first case is when not enough workspace memory is available to sort the total number of records. The second case is when the record and/or key length exceeds the maximum allowed by the sort accelerator.
  • the user uses the sort accelerator 10 to sort as many records as will fit in the available workspace. It is not necessary to compute this number. Records are simply fed to the sort accelerator until an indication is provided that it is full. Each of these sorted strings of records must then be placed in storage available to the user (they may be written to disk, for example). When all of the records have been sorted, or 16 sorted strings have been created, the user then merges the sorted strings using the sort accelerator 10 as a merger. This can be done with buffered input and output, and thus requires that very little memory be available to the sort accelerator. The sorted string resulting from this merge can again be placed in storage. This algorithm can be used iteratively, limited only by the availability of additional record storage.
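The iterative sort-then-merge procedure above can be sketched in software, with `sorted()` standing in for a workspace-limited accelerator sort and `heapq.merge` for its 16-way merge; `WORKSPACE` is an illustrative capacity, not a device parameter:

```python
import heapq

MAX_STRINGS = 16   # merge width of the accelerator
WORKSPACE = 64     # records per sorted run (illustrative capacity)

def external_sort(records):
    """Sort workspace-sized runs, store them, then merge up to 16
    runs at a time, iterating until one sorted string remains."""
    runs = [sorted(records[i:i + WORKSPACE])
            for i in range(0, len(records), WORKSPACE)]
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + MAX_STRINGS]))
                for i in range(0, len(runs), MAX_STRINGS)]
    return runs[0] if runs else []
```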
  • the first, and simplest, oversized-record case is when the total record is too large, but the key plus a pointer to the original record will fit.
  • new records are input to the sort accelerator made up of the original key followed by bytes containing a pointer to the original record.
  • the user merely processes the sorted records, using the pointer to access the original record contents.
  • the second case occurs when the record is too large, and the key plus a pointer to the original record will not fit. In this case, the user again inputs new records, but this time the user only puts in as many key bytes as will fit, leaving room for the pointer at the end.
  • preferably, the user requests that the tag byte be output so that the "tie" bits can be examined.
  • in most cases, the truncated keys thus presented will be unique enough to determine the sorted order of the records. It is merely necessary to examine the "tie" bits to locate groups of records with duplicate truncated keys. Each such group of records can then be further sorted by using as the new key as many of the remaining key bytes from the original record as will fit. This process may be iterated on all groups of records with duplicate truncated keys until there are no duplicate keys reported, or all key bytes have been processed.
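The truncated-key iteration can be sketched recursively; here key ties are detected by direct comparison rather than the hardware "tie" bit, and `key_len` stands for the number of key bytes that fit in a record:

```python
def sort_long_keys(records, key_len):
    """Sort on the first key_len bytes, then recursively re-sort any
    group of records whose truncated keys tie, using the next
    key_len bytes, until no ties remain or keys are exhausted."""
    def rec(group, offset):
        if len(group) <= 1 or offset >= max(len(r) for r in group):
            return group
        group = sorted(group, key=lambda r: r[offset:offset + key_len])
        out, i = [], 0
        while i < len(group):           # locate runs of tied keys
            j = i
            while (j + 1 < len(group) and
                   group[j + 1][offset:offset + key_len] ==
                   group[i][offset:offset + key_len]):
                j += 1
            chunk = group[i:j + 1]
            out.extend(rec(chunk, offset + key_len) if j > i else chunk)
            i = j + 1
        return out
    return rec(records, 0)
```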

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Radar Systems Or Details Thereof (AREA)
  • Processing Of Solid Wastes (AREA)
  • Advance Control (AREA)
  • Complex Calculations (AREA)
  • Cooling, Air Intake And Gas Exhaust, And Fuel Tank Arrangements In Propulsion Units (AREA)
US07/374,433 1989-06-30 1989-06-30 Sort accelerator with rebound sorter repeatedly merging sorted strings Expired - Lifetime US5142687A (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
US07/374,433 US5142687A (en) 1989-06-30 1989-06-30 Sort accelerator with rebound sorter repeatedly merging sorted strings
CA002017900A CA2017900A1 (en) 1989-06-30 1990-05-30 Sort accelerator using rebound sorter as merger
EP90305934A EP0405759B1 (de) 1989-06-30 1990-05-31 Sortierbeschleuniger, der einen in beide Richtungen arbeitenden Sortierer als Mischer verwendet
AT90305934T ATE174696T1 (de) 1989-06-30 1990-05-31 Sortierbeschleuniger, der einen in beide richtungen arbeitenden sortierer als mischer verwendet
DE69032828T DE69032828T2 (de) 1989-06-30 1990-05-31 Sortierbeschleuniger, der einen in beide Richtungen arbeitenden Sortierer als Mischer verwendet
JP2172514A JPH0776910B2 (ja) 1989-06-30 1990-06-29 リバウンド分類装置を併合装置として使用する分類加速装置

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US07/374,433 US5142687A (en) 1989-06-30 1989-06-30 Sort accelerator with rebound sorter repeatedly merging sorted strings

Publications (1)

Publication Number Publication Date
US5142687A true US5142687A (en) 1992-08-25

Family

ID=23476796

Family Applications (1)

Application Number Title Priority Date Filing Date
US07/374,433 Expired - Lifetime US5142687A (en) 1989-06-30 1989-06-30 Sort accelerator with rebound sorter repeatedly merging sorted strings

Country Status (6)

Country Link
US (1) US5142687A (de)
EP (1) EP0405759B1 (de)
JP (1) JPH0776910B2 (de)
AT (1) ATE174696T1 (de)
CA (1) CA2017900A1 (de)
DE (1) DE69032828T2 (de)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5206947A (en) * 1989-06-30 1993-04-27 Digital Equipment Corporation Stable sorting for a sort accelerator
US5257384A (en) * 1991-09-09 1993-10-26 Compaq Computer Corporation Asynchronous protocol for computer system manager
US5274777A (en) * 1990-04-03 1993-12-28 Fuji Xerox Co., Ltd. Digital data processor executing a conditional instruction within a single machine cycle
US5349684A (en) * 1989-06-30 1994-09-20 Digital Equipment Corporation Sort and merge system using tags associated with the current records being sorted to lookahead and determine the next record to be sorted
US5396622A (en) * 1991-12-23 1995-03-07 International Business Machines Corporation Efficient radix sorting system employing a dynamic branch table
US5497486A (en) * 1994-03-15 1996-03-05 Salvatore J. Stolfo Method of merging large databases in parallel
US5579514A (en) * 1993-10-22 1996-11-26 International Business Machines Corporation Methodology for increasing the average run length produced by replacement selection strategy in a system consisting of multiple, independent memory buffers
US5857186A (en) * 1994-03-07 1999-01-05 Nippon Steel Corporation Parallel merge sorting apparatus with an accelerated section
US6311184B1 (en) * 1995-10-06 2001-10-30 International Business Machines Corporation Sort and merge functions with input and output procedures
US20040122837A1 (en) * 2002-12-18 2004-06-24 International Business Machines Corporation Method and system for compressing varying-length columns during index high key generation
US20050169688A1 (en) * 2001-04-30 2005-08-04 Microsoft Corporation Keyboard with improved function and editing sections
US20100042624A1 (en) * 2008-08-18 2010-02-18 International Business Machines Corporation Method for sorting data
US8843502B2 (en) 2011-06-24 2014-09-23 Microsoft Corporation Sorting a dataset of incrementally received data
US8892612B2 (en) 2011-03-30 2014-11-18 Hewlett-Packard Development Company, L.P. Exploitation of correlation between original and desired data sequences during run generation
WO2019170961A1 (en) * 2018-03-05 2019-09-12 University Of Helsinki Device, system and method for parallel data sorting

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6947687B2 (en) 2002-06-07 2005-09-20 Canon Kabushiki Kaisha Cartridge having locking portion for locking cartridge with an image forming apparatus and releasing portion to release the locking portion, and image forming apparatus having such a cartridge
CN106843803B (zh) * 2016-12-27 2019-04-23 南京大学 一种基于归并树的全排序加速器及应用

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3587057A (en) * 1969-06-04 1971-06-22 Philip N Armstrong Data sorting system
US3997880A (en) * 1975-03-07 1976-12-14 International Business Machines Corporation Apparatus and machine implementable method for the dynamic rearrangement of plural bit equal-length records
US4031520A (en) * 1975-12-22 1977-06-21 The Singer Company Multistage sorter having pushdown stacks with concurrent access to interstage buffer memories for arranging an input list into numerical order
US4078260A (en) * 1976-05-12 1978-03-07 International Business Machines Corporation Apparatus for transposition sorting of equal length records in overlap relation with record loading and extraction
US4090249A (en) * 1976-11-26 1978-05-16 International Business Machines Corporation Apparatus for sorting records in overlap relation with record loading and extraction
US4110837A (en) * 1976-12-30 1978-08-29 International Business Machines Corporation Apparatus for the sorting of records overlapped with loading and unloading of records into a storage apparatus
US4210961A (en) * 1971-10-08 1980-07-01 Whitlow Computer Services, Inc. Sorting system
US4303989A (en) * 1979-07-17 1981-12-01 The Singer Company Digital data sorter external to a computer
US4425617A (en) * 1981-03-23 1984-01-10 Rca Corporation High-speed data sorter
US4464732A (en) * 1982-03-19 1984-08-07 Honeywell Inc. Prioritized sorting system
US4499555A (en) * 1982-05-06 1985-02-12 At&T Bell Laboratories Sorting technique
US4514826A (en) * 1981-05-18 1985-04-30 Tokyo Shibaura Denki Kabushiki Kaisha Relational algebra engine
US4520456A (en) * 1983-02-18 1985-05-28 International Business Machines Corporation Dual reciprocating pipelined sorter
US4604726A (en) * 1983-04-18 1986-08-05 Raytheon Company Sorting apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0666050B2 (ja) * 1984-08-03 1994-08-24 日本電信電話株式会社 ソート処理方法
EP0244958A3 (de) * 1986-04-09 1989-10-25 Howard B. Demuth Sortierverfahren und -gerät mit Multispaltenmischer

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3587057A (en) * 1969-06-04 1971-06-22 Philip N Armstrong Data sorting system
US4210961A (en) * 1971-10-08 1980-07-01 Whitlow Computer Services, Inc. Sorting system
US4210961B1 (en) * 1971-10-08 1996-10-01 Syncsort Inc Sorting system
US3997880A (en) * 1975-03-07 1976-12-14 International Business Machines Corporation Apparatus and machine implementable method for the dynamic rearrangement of plural bit equal-length records
US4031520A (en) * 1975-12-22 1977-06-21 The Singer Company Multistage sorter having pushdown stacks with concurrent access to interstage buffer memories for arranging an input list into numerical order
US4078260A (en) * 1976-05-12 1978-03-07 International Business Machines Corporation Apparatus for transposition sorting of equal length records in overlap relation with record loading and extraction
US4090249A (en) * 1976-11-26 1978-05-16 International Business Machines Corporation Apparatus for sorting records in overlap relation with record loading and extraction
US4110837A (en) * 1976-12-30 1978-08-29 International Business Machines Corporation Apparatus for the sorting of records overlapped with loading and unloading of records into a storage apparatus
US4303989A (en) * 1979-07-17 1981-12-01 The Singer Company Digital data sorter external to a computer
US4425617A (en) * 1981-03-23 1984-01-10 Rca Corporation High-speed data sorter
US4514826A (en) * 1981-05-18 1985-04-30 Tokyo Shibaura Denki Kabushiki Kaisha Relational algebra engine
US4464732A (en) * 1982-03-19 1984-08-07 Honeywell Inc. Prioritized sorting system
US4499555A (en) * 1982-05-06 1985-02-12 At&T Bell Laboratories Sorting technique
US4520456A (en) * 1983-02-18 1985-05-28 International Business Machines Corporation Dual reciprocating pipelined sorter
US4604726A (en) * 1983-04-18 1986-08-05 Raytheon Company Sorting apparatus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"The Rebound Sorter: An efficient sort engine for large files," International Conference on Databases, 4th--Proceedings, West Berlin, Germany, 1978, pp. 312-318. IEEE Computer Society, Long Beach, Calif. Document #78 CH 1389-6 C.

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5349684A (en) * 1989-06-30 1994-09-20 Digital Equipment Corporation Sort and merge system using tags associated with the current records being sorted to lookahead and determine the next record to be sorted
US5206947A (en) * 1989-06-30 1993-04-27 Digital Equipment Corporation Stable sorting for a sort accelerator
US5274777A (en) * 1990-04-03 1993-12-28 Fuji Xerox Co., Ltd. Digital data processor executing a conditional instruction within a single machine cycle
US5257384A (en) * 1991-09-09 1993-10-26 Compaq Computer Corporation Asynchronous protocol for computer system manager
US5396622A (en) * 1991-12-23 1995-03-07 International Business Machines Corporation Efficient radix sorting system employing a dynamic branch table
US5579514A (en) * 1993-10-22 1996-11-26 International Business Machines Corporation Methodology for increasing the average run length produced by replacement selection strategy in a system consisting of multiple, independent memory buffers
US5857186A (en) * 1994-03-07 1999-01-05 Nippon Steel Corporation Parallel merge sorting apparatus with an accelerated section
US5717915A (en) * 1994-03-15 1998-02-10 Stolfo; Salvatore J. Method of merging large databases in parallel
US5497486A (en) * 1994-03-15 1996-03-05 Salvatore J. Stolfo Method of merging large databases in parallel
US6311184B1 (en) * 1995-10-06 2001-10-30 International Business Machines Corporation Sort and merge functions with input and output procedures
US20050169688A1 (en) * 2001-04-30 2005-08-04 Microsoft Corporation Keyboard with improved function and editing sections
US20040122837A1 (en) * 2002-12-18 2004-06-24 International Business Machines Corporation Method and system for compressing varying-length columns during index high key generation
US7039646B2 (en) * 2002-12-18 2006-05-02 International Business Machines Corporation Method and system for compressing varying-length columns during index high key generation
US20100042624A1 (en) * 2008-08-18 2010-02-18 International Business Machines Corporation Method for sorting data
US10089379B2 (en) * 2008-08-18 2018-10-02 International Business Machines Corporation Method for sorting data
US8892612B2 (en) 2011-03-30 2014-11-18 Hewlett-Packard Development Company, L.P. Exploitation of correlation between original and desired data sequences during run generation
US8843502B2 (en) 2011-06-24 2014-09-23 Microsoft Corporation Sorting a dataset of incrementally received data
WO2019170961A1 (en) * 2018-03-05 2019-09-12 University Of Helsinki Device, system and method for parallel data sorting

Also Published As

Publication number Publication date
JPH0776910B2 (ja) 1995-08-16
DE69032828D1 (de) 1999-01-28
JPH03116323A (ja) 1991-05-17
ATE174696T1 (de) 1999-01-15
CA2017900A1 (en) 1990-12-31
EP0405759B1 (de) 1998-12-16
DE69032828T2 (de) 1999-07-01
EP0405759A3 (en) 1992-09-02
EP0405759A2 (de) 1991-01-02

Similar Documents

Publication Publication Date Title
US5349684A (en) Sort and merge system using tags associated with the current records being sorted to lookahead and determine the next record to be sorted
US5111465A (en) Data integrity features for a sort accelerator
US5142687A (en) Sort accelerator with rebound sorter repeatedly merging sorted strings
US4210961A (en) Sorting system
US5185886A (en) Multiple record group rebound sorter
JP2668438B2 (ja) Data retrieval device
EP0127753B1 (de) Method for performing a distribution sort
US5408626A (en) One clock address pipelining in segmentation unit
US5287494A (en) Sorting/merging tree for determining a next tournament champion in each cycle by simultaneously comparing records in a path of the previous tournament champion
US4011547A (en) Data processor for pattern recognition and the like
US5060143A (en) System for string searching including parallel comparison of candidate data block-by-block
US4991134A (en) Concurrent sorting apparatus and method using FIFO stacks
KR100346515B1 (ko) 수퍼파이프라인된수퍼스칼라프로세서를위한임시파이프라인레지스터파일
GB1563482A (en) Multipass sorter for arranging an input list into numerical order
EP0823085A2 (de) Method and apparatus for improved branch prediction accuracy in a superscalar microprocessor
US5206947A (en) Stable sorting for a sort accelerator
US3959777A (en) Data processor for pattern recognition and the like
JPS6142031A (ja) Sort processing device
US6240540B1 (en) Cyclic redundancy check in a computer system
US6513053B1 (en) Data processing circuit and method for determining the first and subsequent occurences of a predetermined value in a sequence of data bits
Lee ALTEP—A cellular processor for high-speed pattern matching
US5377335A (en) Multiple alternate path pipelined microsequencer and method for controlling a computer
JPH03129520A (ja) Control of a sort accelerator
EP0468402A2 (de) System and method for retrieving character strings
Katz Optimizing bit-time computer simulation

Legal Events

Date Code Title Description
AS Assignment

Owner name: DIGITAL EQUIPMENT CORPORATION, 146 MAIN STREET, MA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LARY, RICHARD F.;REEL/FRAME:005099/0266

Effective date: 19890629

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

AS Assignment

Owner name: COMPAQ INFORMATION TECHNOLOGIES GROUP, L.P., TEXAS

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DIGITAL EQUIPMENT CORPORATION;COMPAQ COMPUTER CORPORATION;REEL/FRAME:012447/0903;SIGNING DATES FROM 19991209 TO 20010620

AS Assignment

Owner name: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P., TEXAS

Free format text: CHANGE OF NAME;ASSIGNOR:COMPAQ INFORMATION TECHNOLOGIES GROUP, LP;REEL/FRAME:015000/0305

Effective date: 20021001

FPAY Fee payment

Year of fee payment: 12