WO2001040996A1 - Cache-sensitive search tree indexing system and method - Google Patents

Cache-sensitive search tree indexing system and method

Info

Publication number
WO2001040996A1
Authority
WO
WIPO (PCT)
Prior art keywords
key values
nodes
node
internal
leaf
Prior art date
Application number
PCT/US1999/028430
Other languages
English (en)
Inventor
Kenneth A. Ross
Jun Rao
Original Assignee
The Trustees Of Columbia University In The City Of New York
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Trustees Of Columbia University In The City Of New York
Priority to PCT/US1999/028430 (WO2001040996A1)
Priority to US09/600,266 (US6711562B1)
Publication of WO2001040996A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees

Definitions

  • the present invention relates to indexing techniques for searching computer system main memories, and more specifically, to indexing structures relating to searching of databases and arrays.
  • index structures are a significant factor in main memory database system performance, and can be used to reduce overall computation time without consuming very much additional memory space. In sufficiently large memory systems, most indexes can be memory resident.
  • OLTP Database Kernel Advanced Data Warehousing Techniques Proceedings, IEEE Int'l Conf. on Data Eng., 1997, the contents of which are incorporated herein by reference.
  • the performance of a typical OLAP system can be enhanced by improving query performance, even if at the expense of update performance.
  • Certain commercial systems designed for such purposes include Sybase IQ, which is described in Sybase Corporation, Sybase IQ 11.2.1, 1997, the contents of which are incorporated herein by reference.
  • OLAP system performance can be improved in this way because typical OLAP application workloads are query-intensive, but require only infrequent batch updates. For example, in applications involving census data, large quantities of data are collected and updated periodically, but then remain static for relatively long periods of time.
  • a typical university's data warehouse containing student records may be updated daily.
  • Certain systems in which data remain static for relatively long periods of time presently may be on the order of several gigabytes in size, and therefore can be stored within present day main memory systems. Because updates in such systems are typically relatively infrequent and are batched, the performance associated with incremental updates of indexes in these systems may not be critical. In fact, it may even be efficient to reconstruct indexes entirely from scratch after relatively infrequent batch updates, if such an approach leads to improved query performance.
  • the memory space and lookup time requirements of index structures are the two most important criteria in selecting particular index structures. Because memory space normally is a critical factor in constructing main memory databases, there typically is limited space available for precomputed structures such as indexes. In addition, given a particular set of memory space constraints, the objective is to minimize the time required to perform index lookups. In main memory databases, an important factor influencing the speed of database operations is the degree of locality for the data references for the particular algorithm being run. Superior data locality leads to fewer cache misses, and therefore improved performance.
  • cache memories normally are fast static random access memories (RAM) that improve computer system performance by storing data likely to be accessed by the computer system.
  • RAM random access memory
  • Memory references which can be satisfied by the cache are known as hits, and proceed at processor speed.
  • Those memory references which are not found in the cache are known as misses, and result in a cache miss time penalty in the form of a fetch of the corresponding cache block from main memory.
  • Caches are normally characterized by their capacity, block size and associativity. Capacity refers to the cache's overall memory size; block size is the size of the basic memory unit which is transferred between cache and main memory; and associativity refers to the number of locations in the cache which are potential destinations for any single main memory address.
  • Typical prior art cache optimization techniques for data processing applications include clustering, compression and coloring as set forth in Trishul M. Chilimbi, James R. Larus, Mark D. Hill, Improving Pointer-Based Codes Through Cache-Conscious Data Placement, Technical Report 98, University of Wisconsin-Madison, Computer Science Department, 1998, the contents of which are incorporated herein by reference.
  • Clustering techniques attempt to pack into a cache block data structure elements which are likely to be accessed successively.
  • Compression attempts to remove irrelevant data from the cache and thus increase cache block utilization by enabling more usable data structure elements to be placed in the cache blocks. Compression includes key compression, structure encodings such as pointer elimination, and fluff extraction.
  • Coloring techniques map contemporaneously-accessed data structure elements onto non-conflicting regions of the cache. Caches inherently have such conflicting regions because they have finite levels of associativity, which results in only a limited number of concurrently accessed data elements being able to be mapped to the same cache line without generating a conflict. Certain prior art has proposed improvements in cache performance using these techniques. For example, in Michael E. Wolf, et al., A Data Locality Optimizing Algorithm, SIGPLAN Notices, 26(6):30-44, 1991, cache reference locality is exploited in an effort to improve the performance of matrix multiplication.
  • Each type of database indexing structure known in the prior art has inherent characteristics which impact cache performance. For example, in the case of array binary searches, many accesses to elements of the sorted array may result in a cache miss. Although misses do not normally occur for the first references because of temporal locality over many searches, and also do not normally occur for the last references because of spatial locality, if many records from the array fit inside a single cache line, misses nevertheless occur for many of the intervening accesses when the array is substantially larger than the cache. In the worst case scenario, the number of cache misses is of the order of the number of key comparisons.
  • T-Trees have been proposed as an improved database index structure, but also exhibit cache behavior similar to that exhibited by binary searching.
  • T-trees are balanced binary trees having many elements in a node; these elements contain adjacent key values and are stored in order.
  • Although the objective of T-Trees is to balance the memory space overhead and the search time, T-Trees nevertheless do not optimize cache behavior or performance.
  • Although T-Trees may initially appear to be cache conscious in that they place a greater number of keys in each node, for most T-Tree nodes only one or two end keys are actually used for purposes of comparison. As a result, the utilization of each node is relatively low. For this reason, the number of key comparisons remains the same as in a binary search, and cache behavior and performance are not improved.
  • T-Trees are addressed in additional detail in Tobin J. Lehman and Michael J. Carey, Query Processing in Main Memory Database Management Systems, Proceedings of the ACM SIGMOD Conference, pages 239-250, 1986, and Tobin J. Lehman and Michael J. Carey, A Study of Index Structures for Main Memory Database Management Systems, Proceedings of the 12th VLDB Conference, pages 294-303, 1986, the contents of which are incorporated herein by reference.
  • B+-Trees and enhanced B+-Trees provide improvements in cache behavior and performance as compared to the cache performance associated with T-Trees.
  • each internal node has stored therein internal node keys and child pointers. Record pointers, however, are stored only in leaf nodes. Multiple keys can be used to search within a node. If each node is of a size that can fit in a cache line, a single cache load can provide data capable of satisfying more than one comparison. This results in improved utilization rates for each cache line.
  • enhanced B+-Trees can be used to utilize all of the locations in a B+-tree node, and trees can be rebuilt whenever batch updates arrive.
  • node size can be designed to be exactly the same size as the cache lines, and in addition, the nodes can be aligned. Nevertheless, such enhanced B+-Trees must store child pointers within each node, which for any given node size permits only half of the node space to be used to store keys. This results in less than optimal cache behavior and performance.
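The space argument above can be illustrated with a small calculation. This is only a sketch: the 64-byte node size and the 4-byte key and pointer sizes are assumptions chosen for illustration, and the helper names are hypothetical.

```python
def bplus_keys_per_node(node_bytes: int, key_bytes: int, ptr_bytes: int) -> int:
    """Keys that fit in a node that also stores one more child pointer than keys."""
    # k keys and k + 1 pointers must fit: k*key + (k + 1)*ptr <= node_bytes
    return (node_bytes - ptr_bytes) // (key_bytes + ptr_bytes)


def pointerless_keys_per_node(node_bytes: int, key_bytes: int) -> int:
    """Keys that fit when the node stores keys only (children found by arithmetic)."""
    return node_bytes // key_bytes


# Assumed sizes: 64-byte cache line, 4-byte keys, 4-byte child pointers.
print(bplus_keys_per_node(64, 4, 4))      # 7
print(pointerless_keys_per_node(64, 4))   # 16
```

Under these assumed sizes, roughly half of the node is spent on pointers, which matches the observation made above about enhanced B+-Trees.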
  • B+-Trees is also used to describe enhanced B+ trees.
  • Hash indexes can also greatly benefit from improved cache optimization.
  • One of the most common hashing methods is known as chained bucket hashing, which is discussed in additional detail in Donald Ervin Knuth, Sorting and Searching, vol. 3 of The Art of Computer Programming, Addison-Wesley, Reading, Massachusetts, USA, 1973, the contents of which are incorporated herein by reference.
  • Certain efforts have been made to improve cache performance in such hash indexing schemes. For example, in Goetz Graefe, et al., Hash Joins and Hash Teams in Microsoft SQL Server, Proceedings of the 24th VLDB Conference, pages 86-97, 1998, the contents of which are incorporated herein by reference, the cache line size was used as the bucket size.
  • hash indexes expedite fast searching only if the length of each bucket chain is relatively small.
  • Such an arrangement requires a relatively large directory size, which in turn requires a relatively large amount of main memory space.
  • skewed data can adversely affect hash index performance, unless the hash function is relatively sophisticated which in turn increases computation time.
  • hash indexes normally do not preserve any type of order, which in turn detracts from cache performance.
  • In order to provide ordered access using hash indexes, an ordered list must be maintained in addition to the hash indexes.
  • It is an object of the present invention to provide CSS-tree index structures for providing improved searching of sorted arrays. It is another object of the present invention to provide CSS-tree index structures which provide improved search and lookup performance as compared with conventional searching schemes.
  • a search tree index system and method for locating a particular key value stored in a sorted array of key values includes a computer memory for storing a search tree structure having a plurality of leaf nodes, wherein each leaf node contains multiple key values and the leaf nodes can reference the key values stored in the sorted array according to an offset value.
  • the search tree structure stored in computer memory also has a plurality of internal nodes, wherein each internal node contains multiple key values and has associated therewith multiple children nodes. The children nodes can be referenced by the internal node associated therewith according to another offset value.
  • the children nodes associated with each internal node can be either internal nodes or leaf nodes.
  • the system also includes a computer processor with a cache memory characterized by a cache size, a cache line size and an associativity level.
  • the computer processor is coupled to the computer memory to provide computational access to the sorted array of key values, the leaf nodes and the internal nodes.
  • the computer processor determines for the key value being searched the offset value necessary to reference the children nodes from the internal nodes, and the offset value necessary to reference the key value from the leaf nodes and to locate the key value in the sorted array of key values.
  • the quantity of internal nodes and the quantity of leaf nodes stored in the memory correspond to the characteristics of the cache memory.
  • Figure 1 is a block diagram illustrative of an exemplary computer system capable of implementing searching utilizing the CSS-tree index structures of the present invention.
  • Figure 2a is a diagram illustrating an exemplary CSS-tree index structure of the present invention.
  • Figure 2b is a diagram illustrating an exemplary CSS-tree index structure of the present invention.
  • Figure 3 is a schematic diagram illustrating an exemplary node structure of an exemplary CSS-tree index structure of the present invention.
  • Figure 4a is a graph representing exemplary data comparing the memory space (indirect) requirements for various search methods, including the full CSS-tree and level CSS-tree search methods of the present invention.
  • Figure 4b is a graph representing exemplary data comparing the memory space (direct) requirements of various search methods, including the full CSS-tree and level CSS-tree search methods of the present invention.
  • Figure 5 is a graph representing exemplary data of the time required to construct an exemplary full CSS-tree and an exemplary level CSS-tree as a function of the size of the sorted array to be searched.
  • Figure 6a is a graph representing exemplary data comparing the performance characteristics of various search methods, including the full and level CSS-tree search methods of the present invention, wherein the node size is 32 bytes.
  • Figure 6b is a graph representing exemplary data comparing the performance characteristics of various search methods, including the full and level CSS-tree search methods of the present invention, wherein the node size is 64 bytes.
  • Figure 7a is a graph representing exemplary data comparing the first-level cache miss performance characteristics of various search methods, including the full and level CSS-tree search methods of the present invention.
  • Figure 7b is a graph representing exemplary data comparing the second-level cache miss performance characteristics of various search methods, including the full and level CSS-tree search methods of the present invention.
  • Figure 8 is a graph representing exemplary data comparing the performance characteristics of various search methods, including the full and level CSS-tree search methods of the present invention, for an embodiment where the sorted array is fixed in size and the node size is varied.
  • Figure 9 is an exemplary process flow diagram for constructing search trees for the full and level CSS-tree search methods of the present invention.
  • Figure 10 is an exemplary process flow diagram for searching for a data record using the full and level CSS-tree search methods of the present invention.
  • the cache sensitive search tree (CSS-tree) indexing structures (also referred to herein simply as CSS-trees) of the present invention provide improved search and lookup performance as compared with conventional searching methods such as binary searching. This is accomplished by considering parameters such as reference locality and cache behavior, without using substantial additional amounts of memory to store the index structure.
  • a preferred embodiment of the CSS-trees of the present invention operates by storing a directory (index) structure on top of a sorted array of elements (preferably a sorted array of keys, which are well known in the art to represent fields by which data records can be searched). This directory structure preferably is stored in an array.
  • the CSS-tree index structure of the present invention can be used to efficiently search for and locate keys stored in the sorted array. Once the desired key has been located, any desired data record corresponding to that key can be easily located using any conventional method, and the search is complete.
  • Nodes in the CSS-tree index directory structure preferably have sizes selected to correspond to the cache-line size in the particular computer system utilizing the CSS-tree searching of the present invention.
  • internal child node pointers preferably are not stored in the CSS-trees of the present invention; rather, child nodes preferably are located by performing arithmetic operations based on array offsets.
  • the keys which reside in the sorted array of keys being searched are also preferably located by performing arithmetic operations based on array offsets. Because of these characteristics, the CSS-trees of the present invention are cache conscious, and therefore provide superior cache performance. Further, the CSS-trees of the present invention preferably are organized so that traversing the tree yields superior data reference locality, and therefore relatively few cache misses.
  • Fig. 1 depicts a typical prior art computer system which can perform searching operations on databases utilizing the CSS-tree indexing structures of the present invention.
  • the depicted computer system includes a CPU 100 connected to cache memory 102 via data bus 104.
  • Cache memory 102 in turn is connected to memory system 106 via data bus 108.
  • Memory system 106 may for example be a main memory system, or alternatively, another level of cache memory connected to and in combination with a main memory system.
  • the main memory portion of memory system 106 typically is connected to a mass data storage system 110 via data bus 112.
  • Normally included in a typical computer system but not shown in Fig. 1 are I/O (input/output) devices.
  • the databases normally are stored in mass data storage system 110.
  • such databases may also be stored either in whole or in part in the main memory system such as is contained in memory system 106 for example, as well as cache memory 102.
  • Although the CSS-trees of the present invention may be configured in many ways, the characteristics of the present invention can be illustrated using two exemplary preferred embodiments of the present invention, namely, full CSS-trees and level CSS-trees.
  • In full CSS-trees, the number of child nodes for each node equals m + 1, where m is defined to be the number of entries in each node within the CSS-tree indexing structure.
  • In level CSS-trees, the number of child nodes for each node equals m, and one key entry location in each node is not utilized.
  • full CSS-trees are not as deep as level CSS-trees.
  • level CSS-trees can be constructed with nodes requiring on average fewer comparisons than are required by nodes in full CSS-trees, in the special case where m is an integer power of 2.
  • Figs. 2a and 2b depict a representation of an exemplary full CSS-tree wherein m equals 4, and wherein there are therefore 5 child nodes for each node.
  • the depicted CSS-tree comprises internal nodes 200 - 207 (collectively, internal nodes 213) and leaf nodes 208 - 212 (collectively, leaf nodes 214).
  • the numbers appearing inside the boxes representing internal nodes 200 - 207 and leaf nodes 208 - 212 denote the number(s) assigned to the internal node(s) or leaf nodes represented by each box.
  • each internal node comprising group of internal nodes 213 contains key values representing the boundaries in the ranges of the key values contained in that internal node's child nodes.
  • the nodes of a CSS-tree such as that depicted in Fig. 2a can be stored in an array 215 as depicted in Fig. 2b.
  • Fig. 2b also depicts a representation of the sorted array 216 of key values to be searched.
  • the key values stored in leaf nodes 214 are mapped onto the key values stored in sorted array 216, as represented by mapping arrows 217.
  • This mapping is performed using a series of calculated offsets, rather than stored explicit pointers, as will be discussed in additional detail below.
  • the CSS-trees of the present invention provide better cache performance and therefore can be traversed more efficiently than trees used in conventional search methods, such as binary searching.
  • m is selected so that a node fits in a cache line. In such an arrangement, all searching within a single node can be performed with at most one cache miss. Accordingly, there occur no more than $\log_{m+1} n$ cache misses for any given search, as compared to as many as $\log_2 n$ cache misses if binary searching is being used.
  • If a node occupies two cache lines, on average only one cache miss will occur in half of the searches within a single node, whereas two misses will occur in the other half.
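To give a feel for the two bounds on cache misses, the sketch below counts tree levels with integer arithmetic (rounding the logarithm up). The array size and the choice m = 16 (sixteen 4-byte keys in a 64-byte cache line) are assumptions for illustration, and `levels` is a hypothetical helper name.

```python
def levels(n: int, fanout: int) -> int:
    """Smallest h such that fanout**h >= n, i.e. the logarithm rounded up."""
    h, reach = 0, 1
    while reach < n:
        reach *= fanout
        h += 1
    return h


n = 1_000_000   # assumed number of keys in the sorted array
m = 16          # assumed keys per node (one 64-byte line of 4-byte keys)

print(levels(n, 2))       # 20: worst-case bound for binary search
print(levels(n, m + 1))   # 5: bound for a tree with m + 1 children per node
```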
  • the code implementing the traversal within a single node preferably is hard coded, so that the calculations necessary to locate the next node can be performed more efficiently.
  • a specific example is presented.
  • the nodes and keys therein are each numbered from top to bottom and left to right, beginning with number 0.
  • For a node numbered b, the children of that node are numbered from b(m + 1) + 1 to b(m + 1) + (m + 1).
  • In the CSS-tree directory depicted in Fig. 2a, there are m keys per node as discussed above. Therefore, any arbitrary key number i stored in the CSS-tree directory array 215 belongs to node number $\lfloor i/m \rfloor$.
  • Accordingly, the offsets of the lowest-numbered (i.e., the first) key for each of the child nodes of node b within the CSS-tree directory array 215 will be $(b(m+1)+1) \cdot m$ through $(b(m+1)+(m+1)) \cdot m$.
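The pointer-free numbering can be sketched in a few lines. The sketch assumes only what is stated above: nodes and keys are numbered from 0, each node holds m keys, and each node has m + 1 children. The function names are hypothetical.

```python
def node_of_key(i: int, m: int) -> int:
    """Node number that holds key number i, with m keys per node."""
    return i // m


def children(b: int, m: int) -> range:
    """Node numbers of the m + 1 children of node b."""
    first = b * (m + 1) + 1
    return range(first, first + m + 1)


def first_key_offset(node: int, m: int) -> int:
    """Offset of the first key of a node in the directory array."""
    return node * m


m = 4
print(list(children(0, m)))     # [1, 2, 3, 4, 5]
print(list(children(15, m)))    # [76, 77, 78, 79, 80]
print(first_key_offset(16, m))  # 64
```

With m = 4 the children of node 15 are nodes 76 to 80, which agrees with the highest leaf node number in the depicted example.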
  • leaf nodes 214 are stored in a contiguous array 215 in key order.
  • This approach conflicts with the general approach called for by the natural order of a CSS-tree, which generally stores nodes from left to right within each level.
  • Use of this general approach would undesirably split the array 215, and would place the right half of the array (which appears at a higher tree level than does the left half of the array) before the left half of the array.
  • In the natural tree order, leaf nodes 208 - 210 (i.e., the leaf nodes having node numbers 16 - 30) would be stored before leaf nodes 211 - 212 (i.e., the leaf nodes having node numbers 31 - 80).
  • In sorted array 216, however, the keys of leaf nodes 211 - 212 (i.e., the leaf nodes having node numbers 31 - 80) are stored before the keys of leaf nodes 208 - 210 (i.e., the leaf nodes having node numbers 16 - 30). This is because it is desirable to maintain sorted array 216 in contiguous key order. Therefore, in order to determine the correct leaf nodes 214 when performing a search using a CSS-tree, the general CSS-tree search method must be modified as discussed below.
  • The two portions of leaf nodes 214 (i.e., one portion containing leaf nodes 208 - 210 and the other portion containing leaf nodes 211 - 212) are each mapped onto the corresponding portion of sorted array 216.
  • This mapping preferably is performed by first determining y, which is defined to be the number of the key which represents the boundary point between these two portions of leaf nodes 214.
  • This value y also represents the value of the offset corresponding to the lowest-numbered key in the deepest leaf node level in the CSS-tree.
  • This leaf node is defined herein as "Mark," and in the particular example depicted in Figs. 2a and 2b, this is the node having node number 31 in group of nodes 211.
  • For an arbitrary key in leaf nodes 214 having offset x, the offset x is first compared with y to determine which of the two portions of leaf nodes 214 contains the arbitrary key, and therefore which portion of the sorted array 216 should be searched. For example, if x > y, the desired key element can be located at position x - y from the beginning of sorted array 216. If, on the other hand, x < y, the desired key element can be located at position y - x from the end of sorted array 216. For example, in the depicted example, the first key in the leaf node having leaf node number 30 can be located at the first key in the node having node number 64 in sorted array 216.
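The two-portion mapping can be sketched as follows. The sketch works in key offsets, treats the boundary case x = y as belonging to the first portion (an assumption, since the text only states the strict cases), and uses the figures of the depicted example (m = 4, 65 leaf nodes, hence n = 260 and y = 31 * 4 = 124) as illustrative values. The function name is hypothetical.

```python
def leaf_offset_to_array_index(x: int, y: int, n: int) -> int:
    """Map a leaf key offset x to an index in the sorted array of n keys.

    y is the offset of the first key of the deepest leaf level ("Mark").
    """
    if x >= y:
        return x - y          # deepest level: counted from the array start
    return n - (y - x)        # shallower leaves: counted from the array end


m, n, y = 4, 260, 31 * 4
print(leaf_offset_to_array_index(31 * m, y, n))  # 0   (node 31 starts the array)
print(leaf_offset_to_array_index(30 * m, y, n))  # 256 (node 30 maps to array node 64)
print(leaf_offset_to_array_index(16 * m, y, n))  # 200 (node 16 follows nodes 31-80)
```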
  • These techniques are also applicable to sorted arrays 216 containing elements having sizes different from the key size. This is because the offsets into the array 215 of leaf nodes 214 are independent of the record size within the sorted array 216.
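  • If n = N * m is defined to be the number of elements in sorted array 216, where N is the total number of leaf nodes 214, then the total number of internal nodes in a full CSS-tree is equal to $\frac{(m+1)^k - 1}{m} - \left\lfloor \frac{(m+1)^k - N}{m} \right\rfloor$. The first leaf node in the bottom level is defined as node number $\frac{(m+1)^k - 1}{m}$, where $k = \lceil \log_{m+1} N \rceil$. For m = 4 and N = 65, k = 3, which gives 16 internal nodes and node number 31 as the first leaf node in the bottom level, consistent with the depicted example.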
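The node counts can be checked numerically. The sketch below assumes the closed forms from the published CSS-tree analysis, namely $((m+1)^k - 1)/m - \lfloor ((m+1)^k - N)/m \rfloor$ internal nodes and first bottom-level leaf number $((m+1)^k - 1)/m$ with $k = \lceil \log_{m+1} N \rceil$; the function name is hypothetical.

```python
def full_css_shape(N: int, m: int) -> tuple[int, int]:
    """Return (number of internal nodes, node number of the first bottom-level leaf)."""
    cap = 1                    # cap == (m + 1) ** k once the loop ends
    while cap < N:
        cap *= m + 1
    mark_node = (cap - 1) // m
    internal = mark_node - (cap - N) // m
    return internal, mark_node


print(full_css_shape(65, 4))   # (16, 31): internal nodes 0-15, "Mark" is node 31
print(full_css_shape(25, 4))   # (6, 6): a complete two-level tree
```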
  • An exemplary method of constructing a full CSS-tree, such as the exemplary full CSS-tree depicted in Fig. 2a, from a sorted array 216 is now presented.
  • the sorted array 216 is first divided logically into two portions as described above, and a mapping is then established between the leaf nodes 214 and the elements in the sorted array 216.
  • these two portions of the sorted array 216 are denoted by Roman numerals I and II, respectively.
  • the highest-numbered (i.e., the last) internal node comprising group of internal nodes 213 is then determined as described above.
  • each key entry in the node is then filled in with the highest key value contained in that node's subtree.
  • the value of the highest-numbered key contained in a node's subtree can be determined for each key entry by following that node's link in its rightmost branch down the tree levels until a leaf node in group of leaf nodes 214 is reached. Once the leaf node in group of leaf nodes 214 is reached, the highest-numbered key value contained in that leaf node is used as this value. This process is then repeated for each key location for all of the remaining internal nodes 213, preferably in descending node number order.
  • Input: the sorted array (a); the number of elements in the array (n).
  • Output: the array storing the internal nodes of a full CSS-tree (b); the last internal node number (LNode); the index of the first entry of the leftmost leaf node in the bottom level in the CSS directory array (MARK).
  • certain internal nodes 213, namely those which are ancestors of the rightmost (i.e., last) leaf node at the deepest level of the CSS-tree, may not always contain a full complement of keys. This may occur for example as a result of the particular size of the sorted array 216 being searched. In this case, any so-called dangling keys preferably are filled with the last element in the first portion of sorted array 216. This may result in certain internal nodes 213 containing duplicate keys, however.
  • the search method within each node preferably is adapted so that the leftmost key will always be located. This ensures that the leaf nodes 214 in the deepest level of the CSS-tree will never be reached using an index which is out of the range of the first portion of the sorted array 216.
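The construction steps described above can be sketched as follows. This is a minimal illustration rather than the patent's own listing: it assumes n is an exact multiple of m, assumes the node-count closed forms with $k = \lceil \log_{m+1} N \rceil$, and uses the hypothetical function name `build_full_css_tree`. The names a, b, LNode and MARK follow the input and output description above.

```python
def build_full_css_tree(a, m):
    """Return (b, lnode, mark) for a sorted list a with m keys per node."""
    n = len(a)
    if n == 0 or n % m != 0:
        raise ValueError("this sketch assumes n = N * m with N >= 1")
    N = n // m                        # number of leaf nodes
    cap = 1                           # becomes (m + 1) ** k
    while cap < N:
        cap *= m + 1
    mark_node = (cap - 1) // m        # first leaf node in the bottom level
    internal = mark_node - (cap - N) // m
    lnode = internal - 1              # last internal node number (LNode)
    mark = mark_node * m              # key offset of that leaf's first entry (MARK)
    last_node = internal + N - 1      # highest node number that exists
    # Last element of the first portion of a (the bottom-level leaves).
    last_bottom = a[(last_node - mark_node + 1) * m - 1]

    b = [None] * (internal * m)
    for d in range(lnode, -1, -1):    # descending node number order
        for j in range(m):
            c = d * (m + 1) + 1 + j   # child whose largest key fills entry j
            while c <= lnode:         # follow the rightmost branch downwards
                c = c * (m + 1) + (m + 1)
            if c > last_node:         # dangling key: no such leaf exists
                b[d * m + j] = last_bottom
            else:
                x = c * m + (m - 1)   # offset of the leaf's last key
                idx = x - mark if x >= mark else x - mark + n
                b[d * m + j] = a[idx]
    return b, lnode, mark


b, lnode, mark = build_full_css_tree([10, 20, 30, 40, 50, 60, 70, 80], 2)
print(b, lnode, mark)   # [40, 60, 20, 40] 1 8
```

In this small example the first root entry is a dangling key: the rightmost descendant of node 1 would be node 6, which does not exist, so the entry is filled with 40, the last element of the first portion of the sorted array.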
  • a search may be performed to locate any key value stored in sorted array 216.
  • the search preferably begins with the root node 200, which in the exemplary embodiment depicted in Fig. 2a is designated as node number 0.
  • a conventional binary search preferably is performed to determine to which child node to branch next. This process preferably is continued until a leaf node 214 is reached. This leaf node 214 is then preferably mapped into the appropriate node in the sorted array 216, where a conventional binary search is preferably performed to locate the desired key value within the node being searched.
  • Input: the sorted array (a); the array consisting of the internal nodes of a full CSS-tree (b); the number of elements in the sorted array (n); the last internal node number (LNode); the index of the first entry of the leftmost leaf node in the bottom level in the CSS directory array (MARK).
  • Output: the index of the matching key in array a, if the key is found;
  • d be the node number of the ith child of d.
  • the binary searches performed within a particular node preferably are implemented using so-called hard-coded "if-else" statements.
  • it preferably is determined whether the key values stored in the left portion of that node are greater than or equal to the value of the key value being searched.
  • the search within each internal node 213 is ceased when the first key is located which has a value less than the value of the key value being searched.
  • the rightmost branch corresponding to this located key is then followed. If such a key value cannot be located, the leftmost branch is then followed.
  • this preferred approach ensures that if duplicate key values are located in a single node, the leftmost key value among these duplicates is located.
  • this method can be used to locate the leftmost key value among all duplicates in a single node.
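The search procedure can be sketched as follows. Two substitutions are made for brevity and should be read as assumptions: the hard-coded "if-else" comparisons are replaced by Python's `bisect_left`, which likewise selects the leftmost key that is greater than or equal to the search key, and a return value of -1 is used when the key is absent. The function name `css_search` is hypothetical, and the directory `b` in the usage example is written out by hand for a sorted array of eight keys with m = 2.

```python
from bisect import bisect_left


def css_search(a, b, m, lnode, mark, key):
    """Return the index of the leftmost occurrence of key in a, or -1."""
    n = len(a)
    d = 0                                   # start at the root node
    while d <= lnode:                       # d is still an internal node
        lo = d * m
        branch = bisect_left(b, key, lo, lo + m) - lo
        d = d * (m + 1) + 1 + branch        # child number by arithmetic
    x = d * m                               # offset of the leaf's first key
    start = x - mark if x >= mark else x - mark + n
    pos = bisect_left(a, key, start, start + m)
    if pos < start + m and a[pos] == key:
        return pos
    return -1


a = [10, 20, 30, 40, 50, 60, 70, 80]
b = [40, 60, 20, 40]        # node 0 keys, then node 1 keys (m = 2)
print(css_search(a, b, 2, 1, 8, 30))   # 2
print(css_search(a, b, 2, 1, 8, 70))   # 6
print(css_search(a, b, 2, 1, 8, 45))   # -1
```

Because the leftmost qualifying key is always chosen, a search never descends to a leaf number that does not exist, even though the root in this example contains a key filled in for a dangling entry.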
  • level CSS-trees are similar to full CSS-trees.
  • m is defined to be the number of entries in each node within the CSS-tree indexing structure.
  • the value of m is also known as the branch factor for the CSS-tree.
  • level CSS-trees are deeper than full CSS-trees.
  • level CSS-trees can be constructed with nodes requiring on average fewer comparisons than are required by nodes in full CSS-trees, in the special case where m is an integer power of 2.
  • a full CSS-tree having m entries per node preferably will contain exactly m keys per node. Thus, all of the entries in each node are fully utilized in a full CSS-tree.
  • In contrast, in an exemplary level CSS-tree, one of the eight entries per node is not utilized, and there are therefore only seven, instead of eight, key entries per node.
  • the exemplary level CSS-tree can be distinguished from the exemplary full CSS-tree configuration depicted in Fig. 3, where all key entries within a node contain a key value.
  • each branch in the exemplary level CSS-tree will advantageously require only three comparisons.
  • a level CSS-tree preferably utilizes only m - 1 entries per node, and therefore has a branching factor of m.
  • a typical level CSS-tree will thus be deeper than a typical full CSS-tree having the same node size, because the branching factor for a level CSS-tree is m rather than m + 1 as for a full CSS-tree. In an exemplary level CSS-tree, however, fewer comparisons must be performed for any individual node. If N is defined to be the number of nodes required to contain all of the elements in the sorted array 216, an exemplary level CSS-tree has $\log_m N$ tree levels, whereas an exemplary full CSS-tree has $\log_{m+1} N$ tree levels. The number of comparisons which must be performed for each individual node equals t for the level CSS-tree, where $t = \log_2 m$.
  • For the exemplary level CSS-tree, the total number of comparisons is therefore equal to $\log_m N \cdot t = \log_2 N$. For the exemplary full CSS-tree this value is equal to $\log_{m+1} N \cdot t \cdot \ldots$
  • level CSS-tree typically requires fewer comparisons than does a typical full CSS-tree.
  • level CSS-trees may typically require $\log_m N$ cache accesses and $\log_m N$ node traversals to complete a search, as compared to only $\log_{m+1} N$ cache accesses and $\log_{m+1} N$ node traversals for full CSS-trees.
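The trade-off between the two variants can be made concrete with a small calculation. The sketch rounds each logarithm up to a whole number of levels, assumes m is a power of 2 so that a level CSS-tree node needs $\log_2 m$ comparisons, and uses illustrative values N = 600 and m = 4. Per-node comparison counts for the full CSS-tree are not modelled. The function names are hypothetical.

```python
def tree_levels(N: int, fanout: int) -> int:
    """Smallest number of levels h such that fanout**h >= N."""
    h, reach = 0, 1
    while reach < N:
        reach *= fanout
        h += 1
    return h


def compare_variants(N: int, m: int) -> dict:
    """Node traversals for both variants and total comparisons for the level variant."""
    t = m.bit_length() - 1            # log2(m) when m is a power of 2
    if 1 << t != m:
        raise ValueError("this sketch assumes m is a power of 2")
    level = tree_levels(N, m)         # level CSS-tree: branching factor m
    full = tree_levels(N, m + 1)      # full CSS-tree: branching factor m + 1
    return {
        "level_traversals": level,
        "level_comparisons": level * t,
        "full_traversals": full,
    }


print(compare_variants(600, 4))
# {'level_traversals': 5, 'level_comparisons': 10, 'full_traversals': 4}
```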
  • the optimal choice between full CSS-trees and level CSS-trees depends on the relative speed and efficiency of comparison operations, node traversals and cache accesses.
  • level CSS-trees utilize most of the data stored in each cache line.
  • FIG. 9 depicts an exemplary flow diagram for constructing an exemplary level or full CSS-tree of the present invention.
  • a sorted array 216 containing n key values, and the number of entries per node, which is denoted as m. From these parameters, the total number of internal nodes 213 is then determined.
  • "Mark" is then determined. As discussed above, "Mark" is defined as the number of the leaf node in the group of leaf nodes 214 which contains the lowest-numbered key in the deepest leaf node level in the CSS-tree.
  • the right-most branch path is followed until a leaf node 214 is reached. Then, as shown in block 903, for that leaf node the mapping from that leaf node number to the corresponding node in the sorted array 216 of key values is performed, by determining the appropriate offset necessary to locate the corresponding node in the sorted array 216 of key values. This offset is determined by first comparing the node number of that leaf node with "Mark." If the leaf node number is greater than "Mark," the difference between the leaf node number and "Mark" serves as the offset from the beginning of the sorted array 216 of key values.
  • Fig. 10 depicts an exemplary flow diagram for searching for a data record having a desired key value, utilizing an exemplary level or full CSS-tree of the present invention. As shown in block 1000, there is first provided a particular key value to be searched.
  • a search is performed across the internal nodes in that tree level. This is accomplished by performing a binary search within the internal nodes on that level. As shown in block 1001, this process is continued until it is determined which node at that tree level contains the desired key value within its stored range of key values.
  • the offset necessary to locate the corresponding node in the sorted array 216 of key values to which the leaf node number of that leaf node is mapped is then determined. This offset is determined by first comparing the node number of that leaf node with "Mark." If the leaf node number is greater than "Mark," the difference between the leaf node number and "Mark" is the offset from the beginning of the sorted array 216 of key values.
  • the difference between "Mark" and the leaf node number is the offset from the end of the sorted array 216 of key values.
  • the offset determined above is utilized to obtain the key value from the sorted array 216. From this located key value, the data record corresponding to the located key value can be easily located using any conventional method. If on the other hand the searched-for key value does not exist in the sorted array 216 of key values, this denotes that there is no data record having a key value matching the searched-for key value. In such an event, a message to this effect typically is provided.
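The offset rule and the final lookup described above can be sketched as follows. This is a minimal illustration, not the patent's implementation, and it rests on several assumptions: node numbers count whole nodes of `keys_per_node` keys, so a node-number difference is multiplied by `keys_per_node` to obtain a key offset; a leaf numbered exactly "Mark" is treated as starting at the beginning of the array (consistent with "Mark" holding the lowest-numbered key); and the names `leaf_start_index` and `find_in_leaf` are hypothetical.

```python
from bisect import bisect_left


def leaf_start_index(leaf_node, mark, keys_per_node, n):
    """Map a leaf node number to the index of its first key in the sorted array."""
    if leaf_node >= mark:
        # Deepest-level leaves: offset measured from the beginning of the array.
        return (leaf_node - mark) * keys_per_node
    # Shallower leaves: offset measured back from the end of the array.
    return n - (mark - leaf_node) * keys_per_node


def find_in_leaf(sorted_keys, leaf_node, mark, keys_per_node, key):
    """Return the index of key within the given leaf's slice, or None if absent."""
    n = len(sorted_keys)
    start = leaf_start_index(leaf_node, mark, keys_per_node, n)
    end = min(start + keys_per_node, n)
    i = bisect_left(sorted_keys, key, start, end)
    if i < end and sorted_keys[i] == key:
        return i
    return None   # caller reports that no record has this key
```

As a toy usage, take 20 keys, four keys per node and a hypothetical "Mark" of 7: leaves 7, 8 and 9 map to array indices 0, 4 and 8, while leaves 5 and 6 map to indices 12 and 16, counted back from the end.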
  • m denotes the number of keys per node
  • R denotes the amount of main memory space consumed by a record identifier
  • K denotes the amount of main memory space consumed by a key
  • P denotes the amount of main memory space consumed by a child pointer
  • n denotes the number of individual records being indexed in sorted array 216
  • h denotes a hashing factor, which may typically be 1.2, thereby indicating that a hash table typically is approximately 20% larger than the raw data contained in the hash table
  • c denotes the size, in number of bytes, of a cache line
  • s denotes the size, measured in number of cache lines, of a node in a T-tree, CSS-tree or enhanced B+-tree.
  • Table 1a shows the exemplary branching factor, number of tree levels, number of comparisons per internal node, and number of comparisons per leaf node for each searching method shown.
  • enhanced B+-trees are characterized by a smaller branching factor than are CSS-trees; this is because B+-trees store child pointers expressly.
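The effect of storing child pointers on the branching factor can be illustrated numerically. The layout assumed here is not quoted from Table 1a: a B+-tree node is taken to hold k keys and k + 1 child pointers, a full CSS-tree node is taken to be filled entirely with keys, and a level CSS-tree node leaves one key slot unused. The function names are hypothetical; `node_bytes` corresponds to s × c, `key_bytes` to K and `pointer_bytes` to P.

```python
def css_full_branching(node_bytes, key_bytes):
    """Full CSS-tree: the node holds m keys and has m + 1 children."""
    return node_bytes // key_bytes + 1


def css_level_branching(node_bytes, key_bytes):
    """Level CSS-tree: m - 1 keys are used, giving m children."""
    return node_bytes // key_bytes


def bplus_branching(node_bytes, key_bytes, pointer_bytes):
    """B+-tree (assumed layout): k keys plus k + 1 explicit child pointers."""
    keys = (node_bytes - pointer_bytes) // (key_bytes + pointer_bytes)
    return keys + 1
```

With a 64-byte node, 4-byte keys and 4-byte pointers, this sketch gives branching factors of 17 (full CSS-tree), 16 (level CSS-tree) and 8 (B+-tree), which is the roughly two-to-one gap one would expect when half of each node is spent on pointers.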
  • the total cost in time of each searching method has three primary components: namely, the comparison cost, the cost associated with moving across the different levels of the tree, and the cache miss cost.
  • Table 1b depicts an exemplary comparison of these three costs for each search method shown.
  • D denotes the cost of de-referencing a pointer; and
  • $A_b$, $A_{fcss}$, $A_{lcss}$ denote the cost in time of computing a child address for a binary search, full CSS-tree search and level CSS-tree search, respectively.
  • the respective exemplary comparison costs are relatively similar for all of the search methods shown, including searches utilizing full CSS-trees.
  • the number of comparisons associated with searching using full CSS-trees typically is slightly higher than for searching using level CSS-trees.
  • Certain search methods determine child nodes by following pointers, whereas others do so using arithmetic calculations.
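Locating children by arithmetic rather than by stored pointers can be sketched with the standard implicit layout for a complete k-ary tree held in an array. This is an assumption made for illustration: nodes are numbered breadth-first starting from 0, and the patent's exact numbering scheme is not reproduced in this passage. `child_nodes` and `parent_node` are hypothetical names.

```python
def child_nodes(node, branching):
    """Node numbers of the children of `node` in a breadth-first implicit tree."""
    first = node * branching + 1
    return list(range(first, first + branching))


def parent_node(child, branching):
    """Inverse mapping: the node number of the parent of `child`."""
    return (child - 1) // branching
```

Because each child number is a pure function of the parent number, no pointer needs to be stored in the node, and the whole node can be devoted to keys.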
  • the relative comparison costs are a function of the complexity of the computations necessary to perform the comparison, and the efficiency of the hardware used. For example, although $A_b$ may be less than D, $A_{fcss}$ likely will be greater than D. Nevertheless, searching methods having higher branching factors also utilize relatively fewer tree levels, and therefore normally exhibit relatively lower costs of moving across tree levels. An overly large node size will increase the cache miss cost, however, which probably will constitute the overriding performance factor because each cache miss typically can be an order of magnitude more expensive than the computation of a child address.
  • the number of cache misses is minimized when the node size equals the cache line size.
  • the quantity of cache misses for binary and T-tree searching is independent of m.
  • searching performed in connection with enhanced B+-trees and CSS-trees typically generates only a fraction of the cache misses generated by a binary search, with CSS-trees typically exhibiting better performance than enhanced B+-trees in this regard.
  • CSS-trees typically exhibit the lowest number of cache misses of any of the searching methods shown.
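The gap in cache behavior between binary search and a CSS-tree walk can be approximated with a small counting model. This is a sketch under simplifying assumptions, not a measurement: the array is implicit (the key at index i is taken to be i), every distinct cache line touched is counted once, each tree node is assumed to occupy exactly one aligned cache line, and both function names are hypothetical.

```python
def binary_search_lines(n, target, keys_per_line):
    """Count distinct cache lines probed by a lower-bound binary search."""
    lines = set()
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        lines.add(mid // keys_per_line)   # cache line holding keys[mid]
        if mid < target:                  # keys[mid] == mid in this model
            lo = mid + 1
        else:
            hi = mid
    return len(lines)


def css_node_visits(n, keys_per_node, branching):
    """Nodes visited from root to leaf when each node fills one cache line."""
    leaves = -(-n // keys_per_node)       # ceiling division
    visits, reach = 1, 1                  # start with the leaf node itself
    while reach < leaves:
        reach *= branching                # one more internal level
        visits += 1
    return visits
```

For 2^20 four-byte keys and 64-byte lines (16 keys per line), the binary search in this model touches between 15 and 21 distinct lines for any target, while a full CSS-tree walk with branching factor 17 visits 5 nodes. Whether each touched line is an actual miss depends on what is already resident in the cache.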
  • the performance associated with searching performed using the exemplary CSS-trees normally should be significantly better than that associated with binary searching, T-tree searching, and/or enhanced B+-tree searching.
  • the highest level CSS-tree nodes will remain resident in cache, thereby improving performance.
  • because CSS-trees typically have fewer tree levels than do trees associated with the other types of search methods, CSS-trees will also benefit the most from a warm cache startup, wherein the cache begins in a non-empty state.
  • Table 2 below provides a summary of the respective exemplary memory space requirements for each of the exemplary searching methods shown in Tables 1a and 1b, as well as hash table searching.
  • the column entitled "Space (indirect)" denotes the exemplary memory space requirements of the exemplary search methods shown, assuming that the structure being indexed constitutes a collection of record identifiers which can be rearranged if necessary. That is to say, the expressions denoted in this column assume that it is acceptable for the particular search method to store the record identifiers internally within the search tree structure, as opposed to leaving the record identifiers in the form of an unaltered contiguous list.
  • the column in Table 2 entitled "Space (direct)" denotes the exemplary main memory space requirements of the search methods shown, assuming that the structure being indexed constitutes a collection of records that cannot be rearranged in such a way. That is to say, the expressions denoted in this column assume that it is not acceptable for the particular search method to store the records internally within the search tree structure. Thus for the T-tree and hash table searching methods, the amount of memory space consumed by the record identifiers is included in the expressions appearing in this column of Table 2, because the other exemplary search methods shown do not require these record identifiers in such a scenario.
  • the column in Table 2 appearing immediately to the right of the column entitled "Space (direct)" denotes typical exemplary memory space requirement values for this arrangement.
  • FIGs. 4a and 4b depict a comparison of typical exemplary memory space requirements as a function of sorted array size n, for the exemplary search methods shown in Table 2.
  • Fig. 4a depicts these memory space requirements corresponding to the above-discussed Table 2 column entitled "Space (indirect)."
  • Fig. 4b depicts these memory space requirements corresponding to the above-discussed Table 2 column entitled "Space (direct)."
  • the exemplary hash tables and T-trees consume substantially more memory space than do the exemplary CSS-trees.
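The modest space overhead of a CSS-tree directory can be estimated from first principles. The sketch below does not reproduce the expressions of Table 2. It assumes a full CSS-tree whose nodes hold `keys_per_node` keys, packs each directory level independently, and counts only the internal (directory) nodes, since the leaves are the sorted array itself. `css_directory_bytes` is a hypothetical name.

```python
def css_directory_bytes(n_keys, keys_per_node, key_bytes):
    """Back-of-the-envelope size of the directory above a sorted array."""
    node_bytes = keys_per_node * key_bytes
    branching = keys_per_node + 1                 # full CSS-tree fan-out
    nodes_below = -(-n_keys // keys_per_node)     # leaf nodes (ceiling division)
    internal = 0
    while nodes_below > 1:
        nodes_below = -(-nodes_below // branching)  # nodes on the next level up
        internal += nodes_below
    return internal * node_bytes
```

For 1,000,000 four-byte keys with 16 keys per node, this estimate comes to 250,112 bytes of directory on top of a 4,000,000-byte sorted array, or roughly 6% additional space.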
  • Exemplary performance comparisons between database searching performed using exemplary preferred embodiments of the CSS-trees of the present invention and other types of searching methods, such as those shown in Table 2, are presented in Figs. 5 - 8, below.
  • These exemplary preferred embodiments utilize two exemplary preferred modern platforms, and performance is considered as a function of the time required to perform a large number of successful random lookups to the particular index and array of values being considered.
  • the two exemplary modern platforms considered are a Sun Microsystems Ultra Sparc II machine (preferably operating at 296 MHz and having 1 GB of RAM) and a Pentium II personal computer (preferably operating at 333 MHz and having 128 MB of RAM).
  • the exemplary Ultra Sparc II machine preferably utilizes a 16 KB on-chip cache having a 32 byte cache line size and an associativity of 1, as well as a 1 MB secondary level cache having a 64 byte cache line size and an associativity of 1.
  • the exemplary Pentium II machine preferably utilizes a 16 KB on-chip cache having a 32 byte cache line size and an associativity of 4, as well as a 512 KB secondary level cache having a 32 byte cache line size and an associativity of 4. Both exemplary machines preferably utilize an exemplary Sun Microsystems Solaris 2.6 operating system.
  • the following searching methods were implemented using the preferred C++ programming language: exemplary full CSS-tree and level CSS-tree searching methods of the present invention, as well as conventional chained bucket hashing searching, array binary searching, tree binary searching, T-tree searching, enhanced B+-tree searching, and well-known interpolation searching set forth in W.W. Peterson, IBM J. Research & Development, No. 1, pp. 131-132, 1957, the contents of which are incorporated herein by reference. All keys utilized preferably are selected randomly from exemplary integers ranging between 0 and 1,000,000. Each key preferably consumes 4 bytes of memory. All lookup keys preferably are generated in advance to prevent the key generation time from impacting the recorded performance results. An exemplary total of 100,000 searches were performed on randomly selected matching key values.
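The measurement procedure described above can be outlined in a few lines. This is only a sketch of the methodology in Python rather than the C++ used for the reported results: it uses array binary search (via the standard `bisect` module) as the method under test, draws distinct keys (an assumption), and `run_lookups` and its parameters are hypothetical names. The key point it illustrates is that the lookup keys are generated before the timer starts.

```python
import random
import time
from bisect import bisect_left


def run_lookups(num_keys=10_000, num_lookups=100_000, seed=0):
    """Time successful random lookups; return (hits, elapsed_seconds)."""
    rng = random.Random(seed)
    # Keys drawn at random from the integers 0..1,000,000, then sorted.
    keys = sorted(rng.sample(range(0, 1_000_001), num_keys))
    # Lookup keys are generated in advance so generation time is not measured.
    lookups = [rng.choice(keys) for _ in range(num_lookups)]

    hits = 0
    start = time.perf_counter()
    for key in lookups:
        i = bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            hits += 1
    elapsed = time.perf_counter() - start
    return hits, elapsed
```

Since every lookup key is drawn from the indexed keys, the hit count should equal the number of lookups; the elapsed time is machine dependent and is not comparable to the figures reported for the two platforms.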
  • varying node sizes are preferably implemented by allocating a large block of memory to reduce allocation time.
  • logical shifts are preferably used in place of multiplication and division operations.
  • leaf node searches also are preferably hardcoded.
  • equality testing is preferably performed sequentially on each key.
  • the sorted array being searched preferably is properly aligned according to the cache line size being used.
  • all of the tree nodes preferably are allocated at the same time, and the starting addresses thereof properly aligned.
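Two of the optimizations listed above, logical shifts in place of multiplication and division, and hard-coded node searches, can be sketched as follows. Both functions are illustrative assumptions rather than the patent's code: the shift version assumes a branching factor that is a power of two (2^t, as in a level CSS-tree with a suitable m) and a breadth-first node numbering starting at 0, and the hard-coded search assumes a node of exactly seven keys.

```python
from bisect import bisect_left


def first_child_shift(node, t):
    """First child of `node` when the branching factor is 2**t (shift, no multiply)."""
    return (node << t) + 1


def parent_shift(child, t):
    """Parent of `child` when the branching factor is 2**t (shift, no divide)."""
    return (child - 1) >> t


def branch_of_7(keys, probe):
    """Hard-coded search of a 7-key node: always exactly three comparisons.

    Returns the number of keys strictly less than probe (the branch index).
    """
    if probe <= keys[3]:
        if probe <= keys[1]:
            return 0 if probe <= keys[0] else 1
        return 2 if probe <= keys[2] else 3
    if probe <= keys[5]:
        return 4 if probe <= keys[4] else 5
    return 6 if probe <= keys[6] else 7


def check_branch_of_7():
    """Compare the hard-coded search against the library binary search."""
    keys = [1, 3, 5, 7, 9, 11, 13]
    return all(branch_of_7(keys, p) == bisect_left(keys, p) for p in range(-1, 16))
```

Unrolling the search in this way removes the loop and index arithmetic of a generic binary search, at the cost of tying the code to one node size.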
  • Fig. 5 depicts the time required to build both an exemplary full CSS-tree and an exemplary level CSS-tree as a function of the size of the sorted array 216 to be searched.
  • the building time for each type of exemplary CSS-tree typically increases linearly as a function of the sorted array size.
  • the size of the sorted array preferably is varied, whereas the node size preferably is fixed to one of two sizes corresponding to the cache line size in each of the two levels of cache in the Ultra Sparc II machine (i.e., preferably 32 bytes and 64 bytes).
  • Figs. 6a and 6b depict exemplary search performance results for the exemplary preferred embodiments implemented on the preferred Ultra Sparc II machine.
  • Fig. 6a corresponds to the specific example wherein the node size is 32 bytes
  • Fig. 6b corresponds to the specific example wherein the node size is 64 bytes.
  • the exemplary CSS-tree search methods of the present invention perform better than the conventional searching methods, with the exception of hashing.
  • Figs. 7a and 7b depict the number of first and second level cache misses, respectively, for an exemplary cache preferably configured to simulate the exemplary preferred cache of the Ultra Sparc II machine (wherein the node size preferably is configured to be 64 bytes). As depicted in Figs.
  • the size of the sorted array preferably is fixed, whereas the node size preferably is varied.
  • Fig. 8 depicts the search performance results for this example run on the preferred Ultra Sparc II machine. As depicted in Fig. 8, in this example the smallest preferred node size for the CSS-trees of the present invention is 16 integers per node, which corresponds to the preferred Ultra Sparc II machine's preferred 64 byte secondary cache line size.

Abstract

Cache sensitive search (CSS) tree index structures are disclosed that provide improved search and lookup performance as compared with conventional searching schemes. The structures include a directory tree structure which is stored in an array (216) and serves as an index for a sorted array of elements. The nodes (215) of the directory tree structure may have a size selected specifically to correspond to the cache line size of the computer system utilizing the structures. The locations of child nodes (213) within the directory tree structure are determined by arithmetic operations on array offsets, which makes it unnecessary to store internal child node pointers and therefore reduces memory storage requirements. In addition, the structures are organized so that traversing each tree level yields good data reference locality, and thus relatively few cache misses. The structures are therefore sensitive to cache-related parameters such as reference locality and cache behavior, without requiring substantial amounts of additional memory.
PCT/US1999/028430 1999-12-01 1999-12-01 Systeme et procede d'indexation a arbre de recherche sensible aux caracteristiques d'antememoire WO2001040996A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/US1999/028430 WO2001040996A1 (fr) 1999-12-01 1999-12-01 Systeme et procede d'indexation a arbre de recherche sensible aux caracteristiques d'antememoire
US09/600,266 US6711562B1 (en) 1999-12-01 1999-12-01 Cache sensitive search (CSS) tree indexing system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US1999/028430 WO2001040996A1 (fr) 1999-12-01 1999-12-01 Systeme et procede d'indexation a arbre de recherche sensible aux caracteristiques d'antememoire

Publications (1)

Publication Number Publication Date
WO2001040996A1 true WO2001040996A1 (fr) 2001-06-07

Family

ID=22274191

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1999/028430 WO2001040996A1 (fr) 1999-12-01 1999-12-01 Systeme et procede d'indexation a arbre de recherche sensible aux caracteristiques d'antememoire

Country Status (1)

Country Link
WO (1) WO2001040996A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5283894A (en) * 1986-04-11 1994-02-01 Deran Roger L Lockless concurrent B-tree index meta access method for cached nodes
US5822749A (en) * 1994-07-12 1998-10-13 Sybase, Inc. Database system with methods for improving query performance with cache optimization strategies
US5826253A (en) * 1995-07-26 1998-10-20 Borland International, Inc. Database system with methodology for notifying clients of any additions, deletions, or modifications occurring at the database server which affect validity of a range of data records cached in local memory buffers of clients
US5940838A (en) * 1997-07-11 1999-08-17 International Business Machines Corporation Parallel file system and method anticipating cache usage patterns
US6047280A (en) * 1996-10-25 2000-04-04 Navigation Technologies Corporation Interface layer for navigation system
US6061678A (en) * 1997-10-31 2000-05-09 Oracle Corporation Approach for managing access to large objects in database systems using large object indexes

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100488414B1 (ko) * 2000-12-30 2005-05-11 한국전자통신연구원 다중탐색 트리의 노드 생성 방법, 및 그에 따라 생성된 다중탐색 트리 구조의 자료 탐색 방법
GB2419700A (en) * 2004-10-29 2006-05-03 Hewlett Packard Development Co Methods for indexing data in a content repository
GB2419700B (en) * 2004-10-29 2010-03-31 Hewlett Packard Development Co Methods for indexing data, systems, software and apparatus relng thereto
US8892564B2 (en) 2004-10-29 2014-11-18 Hewlett-Packard Development Company, L.P. Indexing for data having indexable and non-indexable parent nodes
CN107797941A (zh) * 2016-09-06 2018-03-13 华为技术有限公司 针对查找树的缓存着色内存分配方法和装置

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): CA US

WWE Wipo information: entry into national phase

Ref document number: 09600266

Country of ref document: US