US20090248648A1 - Method for evaluating a conjunction of equity and range predicates using a constant number of operations - Google Patents

Info

Publication number
US20090248648A1
Authority
US
United States
Prior art keywords
bit
xor
bits
computer
fields
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US12/056,999
Other versions
US7840554B2 (en
Inventor
F. Ryan Johnson
Vijayshankar Raman
Garret Frederick Swart
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US12/056,999 (patent US7840554B2)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION: assignment of assignors interest (see document for details). Assignors: SWART, GARRET FREDERICK; JOHNSON, F RYAN; RAMAN, VIJAYSHANKAR
Publication of US20090248648A1
Application granted
Publication of US7840554B2
Legal status: Expired - Fee Related
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries

Definitions

  • bit-wise AND could be implemented by combinations of other operators, and such modifications are considered within the scope of the present invention.
  • FIG. 4 illustrates an example of the second embodiment's computer-based method to simultaneously evaluate conjunctions of range and equality predicates on k fields of a record being either F1, F2, . . . Fk: F1≦L1 and F2≦L2 and Fk≦Lk, or F1, F2, . . . Fk: F1≧L1 and F2≧L2 and Fk≧Lk, wherein L1, L2 . . . Lk represent values and the fields F1, F2, . . . Fk are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] within the record.
  • an offset [X, Y] represents all bits X through Y.
  • the value vector comprises a bit vector V having the values L 1 , L 2 . . . L k at bit positions [S 1 , E 1 ], [S 2 , E 2 ] . . . [S k , E k ], respectively, and having 0s in remainder of bits—step 406 ; (d) for each record, R, on which the predicates need to be applied, evaluating as follows: when F 1 , F 2 , . . .
  • FIG. 7 illustrates yet another variation in the method of the second embodiment.
  • step 704 (c) constructing a first value vector containing lower bound values, wherein the first value vector comprises a bit vector V L having 0s everywhere except one or more of the following: L 1 at bit positions [S 1 , E 1 ], L 2 at bit positions [S 2 , E 2 ] . . . L k at bit positions [S k , E k ]—step 706 ; (d) constructing a second value vector containing upper bound values, wherein the second value vector comprises a bit vector Vu having 1s everywhere except one or more of the following: U 1 at bit positions [S 1 , E 1 ], U 2 at bit positions [S 2 , E 2 ] . .
  • the value of ((VU XOR VL) AND M) is pre-computed, such that only the remaining part of the expression (i.e., (((VU−R) XOR (R−VL)) AND M)) is evaluated on a per-record basis, using two subtractions, an XOR, a bit-wise AND, and a comparison (all of which can be performed efficiently on most current processors using, for example, hardware instructions).
  • step 804 (c) constructing a first value vector containing lower bound values, wherein the first value vector comprises a bit vector VL having 0s everywhere except one or more of the following: L1 at bit positions [S1, E1], L2 at bit positions [S2, E2] . . . Lk at bit positions [Sk, Ek] (step 806); (d) constructing a second value vector containing upper bound values, wherein the second value vector comprises a bit vector VU having 1s everywhere except one or more of the following: U1 at bit positions [S1, E1], U2 at bit positions [S2, E2] . . .
  • step 904 (c) constructing a first value vector containing lower bound values, wherein the first value vector comprises a bit vector V L having 0s everywhere except one or more of the following: L 1 at bit positions [S 1 , E 1 ], L 2 at bit positions [S 2 , E 2 ] . . . L k at bit positions [S k , E k ]—step 906 ; (d) constructing a second value vector containing upper bound values, wherein the second value vector comprises a bit vector Vu having 1s everywhere except one or more of the following: U 1 at bit positions [S 1 , E 1 ], U 2 at bit positions [S 2 , E 2 ] . .
  • the benefit of the methods of the second embodiment is that computation done per record (a bitwise and an equality comparison) is efficiently done (with hardware or software instructions), and takes the same amount of time irrespective of k. This allows for very complex predicates to be evaluated quickly.
  • FIG. 10 illustrates an example of the third embodiment's computer-based method to simultaneously evaluate conjunctions of a mixture of in-list predicates on k fields F1, F2, . . . Fk of the form F1 in (L11, L12 . . . L1n) and F2 in (L21, L22 . . . L2n) and . . . Fk in (Lk1, Lk2 . . . Lkn), wherein the k fields are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek].
  • Method 1000 of FIG. 10 comprises the steps of: (a) constructing a first mask to extract values of k fields, wherein the mask comprises a bit vector M having 1s in bits [S 1 , E 1 ], [S 2 , E 2 ] . . . [S k , E k ] and having 0s in remainder of bits—step 1002 ; (b) constructing a second mask to extract most significant bit of each field, wherein the second mask comprises a bit vector S having 0s in bits S 1 , S 2 , . . .
  • step 1004 (c) for each 1 through n, computing a bit vector of values V 1 , V 2 , . . . , V n , wherein V i has 0s in all bits except values L 1i , L 2i , . . . . L ki at [S 1 , E 1 ], [S 2 , E 2 ] . . . [S k , E k ], respectively—step 1006 ; (d) for each record, R, on which said predicates need to be applied, evaluating n numbers as follows—step 1008 :
  • N1=((((V1 XOR R) AND S)+S) OR (V1 XOR R));
  • N2=((((V2 XOR R) AND S)+S) OR (V2 XOR R)); . . .
  • Nn=((((Vn XOR R) AND S)+S) OR (Vn XOR R));
  • said AND operator represents bit-wise AND of two bit vectors
  • said XOR operator represents bit-wise Exclusive OR of two bit vectors
  • + represents addition
  • OR represents bit-wise OR
  • the benefit of the method of the third embodiment is that computation done per record (a bitwise and an equality comparison) is efficiently done (with hardware or software instructions), and takes the same amount of time irrespective of k. This allows for very complex predicates to be evaluated quickly.
  • the k fields F 1 , F 2 , . . . F k described in the above-mentioned methods associated with embodiments 1 through 3 have a single codeword length.
  • evaluations described in the above-mentioned methods associated with embodiments 1 through 3 are computed exclusively via processor instructions.
  • the computer-based methods of embodiments 1 through 3 are used in constant-time query processing.
  • the present invention provides for an article of manufacture comprising computer readable program code contained within implementing one or more modules to implement each of the above described methods of FIGS. 1 through 10 .
  • the present invention includes a computer program code-based product, which is a storage medium having program code stored therein which can be used to instruct a computer to perform any of the methods associated with the present invention.
  • the computer storage medium includes any of, but is not limited to, the following: CD-ROM, DVD, magnetic tape, optical disc, hard drive, floppy disk, ferroelectric memory, flash memory, ferromagnetic memory, optical storage, charge coupled devices, magnetic or optical cards, smart cards, EEPROM, EPROM, RAM, ROM, DRAM, SRAM, SDRAM, or any other appropriate static or dynamic memory or data storage devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Complex Calculations (AREA)

Abstract

Methods are described to simultaneously apply conjunctions of equality, range, and in-list predicates. A first set of methods is described for the simultaneous application of equality predicates. A second set of methods is described for the simultaneous application of a mixture of range and equality predicates. A third method is described for simultaneously applying a mixture of in-list predicates. The described methods allow for quick evaluation of complex predicates as they efficiently implement the computation done per record, while maintaining the same execution time irrespective of the number of fields.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of Invention
  • The present invention relates generally to the field of predicate evaluation. More specifically, the present invention relates to a method for evaluating a conjunction of equality and range predicates using a constant number of operations.
  • 2. Discussion of Related Art
  • Conjunctive predicates (e.g., p1 AND p2 AND . . . ) are the most common kind of predicate used in querying databases. The standard way to evaluate a conjunction of predicates on a record is via a method of the form:
  • for each predicate do:
      extract the fields that this predicate is over
      if record satisfies predicate continue
      else return that the record does not satisfy predicate
    // every predicate has been verified
    return that the record satisfies predicate
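  • As a point of comparison, the per-predicate loop above can be sketched in Python. The (lo, width) field layout counted from the least significant bit, the helper names extract and satisfies_all, and the sample fields are assumptions made for illustration; they do not come from the text.

```python
from typing import Callable, Sequence, Tuple

Field = Tuple[int, int]  # (lowest bit position, width in bits); hypothetical layout


def extract(record: int, field: Field) -> int:
    """Shift and mask one field out of the record word."""
    lo, width = field
    return (record >> lo) & ((1 << width) - 1)


def satisfies_all(record: int,
                  predicates: Sequence[Tuple[Field, Callable[[int], bool]]]) -> bool:
    """Evaluate predicates one at a time, stopping at the first failure."""
    for field, test in predicates:
        value = extract(record, field)   # one extraction per predicate
        if not test(value):              # one data-dependent branch per predicate
            return False
    return True                          # every predicate has been verified


AGE = (0, 7)        # hypothetical: age in bits 0..6
SHOE_SIZE = (8, 5)  # hypothetical: shoeSize in bits 8..12
predicates = [(AGE, lambda age: age > 10), (SHOE_SIZE, lambda size: size == 10)]
```

    Each additional predicate adds one extraction and one branch per record, so the cost of this baseline grows with the number of predicates.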
  • The performance of this predicate evaluation is a significant fraction of overall query performance in modern high-performance business intelligence (BI) engines that do large amounts of data scans. But at least three drawbacks make this standard method of predicate evaluation slow and variable in performance.
  • First, in prior art predicate evaluations, the evaluation time varies based on the number of predicates to be applied. For example, in the paper by Holloway et al. titled “How to Barter Bits for Chronons: Compression and Bandwidth Trade Offs for Database Scans”, it was found that each extra field that is touched adds about 6-8 cycles per record for a scan, which, in turn, causes variability in scan performance.
  • Second, in prior art predicate evaluations, each field needs to be extracted before predicates are applied. Such extraction is expensive, especially in newer databases where fields are not aligned at machine-word (64-bit) boundaries.
  • Third, in prior art predicate evaluations, the loop condition and the predicate evaluation within the loop both result in conditional branch statements. Mispredicted branches cost orders of magnitude more than regular instructions on almost all modern processors (e.g., we have timed at 40 cycles on a Pentium® family processor).
  • Whatever the precise merits, features, and advantages of such prior art predicate evaluations, none of them achieves or fulfills the purposes of the present invention.
  • SUMMARY OF THE INVENTION
  • The present invention, in one embodiment, provides a computer-based method to simultaneously evaluate conjunctions of equality predicates on k fields F1, F2, . . . Fk: F1=L1 and F2=L2 and Fk=Lk of a record, wherein L1, L2 . . . Lk represent values and fields F1, F2, . . . Fk are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] within the record. The method of this embodiment, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) constructing a mask to extract values of the fields F1, F2, . . . Fk within a cell, wherein the mask comprises a bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 0s in remainder of bits; (b) constructing a value vector comprising a bit vector V having values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits; (c) for each record, R, on which said equality predicates need to be applied, evaluating if R AND M=V; and (d) outputting results of said evaluation in (c).
  • In one variation, the bit vector M has 0s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and 1s in remaining bits, and the bit vector V has values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 1s in remainder of bits, and instead of evaluating if R AND M=V, the method comprises the step of evaluating if R OR M=V.
  • In another variation of the above-described method, instead of evaluating if R AND M=V, the method evaluates if (R XOR V) AND M=0.
  • In another embodiment, the present invention provides a computer based method to simultaneously evaluate conjunctions of range and equality predicates on k fields of a record being either F1, F2, . . . Fk: F1≦L1 and F2≦L2 and Fk≦Lk, or F1, F2, . . . Fk: F1≧L1 and F2≧L2 and Fk≧Lk, wherein L1, L2 . . . Lk represent values and fields F1, F2, . . . Fk are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] within the record. In this embodiment, the method, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) computing Bi=Si−C for said k fields, wherein C is a constant whole number (e.g., C=1); (b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk and having 0s in remainder of bits; (c) constructing a value vector containing values of fields F1, F2, . . . Fk, wherein the value vector comprises a bit vector V having values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits; (d) for each record, R, on which the predicates need to be applied, evaluating as follows:
  • when F1, F2, ... Fk: F1 ≦ L1 and F2 ≦ L2 and Fk ≦ Lk,
     evaluating ((V−R) AND M) = ((V XOR R) AND M), or
    when F1, F2, ... Fk: F1 ≧ L1 and F2 ≧ L2 and Fk ≧ Lk,
     evaluating ((R−V) AND M) = ((V XOR R) AND M),

    wherein the AND operator represents bit-wise AND of two bit vectors and said XOR operator represents bit-wise Exclusive OR of two bit vectors; and (e) outputting results of said evaluation operation in (d).
  • In one variation of the above-described embodiment, the method in step (d), instead of evaluating if ((V−R) AND M)=((V XOR R) AND M), evaluates if (((V−R) XOR V XOR R) AND M)=0.
  • In another variation of the above-described embodiment, the method in step (d), instead of evaluating if ((V−R) AND M)=((V XOR R) AND M), evaluates if (((V−R) XOR V XOR R) OR (NOT M))=(NOT M).
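  • The borrow-detection idea behind the one-sided test can be sketched in Python for the ≧ case. This is an illustration under stated assumptions, not the patented implementation: bit positions are counted from the least significant bit, each field is given as a hypothetical (lo, width) pair, and Bi is read as the bit immediately above the field's most significant bit. The helper names pack, guard_mask, all_ge, and all_ge_compare_form are invented for the example.

```python
from typing import Sequence, Tuple

Field = Tuple[int, int]  # (lowest bit position, width in bits); assumed layout


def pack(fields: Sequence[Field], values: Sequence[int]) -> int:
    """Place each value at its field position, leaving all other bits 0."""
    word = 0
    for (lo, width), value in zip(fields, values):
        if not 0 <= value < (1 << width):
            raise ValueError("value does not fit in its field")
        word |= value << lo
    return word


def guard_mask(fields: Sequence[Field]) -> int:
    """M: a single 1 in the bit just above each field (read here as Bi)."""
    m = 0
    for lo, width in fields:
        m |= 1 << (lo + width)
    return m


def all_ge(record: int, v: int, m: int) -> bool:
    """True when every field of the record is >= the matching field of V."""
    # (R - V) XOR R XOR V leaves a 1 wherever a borrow entered that bit.
    return (((record - v) ^ record ^ v) & m) == 0


def all_ge_compare_form(record: int, v: int, m: int) -> bool:
    """The same test written as ((R - V) AND M) = ((V XOR R) AND M)."""
    return ((record - v) & m) == ((v ^ record) & m)


# Hypothetical layout: three 7-bit fields, each followed by one spare bit.
FIELDS = [(0, 7), (8, 7), (16, 7)]
M = guard_mask(FIELDS)
V = pack(FIELDS, [10, 20, 30])   # F1 >= 10 AND F2 >= 20 AND F3 >= 30
```

    A field that falls below its bound forces a borrow out of that field, and the borrow shows up as a 1 at the guard position regardless of what the record stores in that bit. The per-record work is one subtraction, two XORs (or one XOR and a comparison), and one AND, independent of the number of fields.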
  • The present invention in another embodiment provides for a computer based method to simultaneously evaluate conjunctions of one or more range or equality predicates on k fields F1, F2, . . . Fk of a record, where each predicate is one of four forms: (i) Li≦Fi or (ii) Fi≦Ui or (iii) Fi=Li or (iv) Li≦Fi≦Ui, and wherein L1, L2 . . . Lk represent values and the k fields being at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] of the record. The method of this embodiment comprises the steps of: (a) computing Bi=Si−C for the k fields, wherein C is a constant whole number (e.g., C=1); (b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk and having 0s in remainder of bits; (c) constructing a first value vector containing lower bound values, wherein the first value vector comprises a bit vector VL having 0s everywhere except one or more of the following: L1 at bit positions [S1, E1], L2 at bit positions [S2, E2] . . . Lk at bit positions [Sk, Ek]; (d) constructing a second value vector containing upper bound values, wherein the second value vector comprises a bit vector VU having 1s everywhere except one or more of the following: U1 at bit positions [S1, E1], U2 at bit positions [S2, E2] . . . Uk at bit positions [Sk, Ek]; (e) for each record, R, on which said predicates need to be applied, evaluating (((VU−R) XOR (R−VL)) AND M)=((VU XOR VL) AND M), wherein the AND operator represents bit-wise AND of two bit vectors and said XOR operator represents bit-wise Exclusive OR of two bit vectors; and (f) outputting results of the evaluation operation in (e).
  • In an extended embodiment, the above-described method further comprises the step of precomputing (VU XOR VL) AND M, and evaluating remainder of expression in (e) on a per record basis.
  • In a variation to the above-described embodiment, the method, in step (e), evaluates if ((VU−R) XOR (R−VL) XOR VU XOR VL) AND M=0.
  • In another variation to the above-described embodiment, the method, in step (e), evaluates if ((VU−R) XOR (R−VL) XOR VU XOR VL) OR (NOT M)=(NOT M).
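  • The two-sided range test, including the precomputed right-hand side, can be sketched in Python as follows. The sketch assumes a 32-bit record, bit positions counted from the least significant bit, hypothetical (lo, width) field descriptors, and a guard bit immediately above each field; an unconstrained field is expressed as the full range of its width and an equality as a range whose two ends coincide. The names build_range_vectors and in_ranges are invented for the example.

```python
from typing import Sequence, Tuple

Field = Tuple[int, int]      # (lowest bit position, width in bits); assumed layout
WORD_BITS = 32               # assumed record width
ALL_ONES = (1 << WORD_BITS) - 1


def build_range_vectors(fields: Sequence[Field],
                        bounds: Sequence[Tuple[int, int]]) -> Tuple[int, int, int, int]:
    """Return (VL, VU, M, expected) for inclusive (low, high) bounds per field."""
    vl = 0                   # 0s everywhere except the lower bounds
    vu = ALL_ONES            # 1s everywhere except the upper bounds
    m = 0                    # one guard bit just above each field
    for (lo, width), (low, high) in zip(fields, bounds):
        field_bits = (1 << width) - 1
        if not 0 <= low <= high <= field_bits:
            raise ValueError("bounds must satisfy 0 <= low <= high <= field maximum")
        vl |= low << lo
        vu = (vu & ~(field_bits << lo)) | (high << lo)
        m |= 1 << (lo + width)
    expected = (vu ^ vl) & m     # precomputed once, not per record
    return vl, vu, m, expected


def in_ranges(record: int, vl: int, vu: int, m: int, expected: int) -> bool:
    """Per record: two subtractions, one XOR, one AND, one comparison."""
    return (((vu - record) ^ (record - vl)) & m) == expected


# Hypothetical layout: three 7-bit fields, each followed by one spare bit.
FIELDS = [(0, 7), (8, 7), (16, 7)]
# 10 <= F1 <= 20, F2 unconstrained (0..127), F3 = 5 written as 5 <= F3 <= 5.
VL, VU, M, EXPECTED = build_range_vectors(FIELDS, [(10, 20), (0, 127), (5, 5)])
```

    Filling VU with 1s and VL with 0s outside the fields means that record bits outside the fields cannot start a borrow in either subtraction, so only a field that violates one of its bounds disturbs a guard position. The check that low does not exceed high is a safeguard added in this sketch.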
  • The present invention, in another embodiment, provides for a computer based method to simultaneously evaluate conjunctions of a mixture of in-list predicates on k fields F1, F2, . . . Fk of the form F1 in (L11, L12 . . . L1n) and F2 in (L21, L22 . . . L2n) and . . . Fk in (Lk1, Lk2 . . . Lkn), wherein the k fields are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek]. In this embodiment, the method, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) constructing a first mask to extract values of k fields, wherein the first mask comprises a bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 0s in remainder of bits; (b) constructing a second mask to extract most significant bit of each field, wherein the second mask comprises a bit vector S having 0s in bits S1, S2, . . . , Sk and having 1s in remainder of bits; (c) for each 1 through n, computing a bit vector of values V1, V2, . . . , Vn, wherein Vi has 0s in all bits except values L1i, L2i, . . . . Lki at [S1, E1], [S2, E2] . . . [Sk, Ek], respectively; (d) for each record, R, on which said predicates need to be applied, evaluating n numbers as follows:
  • N1=((((V1 XOR R) AND S)+S) OR (V1 XOR R));
  • N2=((((V2 XOR R) AND S)+S) OR (V2 XOR R)); . . .
  • Nn=((((Vn XOR R) AND S)+S) OR (Vn XOR R));
  • and then evaluating the following condition:

  • ((N1 AND N2 AND . . . Nn) OR S)=S
  • wherein the AND operator represents bit-wise AND of two bit vectors, the XOR operator represents bit-wise Exclusive OR of two bit vectors, and the + operator represents addition, and OR represents bit-wise OR; and (e) outputting results of said evaluation operation in (d).
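  • A minimal Python sketch of the in-list test follows. It assumes a 16-bit record, bit positions counted from the least significant bit, and hypothetical (lo, width) field descriptors. Two details go beyond the formulas as written and are choices of this sketch: the XOR is first masked with M, and the addition uses S restricted to the field bits, so that record bits outside the fields cannot inject carries. The names build_in_list_vectors and in_lists are invented for the example.

```python
from typing import List, Sequence, Tuple

Field = Tuple[int, int]      # (lowest bit position, width in bits); assumed layout
WORD_BITS = 16               # assumed record width
ALL_ONES = (1 << WORD_BITS) - 1


def build_in_list_vectors(fields: Sequence[Field],
                          lists: Sequence[Sequence[int]]) -> Tuple[int, int, List[int]]:
    """Return (M, S, [V1..Vn]); every in-list must have the same length n."""
    n = len(lists[0])
    m = 0                    # 1s over every field
    msb = 0                  # most significant bit of every field
    values = [0] * n         # Vi holds the i-th literal of every list
    for (lo, width), options in zip(fields, lists):
        if len(options) != n:
            raise ValueError("all in-lists must have the same length")
        m |= ((1 << width) - 1) << lo
        msb |= 1 << (lo + width - 1)
        for i, literal in enumerate(options):
            if not 0 <= literal < (1 << width):
                raise ValueError("literal does not fit in its field")
            values[i] |= literal << lo
    s = ALL_ONES ^ msb       # S: 0s at each field's top bit, 1s elsewhere
    return m, s, values


def in_lists(record: int, m: int, s: int, values: Sequence[int]) -> bool:
    """True when every field equals at least one literal of its list."""
    low = s & m              # sketch-specific: low bits of the fields only
    combined = ALL_ONES
    for v in values:
        x = (v ^ record) & m             # nonzero in a field means "differs"
        n_i = ((x & low) + low) | x      # field's top bit set iff it differs
        combined &= n_i                  # top bit survives iff all Vi differ
    return (combined | s) == s


# Hypothetical layout: two 4-bit fields; F1 in (3, 7) AND F2 in (1, 9).
FIELDS = [(0, 4), (4, 4)]
M, S, VALUES = build_in_list_vectors(FIELDS, [[3, 7], [1, 9]])
```

    The per-record cost depends on the list length n but not on the number of fields k. Lists of unequal length could be padded by repeating one of their literals; that padding rule is a suggestion of this sketch rather than something stated in the text.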
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIGS. 1-3 illustrate various examples of the first embodiment's computer-based method to simultaneously evaluate conjunctions of equality predicates.
  • FIGS. 4-9 illustrate various examples of the second embodiment's computer-based method to simultaneously evaluate range and equality predicates.
  • FIG. 10 illustrates an example of the third embodiment's computer-based method to simultaneously evaluate a mixture of in-list predicates.
  • DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • The paper by Raman et al. entitled, “Constant-Time Query Processing,” to be published in the Proceedings of the 24th IEEE International Conference on Data Engineering, held Apr. 7-12, 2008, in Cancun, Mexico, attached in Appendix A, provides additional details regarding a simplified database architecture that achieves constant time query processing.
  • While this invention is illustrated and described in a preferred embodiment, the invention may be produced in many different configurations. There is depicted in the drawings, and will herein be described in detail, a preferred embodiment of the invention, with the understanding that the present disclosure is to be considered as an exemplification of the principles of the invention and the associated functional specifications for its construction and is not intended to limit the invention to the embodiment illustrated. Those skilled in the art will envision many other possible variations within the scope of the present invention.
  • The present invention teaches a method to efficiently apply conjunctions of one or more predicates (a predicate is a condition such as weight<=150) on fields in a database, such as:
  • age>10 AND
    salary between [10000,20002] AND
    state in (‘CA’, ‘MI’) AND
    hairColor in (‘black’, ‘blue’, ‘orange’) AND
    weight< 150 AND
    shoeSize = 10
  • In the example above, conditions such as ‘shoeSize=10’ are called equality predicates and conditions such as ‘age>10’ and ‘salary between [10000,20002]’ are called range predicates. Conditions such as ‘hairColor in (‘black’, ‘blue’, ‘orange’)’ are called in-list predicates.
  • Conjunction refers to all of the clauses that have to be simultaneously true for the overall condition to be satisfied. Such conjunctions are the most common kinds of predicates that occur in databases, search engines, etc.
  • It should be noted that the present invention's methods apply even in instances where the conjunction is only a part of the overall predicate. For example:
  • (age>10 AND salary between 10000 and 20002) AND
    ((weight< 150 AND shoeSize = 10) OR (state in (‘CA’, ‘MI’) AND
    hairColor in (‘black’, ‘blue’, ‘orange’)))
  • In the above example, the overall predicate is not a conjunction because of the OR, but our methods apply to each of its conjunctive parts (underlined in the original document), for example (weight<150 AND shoeSize=10).
  • The present invention's methods apply conjunctions simultaneously on fields in a database. The present invention is based on the following conditions, wherein these conditions apply to many of the current databases:
      • the fields involved in the predicate are at fixed offsets within each record (if some fields are not at fixed offsets, our methods still apply to the part of the predicate that is on fields at fixed offsets)
      • the predicates can be evaluated on the fields as they are represented within the record.
    First Embodiment: Applying Equality Predicates Simultaneously
  • Treat each record R as a single bit-vector of N bits (N is usually set to be a machine word size, such as 8, 16, 32, or 64 bits; if R is too large to fit into a single machine word, it is broken up into multiple words). Suppose equality predicates are evaluated on k fields F1 . . . Fk: F1=L1 and F2=L2 and . . . Fk=Lk (the fields F1 . . . Fk are the attributes of the record, such as shoeSize in the previous example, and L1 . . . Lk are the corresponding constants, such as 10 for the condition ‘shoeSize=10’; such constants are also referred to as literals). Fields F1 . . . Fk are at bit offsets [S1, E1], [S2, E2], . . . [Sk, Ek] respectively, i.e., the first field lies in bits S1 through E1 of the record, the second field lies in bits S2 through E2 of the record, and so on. The corresponding literals are L1, L2, . . . Lk. The method of the first embodiment computes a bit-wise AND of the tuple with a pre-computed mask that has 1s at the positions of the fields, and checks whether the result is equal to a pre-computed value vector that has the literals at the same positions as the corresponding fields.
  • FIG. 1 illustrates an example of the first embodiment's computer-based method to simultaneously evaluate conjunctions of equality predicates on k fields F1, F2, . . . Fk: F1=L1 and F2=L2 and Fk=Lk of a record, with L1, L2 . . . Lk representing values and fields F1, F2, . . . Fk being at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] within the record, wherein an offset [X, Y] represents bits X through Y.
  • Method 100, according to this example, is implemented in computer readable program code stored in computer storage and comprises the steps of: (a) constructing a mask to extract values of said fields F1, F2, . . . Fk within a cell, wherein the mask comprises a bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 0s in remainder of bits—step 102; (b) constructing a value vector comprising a bit vector V having the values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits—step 104; (c) for each record, R, on which the equality predicates need to be applied, evaluating if R AND M=V—step 106; and (d) outputting results of the evaluation in (c)—step 108.
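  • Steps (a) through (c) of Method 100 can be sketched in Python as follows. The (lo, width) field descriptors counted from the least significant bit, the helper names build_mask_and_value and matches, and the sample layout with a stateCode field are assumptions for illustration only.

```python
from typing import Sequence, Tuple

Field = Tuple[int, int]  # (lowest bit position, width in bits); assumed layout


def build_mask_and_value(fields: Sequence[Field],
                         literals: Sequence[int]) -> Tuple[int, int]:
    """Steps (a) and (b): M has 1s over every field, V holds each literal in place."""
    m = 0
    v = 0
    for (lo, width), literal in zip(fields, literals):
        field_bits = (1 << width) - 1
        if not 0 <= literal <= field_bits:
            raise ValueError("literal does not fit in its field")
        m |= field_bits << lo
        v |= literal << lo
    return m, v


def matches(record: int, m: int, v: int) -> bool:
    """Step (c): one AND and one comparison, however many fields there are."""
    return (record & m) == v


# Hypothetical 32-bit record: shoeSize in bits 0..5, stateCode in bits 10..17.
FIELDS = [(0, 6), (10, 8)]
M, V = build_mask_and_value(FIELDS, [10, 5])   # shoeSize = 10 AND stateCode = 5

hit = 10 | (5 << 10) | (1 << 25)    # both fields match; bit 25 is unrelated data
miss = 11 | (5 << 10) | (1 << 25)   # shoeSize differs
```

    M and V are built once per query; only matches runs per record.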
  • Variations of the method of FIG. 1 are envisioned and are within the scope of the present invention.
  • For example, FIG. 2 illustrates such a variation of the method of the first embodiment. Method 200, according to this example, is implemented in computer readable program code stored in computer storage and comprises the steps of: (a) constructing a mask to extract values of said fields F1, F2, . . . Fk within a cell, wherein the mask comprises a bit vector M having 0s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 1s in remainder of bits—step 202; (b) constructing a value vector comprising a bit vector V having the values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 1s in remainder of bits—step 204; (c) for each record, R, on which the equality predicates need to be applied, evaluating if R OR M=V—step 206; and (d) outputting results of the evaluation in (c)—step 208.
  • FIG. 3 illustrates yet another variation of the method of the first embodiment. Method 300, according to this example, is implemented in computer readable program code stored in computer storage and comprises the steps of: (a) constructing a mask to extract values of said fields F1, F2, . . . Fk within a cell, wherein the mask comprises a bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 0s in remainder of bits—step 302; (b) constructing a value vector comprising a bit vector V having the values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits—step 304; (c) for each record, R, on which the equality predicates need to be applied, evaluating if (R XOR V) AND M=0—step 306; and (d) outputting results of the evaluation in (c)—step 308.
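The three formulations (R AND M=V, R OR M=V with complemented vectors, and (R XOR V) AND M=0) should accept exactly the same records. The sketch below checks this on sample data. The 32-bit word size, the field layout, and the helper names are assumptions made for illustration only.

```python
import random

N = 32                      # assumed word size in bits
WORD = (1 << N) - 1

# Hypothetical layout (bit 0 = least significant): age in bits 0-7,
# shoeSize in bits 8-15; the predicate is age = 30 AND shoeSize = 10.
M = 0x0000FFFF              # 1s over the tested fields (methods 100 and 300)
V = (10 << 8) | 30          # literals in place, 0s elsewhere

M_OR = ~M & WORD            # method 200: 0s over the fields, 1s elsewhere
V_OR = V | M_OR             # method 200: literals in place, 1s elsewhere


def eq_and(r):
    """Method 100: R AND M = V."""
    return (r & M) == V


def eq_or(r):
    """Method 200: R OR M = V, using the complemented mask and value."""
    return (r | M_OR) == V_OR


def eq_xor(r):
    """Method 300: (R XOR V) AND M = 0."""
    return ((r ^ V) & M) == 0


rng = random.Random(0)
samples = [rng.getrandbits(N) for _ in range(1000)]
samples += [V, V | (150 << 16), V ^ 1]     # two matches and one near miss
agree = all(eq_and(r) == eq_or(r) == eq_xor(r) for r in samples)
print(agree)
```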
  • The benefit of the methods of the first embodiment is that the computation done per record (one or two bit-wise operations and an equality comparison) is done efficiently (with hardware or software instructions) and takes the same amount of time irrespective of k. This allows very complex predicates to be evaluated quickly.
  • It should be noted that bit-wise AND could be implemented by combinations of other operators, and such modifications are considered within the scope of the present invention.
  • Second Embodiment Applying a Mixture of Range and Equality Predicates Simultaneously
  • In this embodiment, equality predicates (such as ‘shoeSize=10’) are rewritten into pairs of range predicates, such as ‘shoeSize<=10’ and ‘shoeSize>=10’. Strict predicates such as ‘weight<150’ are rewritten into the form ‘weight<=149’ by subtracting 1 from the literal. After rewriting, every predicate in the conjunction takes one of two forms: field<=literal or field>=literal.
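The rewriting step can be sketched as a small normalization function. The function name and tuple representation are hypothetical, integer-valued fields are assumed, and the handling of ‘>’ is added here by symmetry with ‘<’; it is not spelled out in the text.

```python
def normalize(field, op, literal):
    """Rewrite one integer comparison into (field, '<=' or '>=', literal) tuples."""
    if op == "=":
        return [(field, "<=", literal), (field, ">=", literal)]
    if op == "<":
        return [(field, "<=", literal - 1)]
    if op == ">":                      # assumed by symmetry; not stated in the text
        return [(field, ">=", literal + 1)]
    if op in ("<=", ">="):
        return [(field, op, literal)]
    raise ValueError(f"unsupported operator: {op}")


rewritten = normalize("shoeSize", "=", 10) + normalize("weight", "<", 150)
print(rewritten)
```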
  • FIG. 4 illustrates an example of the second embodiment's computer-based method to simultaneously evaluate conjunctions of range and equality predicates on k fields of a record being either F1, F2, . . . Fk: F1≦L1 and F2≦L2 and Fk≦Lk, or F1, F2, . . . Fk: F1≧L1 and F2≧L2 and Fk≧Lk, wherein L1, L2 . . . Lk represent values and the fields F1, F2, . . . Fk are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] within the record. As mentioned earlier, an offset [X, Y] represents all bits X through Y.
  • Method 400 of FIG. 4, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) computing Bi=Si−C for k fields, wherein C is a constant whole number (e.g., C=1)—step 402; (b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk, and having 0s in remainder of bits—step 404; (c) constructing a value vector containing the values of fields F1, F2, . . . Fk, wherein the value vector comprises a bit vector V having the values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits—step 406; (d) for each record, R, on which the predicates need to be applied, evaluating as follows: when F1, F2, . . . Fk: F1≦L1 and F2≦L2 and Fk≦Lk, evaluating ((V−R) AND M)=((V XOR R) AND M), or, when F1, F2, . . . Fk: F1≧L1 and F2≧L2 and Fk≧Lk, evaluating ((R−V) AND M)=((V XOR R) AND M), wherein the AND operator represents bit-wise AND of two bit vectors and the XOR operator represents bit-wise Exclusive OR of two bit vectors—step 408; and (e) outputting results of the evaluation operation in (d)—step 410.
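The comparison in step (d) works because a word-wide subtraction borrows into the bit just above a field exactly when that field's subtraction underflows. A minimal Python sketch follows, under several assumptions that are not part of the text: a 32-bit word, three 8-bit fields packed contiguously from bit 0 (so no untested bits sit below a tested field), and bits numbered from the least significant end. Under that numbering the sentinel for a field occupying bits lo..hi sits at bit hi+1, which is an assumed reading of Bi=Si−C with C=1.

```python
N = 32                                  # assumed word size in bits
WORD = (1 << N) - 1
FIELDS = [(0, 7), (8, 15), (16, 23)]    # hypothetical (lo, hi) ranges, bit 0 = LSB

# One sentinel bit per field, immediately above the field's top bit.
M = 0
for lo, hi in FIELDS:
    M |= 1 << (hi + 1)


def pack(values):
    """Place one value per field, in FIELDS order."""
    word = 0
    for (lo, hi), v in zip(FIELDS, values):
        assert 0 <= v < (1 << (hi - lo + 1))
        word |= v << lo
    return word


def all_le(R, V):
    """True when every field of R is <= the matching literal in V."""
    return (((V - R) & WORD) & M) == ((V ^ R) & M)


def all_ge(R, V):
    """True when every field of R is >= the matching literal in V."""
    return (((R - V) & WORD) & M) == ((V ^ R) & M)


V = pack([100, 10, 200])
print(all_le(pack([50, 10, 200]), V))    # every field within its bound
print(all_le(pack([101, 10, 200]), V))   # first field exceeds 100
print(all_ge(pack([100, 11, 255]), V))   # every field at or above its bound
```

The `& WORD` after each subtraction emulates fixed-width wraparound, which Python integers do not provide on their own.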
  • Variations of the method of FIG. 4 are envisioned and are within the scope of the present invention.
  • For example, method 500 of FIG. 5, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) computing Bi=Si−C for k fields, wherein C is a constant whole number (e.g., C=1)—step 502; (b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk having 0s in remainder of bits—step 504; (c) constructing a value vector containing the values of fields F1, F2, . . . Fk, wherein the value vector comprises a bit vector V having the values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits—step 506; (d) for each record, R, on which the predicates need to be applied, evaluating (((V−R) XOR V XOR R) AND M)=0, wherein the AND operator represents bit-wise AND of two bit vectors and the XOR operator represents bit-wise Exclusive OR of two bit vectors—step 508; and (e) outputting results of the evaluation operation in (d)—step 510.
  • As another example, method 600 of FIG. 6, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) computing Bi=Si−C for k fields, wherein C is a constant whole number (e.g., C=1)—step 602; (b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk having 0s in remainder of bits—step 604; (c) constructing a value vector containing the values of fields F1, F2, . . . Fk, wherein the value vector comprises a bit vector V having the values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits—step 606; (d) for each record, R, on which the predicates need to be applied, evaluating ((V−R) XOR V XOR R) OR (NOT M)=(NOT M), wherein the AND operator represents bit-wise AND of two bit vectors and the XOR operator represents bit-wise Exclusive OR of two bit vectors—step 608; and (e) outputting results of the evaluation operation in (d)—step 610.
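Methods 500 and 600 rearrange the test of method 400 around the quantity (V−R) XOR V XOR R, whose bit i is 1 exactly when the subtraction borrowed into bit i. The sketch below compares the three forms on sample records. The word size, layout (three 8-bit fields in bits 0-23, sentinels at bits 8, 16, and 24, bit 0 = LSB), and function names are assumptions for illustration.

```python
N = 32                                   # assumed word size in bits
WORD = (1 << N) - 1
M = (1 << 8) | (1 << 16) | (1 << 24)     # sentinel bits above three 8-bit fields
NOT_M = ~M & WORD


def borrows(V, R):
    """Bit i is 1 exactly when computing V - R borrowed into bit i."""
    return ((V - R) & WORD) ^ V ^ R


def le_400(R, V):
    return (((V - R) & WORD) & M) == ((V ^ R) & M)


def le_500(R, V):
    return (borrows(V, R) & M) == 0


def le_600(R, V):
    return (borrows(V, R) | NOT_M) == NOT_M


V = 100 | (10 << 8) | (200 << 16)        # literals 100, 10, 200
samples = [a | (b << 8) | (c << 16)
           for a in (0, 100, 101) for b in (9, 10, 11) for c in (0, 200, 201)]
agree = all(le_400(R, V) == le_500(R, V) == le_600(R, V) for R in samples)
passing = [R for R in samples if le_500(R, V)]
print(agree, len(passing))
```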
  • FIG. 7 illustrates yet another variation in the method of the second embodiment. FIG. 7 illustrates a computer based method to simultaneously evaluate conjunctions of one or more range or equality predicates on k fields F1, F2, . . . Fk of a record, where each predicate is one of four forms: (i) Li≦Fi or (ii) Fi≦Ui or (iii) Fi=Li or (iv) Li≦Fi≦Ui, wherein L1, L2 . . . Lk represent values, and the k fields are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] of said record.
  • Method 700 of FIG. 7, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) computing Bi=Si−C for k fields, wherein C is a constant whole number (e.g., C=1)—step 702; (b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk having 0s in remainder of bits—step 704; (c) constructing a first value vector containing lower bound values, wherein the first value vector comprises a bit vector VL having 0s everywhere except one or more of the following: L1 at bit positions [S1, E1], L2 at bit positions [S2, E2] . . . Lk at bit positions [Sk, Ek]—step 706; (d) constructing a second value vector containing upper bound values, wherein the second value vector comprises a bit vector Vu having 1s everywhere except one or more of the following: U1 at bit positions [S1, E1], U2 at bit positions [S2, E2] . . . Uk at bit positions [Sk, Ek]—step 708; (e) for each record, R, on which said predicates need to be applied, evaluating (((VU−R) XOR (R−VL)) AND M)=((VU XOR VL) AND M)—step 710, wherein the AND operator represents bit-wise AND of two bit vectors and the XOR operator represents bit-wise Exclusive OR of two bit vectors; and (f) outputting results of said evaluation operation in (e)—step 712.
  • In one example, the value of ((VU XOR VL) AND M) is pre-computed, such that only the remaining part of the expression (i.e., (((VU−R) XOR (R−VL)) AND M)) is evaluated on a per-record basis, using two subtractions, an XOR, a bit-wise AND, and a comparison (all of which can be performed efficiently on most current processors using, for example, hardware instructions).
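A minimal sketch of method 700 with the pre-computed right-hand side might look as follows. The layout, the `build` helper, and all names are hypothetical; bits are numbered from the least significant end, each lower bound is assumed not to exceed its upper bound, and an untested field is left at 0 in VL and at all 1s in VU.

```python
N = 32                                   # assumed word size in bits
WORD = (1 << N) - 1
# Hypothetical layout, bit 0 = LSB: a in bits 0-7, b in 8-15, c in 16-23.
A, B, C = (0, 7), (8, 15), (16, 23)


def field_mask(lo, hi):
    return ((1 << (hi - lo + 1)) - 1) << lo


def build(bounds):
    """bounds: list of ((lo, hi), lower, upper) with lower <= upper, one per tested field.

    Returns (M, VL, VU, K), where K is the record-independent side of the test.
    """
    M, VL, VU = 0, 0, WORD
    for (lo, hi), lower, upper in bounds:
        M |= 1 << (hi + 1)                               # sentinel above the field
        VL |= lower << lo                                # lower bound, 0s elsewhere
        VU = (VU & ~field_mask(lo, hi)) | (upper << lo)  # upper bound, 1s elsewhere
    K = (VU ^ VL) & M
    return M, VL, VU, K


def in_ranges(R, M, VL, VU, K):
    """Per-record work: two subtractions, an XOR, an AND, and a comparison."""
    return ((((VU - R) & WORD) ^ ((R - VL) & WORD)) & M) == K


# 20 <= a <= 40 AND c = 7; b is left unconstrained.
M, VL, VU, K = build([(A, 20, 40), (C, 7, 7)])
print(in_ranges(30 | (99 << 8) | (7 << 16), M, VL, VU, K))
print(in_ranges(41 | (99 << 8) | (7 << 16), M, VL, VU, K))
```

An equality predicate is expressed by giving the same value as lower and upper bound, as done for field c above.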
  • Variations of the method of FIG. 7 are envisioned and are within the scope of the present invention. For example, method 800 of FIG. 8, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) computing Bi=Si−C for k fields, wherein C is a constant whole number (e.g., C=1)—step 802; (b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk, and having 0s in remainder of bits—step 804; (c) constructing a first value vector containing lower bound values, wherein the first value vector comprises a bit vector VL having 0s everywhere except one or more of the following: L1 at bit positions [S1, E1], L2 at bit positions [S2, E2] . . . Lk at bit positions [Sk, Ek]—step 806; (d) constructing a second value vector containing upper bound values, wherein the second value vector comprises a bit vector VU having 1s everywhere except one or more of the following: U1 at bit positions [S1, E1], U2 at bit positions [S2, E2] . . . Uk at bit positions [Sk, Ek]—step 808; (e) for each record, R, on which said predicates need to be applied, evaluating (((VU−R) XOR (R−VL) XOR VU XOR VL) AND M)=0—step 810, wherein the AND operator represents bit-wise AND of two bit vectors and the XOR operator represents bit-wise Exclusive OR of two bit vectors; and (f) outputting results of said evaluation operation in (e)—step 812.
  • Another variation of the method of FIG. 7 is shown in FIG. 9. Method 900 of FIG. 9, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) computing Bi=Si−C for k fields, wherein C is a constant whole number (e.g., C=1)—step 902; (b) constructing a mask to extract values of k bits B1, B2, . . . Bk, wherein the mask comprises a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk having 0s in remainder of bits—step 904; (c) constructing a first value vector containing lower bound values, wherein the first value vector comprises a bit vector VL having 0s everywhere except one or more of the following: L1 at bit positions [S1, E1], L2 at bit positions [S2, E2] . . . Lk at bit positions [Sk, Ek]—step 906; (d) constructing a second value vector containing upper bound values, wherein the second value vector comprises a bit vector Vu having 1s everywhere except one or more of the following: U1 at bit positions [S1, E1], U2 at bit positions [S2, E2] . . . Uk at bit positions [Sk, Ek]—step 908; (e) for each record, R, on which said predicates need to be applied, evaluating ((VU−R) XOR (R−VL) XOR VU XOR VL) OR (NOT M)=(NOT M)—step 910, wherein the AND operator represents bit-wise AND of two bit vectors and the XOR operator represents bit-wise Exclusive OR of two bit vectors; and (f) outputting results of said evaluation operation in (e)—step 912.
  • The benefit of the methods of the second embodiment is that the computation done per record (a fixed number of subtractions and bit-wise operations, followed by an equality comparison) is done efficiently (with hardware or software instructions) and takes the same amount of time irrespective of k. This allows very complex predicates to be evaluated quickly.
  • Embodiment 3 Applying a Mixture of In-List Predicates Simultaneously
  • FIG. 10 illustrates an example of the third embodiment's computer-based method to simultaneously evaluate conjunctions of a mixture of in-list predicates on k fields F1, F2, . . . Fk of the form F1 in (L11, L12 . . . L1n) and F2 in (L21, L22 . . . L2n) and . . . Fk in (Lk1, Lk2 . . . Lkn), wherein the k fields are at offsets [S1, E1], [S2, E2] . . . [Sk, Ek].
  • Method 1000 of FIG. 10, as implemented in computer readable program code stored in computer storage, comprises the steps of: (a) constructing a first mask to extract values of k fields, wherein the mask comprises a bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 0s in remainder of bits—step 1002; (b) constructing a second mask to extract the most significant bit of each field, wherein the second mask comprises a bit vector S having 0s in bits S1, S2, . . . , Sk and having 1s in remainder of bits—step 1004; (c) for each i from 1 through n, computing a bit vector of values Vi, yielding V1, V2, . . . , Vn, wherein Vi has 0s in all bits except values L1i, L2i, . . . Lki at [S1, E1], [S2, E2] . . . [Sk, Ek], respectively—step 1006; (d) for each record, R, on which said predicates need to be applied, evaluating n numbers as follows—step 1008:
  • N1=((((V1 XOR R) AND S)+S) OR (V1 XOR R));
  • N2=((((V2 XOR R) AND S)+S) OR (V2 XOR R)); . . .
  • Nn=((((Vn XOR R) AND S)+S) OR (Vn XOR R));
  • and then evaluating the following condition:

  • ((N1 AND N2 AND . . . Nn) OR S)=S
  • wherein said AND operator represents bit-wise AND of two bit vectors, said XOR operator represents bit-wise Exclusive OR of two bit vectors, + represents addition, and OR represents bit-wise OR; and (e) outputting results of said evaluation operation in (d)—step 1010.
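The expressions N1 . . . Nn rely on a known zero-detection idiom: for x = Vi XOR R, the value ((x AND S)+S) OR x has a field's top bit set exactly when that field of x is non-zero, i.e., when the record's field differs from the i-th list entry. The sketch below is an illustration under stated assumptions, not the patented implementation: a 32-bit word fully tiled by four 8-bit fields (so the addition cannot carry from one field into the next), every field carrying an in-list of the same length n, and hypothetical helper names. It does not use the first mask M, since the evaluated expressions in the text reference only S.

```python
N = 32                                     # assumed word size in bits
WORD = (1 << N) - 1
WIDTH = 8                                  # four 8-bit fields fill the word
S = 0x7F7F7F7F                             # 0 at each field's top bit, 1 elsewhere


def pack(values):
    """Place four field values into one word, field 0 in the low byte."""
    word = 0
    for i, v in enumerate(values):
        assert 0 <= v < (1 << WIDTH)
        word |= v << (i * WIDTH)
    return word


def build_value_vectors(in_lists):
    """in_lists[j] holds the n allowed values of field j; returns [V1, ..., Vn]."""
    n = len(in_lists[0])
    assert all(len(lst) == n for lst in in_lists), "pad shorter lists by repeating a value"
    return [pack([lst[i] for lst in in_lists]) for i in range(n)]


def in_all_lists(R, vectors):
    acc = WORD
    for V in vectors:
        x = V ^ R                          # a field of x is 0 iff it equals this list entry
        acc &= (((x & S) + S) | x) & WORD  # top bit of a field is 1 iff that field of x != 0
    return (acc | S) == S                  # true iff no field missed every list entry


vectors = build_value_vectors([[1, 2, 3], [10, 20, 30], [5, 5, 5], [0, 255, 7]])
print(in_all_lists(pack([2, 10, 5, 255]), vectors))   # each field matches some entry
print(in_all_lists(pack([4, 10, 5, 0]), vectors))     # first field matches none
```

Note that a record may match different list positions in different fields, because the AND across N1 . . . Nn is evaluated independently at each field's top bit.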
  • The benefit of the method of the third embodiment is that the computation done per record (a fixed sequence of bit-wise operations and additions for each of the n list positions, followed by an equality comparison) is done efficiently (with hardware or software instructions) and takes the same amount of time irrespective of k. This allows very complex predicates to be evaluated quickly.
  • In one example, the k fields F1, F2, . . . Fk described in the above-mentioned methods associated with embodiments 1 through 3 have a single codeword length.
  • In another example, evaluations described in the above-mentioned methods associated with embodiments 1 through 3 are computed exclusively via processor instructions.
  • In yet another example, the computer-based methods of embodiments 1 through 3 are used in constant-time query processing.
  • Additionally, the present invention provides for an article of manufacture comprising computer readable program code contained within implementing one or more modules to implement each of the above described methods of FIGS. 1 through 10. Furthermore, the present invention includes a computer program code-based product, which is a storage medium having program code stored therein which can be used to instruct a computer to perform any of the methods associated with the present invention. The computer storage medium includes any of, but is not limited to, the following: CD-ROM, DVD, magnetic tape, optical disc, hard drive, floppy disk, ferroelectric memory, flash memory, ferromagnetic memory, optical storage, charge coupled devices, magnetic or optical cards, smart cards, EEPROM, EPROM, RAM, ROM, DRAM, SRAM, SDRAM, or any other appropriate static or dynamic memory or data storage devices.
  • CONCLUSION
  • A system and method have been shown in the above embodiments for the effective implementation of methods for evaluating a conjunction of equality and range predicates using a constant number of operations. While various preferred embodiments have been shown and described, it will be understood that there is no intent to limit the invention by such disclosure, but rather, it is intended to cover all modifications falling within the spirit and scope of the invention, as defined in the appended claims. For example, the present invention should not be limited by software/program, computing environment, or specific computing hardware.

Claims (20)

1. A computer-based method to simultaneously evaluate conjunctions of equality predicates on k fields F1, F2, . . . Fk: F1=L1 and F2=L2 and Fk=Lk of a record, said L1, L2 . . . Lk representing values, and said fields F1, F2, . . . Fk being at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] within said record, wherein offset [X, Y] represents bits X through Y, said method implemented in computer readable program code stored in computer storage, said method comprising the steps of:
(a) constructing a mask to extract values of said fields F1, F2, . . . Fk within a cell, said mask comprising a bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 0s in remainder of bits;
(b) constructing a value vector comprising a bit vector V having said values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits;
(c) for each record, R, on which said equality predicates need to be applied, evaluating if R AND M=V; and
(d) outputting results of said evaluation in (c).
2. The computer-based method of claim 1, wherein:
instead of said bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and 0s in remainder of bits, said bit vector M having 0s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and 1s in remaining bits, and step (c), instead of said bit vector V having said values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits, said bit vector V having said values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 1s in remainder of bits, and instead of evaluating if R AND M=V, evaluating if R OR M=V.
3. The computer-based method of claim 1, wherein, said method in step (c), instead of evaluating if R AND M=V, evaluates if (R XOR V) AND M=0.
4. The computer-based method of claim 1, wherein each of said k fields F1, F2, . . . Fk has a single codeword length.
5. The computer-based method of claim 1, wherein said evaluations are computed exclusively via processor instructions.
6. The computer-based method of claim 1, wherein said computer-based method is used in constant-time query processing.
7. A computer based method to simultaneously evaluate conjunctions of range and equality predicates on k fields of a record being either F1, F2, . . . Fk: F1≦L1 and F2≦L2 and Fk≦Lk, or F1, F2, . . . Fk: F1≧L1 and F2≧L2 and Fk≧Lk, said L1, L2 . . . Lk representing values and said fields F1, F2, . . . Fk being at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] within said record, wherein offset [X,Y] represents bits X through Y, said method implemented in computer readable program code stored in computer storage, said method comprising the steps of:
(a) computing Bi=Si−C for said k fields, wherein C is a constant whole number;
(b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, said mask comprising a bit vector M having 1s in bits at k bit positions, B1, B2, . . . , Bk having 0s in remainder of bits;
(c) constructing a value vector containing said values of said fields F1, F2, . . . Fk, said value vector comprising a bit vector V having said values L1, L2 . . . Lk at bit positions [S1, E1], [S2, E2] . . . [Sk, Ek], respectively, and having 0s in remainder of bits;
(d) for each record, R, on which said predicates need to be applied, evaluating as follows:
when F1, F2, . . . Fk: F1≦L1 and F2≦L2 and Fk≦Lk, evaluating ((V−R) AND M)=((V XOR R) AND M), or
when F1, F2, . . . Fk: F1≧L1 and F2≧L2 and Fk≧Lk, evaluating ((R−V) AND M)=((V XOR R) AND M),
wherein said AND operator represents bit-wise AND of two bit vectors and said XOR operator represents bit-wise Exclusive OR of two bit vectors; and
(e) outputting results of said evaluation operation in (d).
8. The computer-based method of claim 7, wherein, said method in step (d), instead of evaluating if ((V−R) AND M)=((V XOR R) AND M), evaluates if (((V−R) XOR V XOR R) AND M)=0.
9. The computer-based method of claim 7, wherein, said method in step (d), instead of evaluating if ((V−R) AND M)=((V XOR R) AND M), evaluates if (((V−R) XOR V XOR R) OR (NOT M))=(NOT M).
10. The computer-based method of claim 7, wherein C is equal to 1.
11. The computer-based method of claim 7, wherein said evaluations are computed exclusively via processor instructions.
12. The computer-based method of claim 7, wherein said computer-based method is used in constant-time query processing.
13. A computer based method to simultaneously evaluate conjunctions of one or more range or equality predicates on k fields F1, F2, . . . Fk of a record, where each predicate is one of four forms: (i) Li≦Fi or (ii) Fi≦Ui or (iii) Fi=Li or (iv) Li≦Fi≦Ui, said L1, L2 . . . Lk representing values, and said k fields being at offsets [S1, E1], [S2, E2] . . . [Sk, Ek] of said record, wherein offset [X, Y] represents bits X through Y, said method implemented in computer readable program code stored in computer storage, said method comprising the steps of:
(a) computing Bi=Si−C for said k fields, wherein C is a constant whole number;
(b) constructing a mask to extract values of k bits B1, B2, . . . , Bk, said mask comprising a bit vector M having 1s in bits at k bit positions, B1, B2, . . . Bk having 0s in remainder of bits;
(c) constructing a first value vector containing lower bound values, said first value vector comprising a bit vector VL having 0s everywhere except one or more of the following: L1 at bit positions [S1, E1], L2 at bit positions [S2, E2] . . . Lk at bit positions [Sk, Ek];
(d) constructing a second value vector containing upper bound values, said second value vector comprising a bit vector Vu having 1s everywhere except one or more of the following: U1 at bit positions [S1, E1], U2 at bit positions [S2, E2] . . . Uk at bit positions [Sk, Ek];
(e) for each record, R, on which said predicates need to be applied, evaluating:
(((VU−R) XOR (R−VL)) AND M)=((VU XOR VL) AND M),
wherein said AND operator represents bit-wise AND of two bit vectors and said XOR operator represents bit-wise Exclusive OR of two bit vectors; and
(f) outputting results of said evaluation operation in (e).
14. The computer-based method of claim 13, wherein, said method in step (e), instead of evaluating if (((VU−R) XOR (R−VL)) AND M)=((VU XOR VL) AND M), evaluates if (((VU−R) XOR (R−VL) XOR VU XOR VL) AND M)=0.
15. The computer-based method of claim 13, wherein, said method in step (e), instead of evaluating if (((VU−R) XOR (R−VL)) AND M)=((VU XOR VL) AND M), evaluates if (((VU−R) XOR (R−VL) XOR VU XOR VL) OR (NOT M))=(NOT M).
16. The computer-based method of claim 13, wherein C is equal to 1.
17. The computer-based method of claim 13, wherein said evaluations are computed exclusively via processor instructions.
18. The computer-based method of claim 13, wherein said method further comprises the step of precomputing (VU XOR VL) AND M, and evaluating remainder of expression in (e) on a per record basis.
19. A computer based method to simultaneously evaluate conjunctions of a mixture of in-list predicates on k fields F1, F2, . . . Fk of the form F1 in (L11, L12 . . . L1n) and F2 in (L21, L22 . . . L2n) and . . . Fk in (Lk1, Lk2 . . . Lkn), said k fields being at offsets [S1, E1], [S2, E2] . . . [Sk, Ek], said method implemented in computer readable program code stored in computer storage, said method comprising the steps of:
(a) constructing a first mask to extract values of k fields, said mask comprising a bit vector M having 1s in bits [S1, E1], [S2, E2] . . . [Sk, Ek] and having 0s in remainder of bits;
(b) constructing a second mask to extract most significant bit of each field, said second mask comprising a bit vector S having 0s in bits S1, S2, . . . , Sk and having 1s in remainder of bits;
(c) for each i from 1 through n, computing a bit vector of values Vi, yielding V1, V2, . . . , Vn, wherein Vi has 0s in all bits except values L1i, L2i, . . . Lki at [S1, E1], [S2, E2] . . . [Sk, Ek], respectively;
(d) for each record, R, on which said predicates need to be applied, evaluating n numbers as follows:
N1=((((V1 XOR R) AND S)+S) OR (V1 XOR R));
N2=((((V2 XOR R) AND S)+S) OR (V2 XOR R)); . . .
Nn=((((Vn XOR R) AND S)+S) OR (Vn XOR R));
and then evaluating the following condition:

((N1 AND N2 AND . . . Nn) OR S)=S
wherein said AND operator represents bit-wise AND of two bit vectors, said XOR operator represents bit-wise Exclusive OR of two bit vectors, + represents addition, and OR represents bit-wise OR; and
(e) outputting results of said evaluation operation in (d).
20. The computer-based method of claim 19, wherein said evaluations in step (d) are computed exclusively via processor instructions.
US12/056,999 2008-03-27 2008-03-27 Method for evaluating a conjunction of equity and range predicates using a constant number of operations Expired - Fee Related US7840554B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/056,999 US7840554B2 (en) 2008-03-27 2008-03-27 Method for evaluating a conjunction of equity and range predicates using a constant number of operations

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US12/056,999 US7840554B2 (en) 2008-03-27 2008-03-27 Method for evaluating a conjunction of equity and range predicates using a constant number of operations

Publications (2)

Publication Number Publication Date
US20090248648A1 true US20090248648A1 (en) 2009-10-01
US7840554B2 US7840554B2 (en) 2010-11-23

Family

ID=41118638

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/056,999 Expired - Fee Related US7840554B2 (en) 2008-03-27 2008-03-27 Method for evaluating a conjunction of equity and range predicates using a constant number of operations

Country Status (1)

Country Link
US (1) US7840554B2 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070271305A1 (en) * 2006-05-18 2007-11-22 Sivansankaran Chandrasekar Efficient piece-wise updates of binary encoded XML data
US20130018901A1 (en) * 2011-07-11 2013-01-17 International Business Machines Corporation Search Optimization In a Computing Environment
US8812523B2 (en) * 2012-09-28 2014-08-19 Oracle International Corporation Predicate result cache
US20160019264A1 (en) * 2012-09-13 2016-01-21 International Business Machines Corporation Multiplication-based method for stitching results of predicate evaluation in column stores
US9495466B2 (en) 2013-11-27 2016-11-15 Oracle International Corporation LIDAR model with hybrid-columnar format and no indexes for spatial searches
US9684639B2 (en) 2010-01-18 2017-06-20 Oracle International Corporation Efficient validation of binary XML data
US10756759B2 (en) 2011-09-02 2020-08-25 Oracle International Corporation Column domain dictionary compression

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8832158B2 (en) 2012-03-29 2014-09-09 International Business Machines Corporation Fast predicate table scans using single instruction, multiple data architecture

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5664172A (en) * 1994-07-19 1997-09-02 Oracle Corporation Range-based query optimizer
US5852821A (en) * 1993-04-16 1998-12-22 Sybase, Inc. High-speed data base query method and apparatus
US6115808A (en) * 1998-12-30 2000-09-05 Intel Corporation Method and apparatus for performing predicate hazard detection
US6289335B1 (en) * 1997-06-23 2001-09-11 Oracle Corporation Fast refresh of snapshots containing subqueries
US6334125B1 (en) * 1998-11-17 2001-12-25 At&T Corp. Method and apparatus for loading data into a cube forest data structure
US6381616B1 (en) * 1999-03-24 2002-04-30 Microsoft Corporation System and method for speeding up heterogeneous data access using predicate conversion
US6748392B1 (en) * 2001-03-06 2004-06-08 Microsoft Corporation System and method for segmented evaluation of database queries
US20050187898A1 (en) * 2004-02-05 2005-08-25 Nec Laboratories America, Inc. Data Lookup architecture
US20060224542A1 (en) * 2005-03-16 2006-10-05 Aravind Yalamanchi Incremental evaluation of complex event-condition-action rules in a database system
US7313554B2 (en) * 2003-09-29 2007-12-25 International Business Machines Corporation System and method for indexing queries, rules and subscriptions

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005044303A (en) 2003-07-25 2005-02-17 National Institute Of Advanced Industrial & Technology Automatic theorem proving device

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5852821A (en) * 1993-04-16 1998-12-22 Sybase, Inc. High-speed data base query method and apparatus
US5664172A (en) * 1994-07-19 1997-09-02 Oracle Corporation Range-based query optimizer
US6289335B1 (en) * 1997-06-23 2001-09-11 Oracle Corporation Fast refresh of snapshots containing subqueries
US6334125B1 (en) * 1998-11-17 2001-12-25 At&T Corp. Method and apparatus for loading data into a cube forest data structure
US6115808A (en) * 1998-12-30 2000-09-05 Intel Corporation Method and apparatus for performing predicate hazard detection
US6381616B1 (en) * 1999-03-24 2002-04-30 Microsoft Corporation System and method for speeding up heterogeneous data access using predicate conversion
US6748392B1 (en) * 2001-03-06 2004-06-08 Microsoft Corporation System and method for segmented evaluation of database queries
US20050097100A1 (en) * 2001-03-06 2005-05-05 Microsoft Corporation System and method for segmented evaluation of database queries
US7313554B2 (en) * 2003-09-29 2007-12-25 International Business Machines Corporation System and method for indexing queries, rules and subscriptions
US20050187898A1 (en) * 2004-02-05 2005-08-25 Nec Laboratories America, Inc. Data Lookup architecture
US20060224542A1 (en) * 2005-03-16 2006-10-05 Aravind Yalamanchi Incremental evaluation of complex event-condition-action rules in a database system

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070271305A1 (en) * 2006-05-18 2007-11-22 Sivansankaran Chandrasekar Efficient piece-wise updates of binary encoded XML data
US9460064B2 (en) 2006-05-18 2016-10-04 Oracle International Corporation Efficient piece-wise updates of binary encoded XML data
US9684639B2 (en) 2010-01-18 2017-06-20 Oracle International Corporation Efficient validation of binary XML data
US20130018901A1 (en) * 2011-07-11 2013-01-17 International Business Machines Corporation Search Optimization In a Computing Environment
US20130066825A1 (en) * 2011-07-11 2013-03-14 International Business Machines Corporation Search optimization in a computing environment
US8819037B2 (en) * 2011-07-11 2014-08-26 International Business Machines Corporation Search optimization in a computing environment
US8832144B2 (en) * 2011-07-11 2014-09-09 International Business Machines Corporation Search optimization in a computing environment
US10756759B2 (en) 2011-09-02 2020-08-25 Oracle International Corporation Column domain dictionary compression
US20160019264A1 (en) * 2012-09-13 2016-01-21 International Business Machines Corporation Multiplication-based method for stitching results of predicate evaluation in column stores
US10296619B2 (en) * 2012-09-13 2019-05-21 International Business Machines Corporation Multiplication-based method for stitching results of predicate evaluation in column stores
US8812523B2 (en) * 2012-09-28 2014-08-19 Oracle International Corporation Predicate result cache
US9495466B2 (en) 2013-11-27 2016-11-15 Oracle International Corporation LIDAR model with hybrid-columnar format and no indexes for spatial searches

Also Published As

Publication number Publication date
US7840554B2 (en) 2010-11-23

Similar Documents

Publication Publication Date Title
US7840554B2 (en) Method for evaluating a conjunction of equity and range predicates using a constant number of operations
US9875280B2 (en) Efficient partitioned joins in a database with column-major layout
US20070156734A1 (en) Handling ambiguous joins
US9892117B2 (en) Optimizing relational database queries with multi-table predicate expressions
AU2006262110B2 (en) Aggregating data with complex operations
US9754010B2 (en) Generation of cube metadata and query statement based on an enhanced star schema
US7127467B2 (en) Managing expressions in a database system
US20150302058A1 (en) Database system with highly denormalized database structure
US10678794B2 (en) Skew detection and handling in a parallel processing relational database system
US20160253390A1 (en) On-the-fly encoding method for efficient grouping and aggregation
US8135738B2 (en) Efficient predicate evaluation via in-list
US9582553B2 (en) Systems and methods for analyzing existing data models
US11416488B2 (en) SQL double counting resolver
US8504598B2 (en) Data perturbation of non-unique values
Jensen et al. Frequent itemset counting across multiple tables
KR20150079689A (en) Profiling data with source tracking
US11222015B2 (en) Helper scan in a database management system
Martins et al. Comparing Oracle and PostgreSQL, performance and optimization
US9069837B2 (en) Encapsulation of multiplicity and sparsity in multidimensional query execution systems
US9846712B2 (en) Index-only multi-index access
US11416485B2 (en) Dynamic query expressions
Suh et al. A new tool for multi-level partitioning in Teradata
US20190163788A1 (en) Value list compression (VLC) aware qualification
Bouchakri et al. Static and incremental selection of multi-table indexes for very large join queries
US11899662B1 (en) Compression aware aggregations for queries with expressions

Legal Events

Date Code Title Description
AS Assignment Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: JOHNSON, F RYAN; RAMAN, VIJAYSHANKAR; SWART, GARRET FREDERICK; REEL/FRAME: 020714/0226; SIGNING DATES FROM 20080311 TO 20080325
REMI Maintenance fee reminder mailed
FPAY Fee payment Year of fee payment: 4
SULP Surcharge for late payment
FEPP Fee payment procedure Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.)
LAPS Lapse for failure to pay maintenance fees Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
STCH Information on status: patent discontinuation Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362
FP Lapsed due to failure to pay maintenance fee Effective date: 20181123