WO2015172533A1 - Database query method and server

Info

Publication number: WO2015172533A1
Application number: PCT/CN2014/090240
Authority: WO
Inventors: 李伟鑫; 吴继敏; 蔡利元; 刘成华
Original assignee: 华为技术有限公司
Priority date: 2014-05-12
Filing date: 2014-11-04
Publication date: 2015-11-19
Also published as: CN103970870A

Abstract

Embodiments of the present invention provide a method for querying a columnar in-memory database. The method comprises: receiving a query request; parsing the query request to obtain a query condition; determining, from the columnar in-memory database, the target columns related to the query condition; and cyclically performing the following steps until all rows have been queried: starting from row m*(i-1)+1 of each target column, storing in a Cache the data of m consecutive rows (or, for the last segment, of the remaining fewer than m rows) of every target column in the columnar in-memory database, querying the data stored in the Cache according to the query condition, sending the segmented query result, and releasing the storage space of the Cache, wherein m is a constant natural number and i is a variable representing the number of loop iterations executed. The method reduces data overflow during the query and shortens the delay of the data query.

Description

Database query method and server

Technical field

The embodiments of the present invention relate to database technologies, and in particular, to a database query method and a server.

Background

A cache (Cache) is temporary storage located between the CPU and main memory (RAM). Its capacity is smaller than that of RAM, but its access speed is much higher. When the CPU needs a large amount of data, it can read that data directly from the Cache, which speeds up reads. A Cache is usually divided into several levels: L1, L2, and L3. L1 is private to a single core of a CPU, L2 is shared by the cores of a single CPU, and L3 is shared by multiple CPUs and their cores. For example, the L1, L2, and L3 Caches may be 32 KB, 512 KB, and 15360 KB in size, respectively. Taking reads of 256 bytes as an example, the latencies of the three Cache levels are 1.2 ns, 4 ns, and 30 ns, respectively, whereas reading the same data from RAM takes 100 ns. For the same instruction, the time required therefore differs by a factor of several to several tens depending on where the data resides. Although chips from different manufacturers differ, the ratios between these latencies are basically the same.
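The latency figures above can be turned into ratios with a few lines of arithmetic. The following is only an illustrative calculation using the example numbers quoted in this section; the names LATENCY_NS and slowdown_vs are hypothetical and do not come from the patent.

```python
# Example per-read latencies quoted above (256 bytes per read), in nanoseconds.
LATENCY_NS = {"L1": 1.2, "L2": 4.0, "L3": 30.0, "RAM": 100.0}


def slowdown_vs(level: str, baseline: str = "RAM") -> float:
    """Return how many times slower `baseline` is than the given cache level."""
    return LATENCY_NS[baseline] / LATENCY_NS[level]


# Ratio of RAM latency to each cache level, rounded to one decimal place.
ratios = {level: round(slowdown_vs(level), 1) for level in ("L1", "L2", "L3")}
print(ratios)
```

With these example figures, a read served from RAM is roughly 3 times slower than one served from L3 and 25 times slower than one served from L2, which is the gap the method described later tries to avoid.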

A columnar in-memory database is a database whose data is organized by column, with the data of each column stored contiguously. This storage organization is particularly suitable for analysis scenarios that examine a small number of columns of a large data set: when querying, the system reads only the columns that need to be processed rather than all data columns.

Current column processing decomposes the query condition by column. When querying, each column is queried separately, and the sets of qualifying row numbers are then intersected to obtain the final result. For example, consider SELECT COUNT(*), SUM(v.SALARY) FROM (SELECT AGE, SALARY, CITY, JOB FROM T WHERE AGE > 24 AND SALARY > 5000 AND CITY = 'SHENZHEN' AND JOB = 'SALE') v; When executing this SQL, a columnar in-memory database usually breaks it down into the following steps:

1. Scan the AGE column to obtain the set of row numbers and the data set of the records satisfying AGE > 24;

2. Scan the SALARY column to obtain the set of row numbers and the data set of the records satisfying SALARY > 5000;

3. Scan the CITY column to obtain the set of row numbers and the data set of the records satisfying CITY = 'SHENZHEN';

4. Scan the JOB column to obtain the set of row numbers and the data set of the records satisfying JOB = 'SALE';

5. Intersect the row-number sets of AGE, SALARY, CITY, and JOB to obtain the final set of row numbers satisfying all the conditions;

6. Count the final set of row numbers and scan the intermediate results to obtain the COUNT and SUM results.

Each of the above steps, especially the column scans, generates large sets of row numbers and data. Because the capacity of the Cache is limited, this intermediate data may not fit in the Cache and is then held in RAM. If subsequent operations compute further on the intermediate data, it must be fetched back from RAM, and reading data from RAM incurs a large delay. As the computation becomes more complex, the delay grows many times over.
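The six steps above can be sketched as follows to show why the intermediate data grows with the full column length. This is a minimal illustration, not the implementation of any particular database: the function name column_at_a_time, the sample table T, and the use of Python sets for row-number sets are all assumptions made for the example.

```python
def column_at_a_time(table):
    """Traditional plan: filter each column over ALL rows, then intersect."""
    # Steps 1-4: one full scan per column, each producing a row-number set.
    age_rows = {r for r, v in enumerate(table["AGE"]) if v > 24}
    salary_rows = {r for r, v in enumerate(table["SALARY"]) if v > 5000}
    city_rows = {r for r, v in enumerate(table["CITY"]) if v == "SHENZHEN"}
    job_rows = {r for r, v in enumerate(table["JOB"]) if v == "SALE"}
    # Number of row numbers held at once before the intersection;
    # this grows with the total number of rows in the table.
    intermediate = len(age_rows) + len(salary_rows) + len(city_rows) + len(job_rows)
    # Step 5: intersect the row-number sets.
    final_rows = age_rows & salary_rows & city_rows & job_rows
    # Step 6: COUNT and SUM over the surviving rows.
    count = len(final_rows)
    total = sum(table["SALARY"][r] for r in final_rows)
    return count, total, intermediate


# Tiny hypothetical table T, stored column by column.
T = {
    "AGE": [30, 22, 41, 28],
    "SALARY": [6000, 7000, 4000, 9000],
    "CITY": ["SHENZHEN", "SHENZHEN", "BEIJING", "SHENZHEN"],
    "JOB": ["SALE", "SALE", "SALE", "HR"],
}
result = column_at_a_time(T)
print(result)
```

In this toy table only one row survives the intersection, yet twelve row numbers are materialized along the way; on a table with many rows, these sets are what no longer fit in the Cache.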

Summary of the invention

The embodiment of the invention provides a database query method and a server, which reduces the delay of the database query process.

In a first aspect, the present invention provides a database query method, including: receiving a query request; parsing the query request to obtain a query condition; determining, from the columnar in-memory database, the target columns related to the query condition; and executing the following steps in a loop until all rows have been queried: starting from row m*(i-1)+1 of each target column, storing the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache, querying the data stored in the Cache according to the query condition, sending the segmented query result, and releasing the storage space of the Cache, where m is a constant natural number and i is a variable indicating the number of loop iterations executed.

With reference to the first aspect, in a first possible implementation manner of the first aspect, the method further includes: when the last m rows, or the last fewer than m rows, of data are queried, the segmented query result further includes a query end identifier to inform the requester that the query request has been completed.

With reference to the first possible implementation manner of the first aspect, in a second possible implementation manner of the first aspect, the method further includes: after receiving a segmented query result, the requester stores it; after receiving the query end identifier, the requester combines the stored segmented query results and uses the combined result as the final query result.

With reference to the first possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, the query end identifier is generated when, while the data of a row of a target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained from the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained.

In a second aspect, the present invention provides a database query server, including a receiving module, a parsing module, a determining module, and a query module, wherein: the receiving module is configured to receive a query request sent by a requester, where the query request includes a query condition, and to send the query request to the parsing module; the parsing module is configured to receive the query request sent by the receiving module, parse the query request to obtain the query condition, and send the query condition to the determining module and the query module; the determining module is configured to receive the query condition sent by the parsing module and determine, from the columnar in-memory database, the target columns related to the query condition; and the query module is configured to execute the following steps in a loop until all rows have been queried: starting from row m*(i-1)+1 of each target column, storing the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache, querying the data stored in the Cache according to the query condition, sending the segmented query result, and releasing the storage space of the Cache, where m is a constant natural number and i is a variable indicating the number of loop iterations executed.

With reference to the second aspect, in a first possible implementation manner of the second aspect, the query module is specifically configured to send, when the last m rows or the last fewer than m rows of data are queried, a segmented query result that includes a query end identifier, to notify the requester that the current query request has been completed.

With reference to the second aspect, in a second possible implementation manner of the second aspect, the database query server further includes a generating module, configured to generate the query end identifier when, while the data of a row of a target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained from the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained.

In a third aspect, the present invention provides a method for querying a columnar in-memory database, the method comprising:

receiving a query request; parsing the query request to obtain a query condition; determining, from the columnar in-memory database, the target columns related to the query condition; executing the following steps in a loop until all rows have been queried: starting from row m*(i-1)+1 of each target column, storing the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache, querying the data stored in the Cache according to the query condition, storing the segmented query result in a temporary storage module, and releasing the storage space of the Cache, where m is a constant natural number and i is a variable indicating the number of loop iterations executed; and, when all data of the target columns related to the query condition has been queried, combining the segmented query results in the temporary storage module, using the combined query result as the final query result, and sending the final query result.

With reference to the third aspect, in a first possible implementation manner of the third aspect, the method further includes: when, while the data of a row of a target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained from the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained, generating a query end identifier to notify the temporary storage module to combine the segmented query results it holds.

In a fourth aspect, the present invention provides a database query server, including a receiving module, a parsing module, a determining module, a query module, and a temporary storage module, wherein: the receiving module is configured to receive a query request sent by a requester, where the query request includes a query condition, and to send the query request to the parsing module; the parsing module is configured to receive the query request sent by the receiving module, parse the query request to obtain the query condition, and send the query condition to the determining module and the query module; the determining module is configured to receive the query condition sent by the parsing module and determine, from the columnar in-memory database, the target columns related to the query condition; the query module is configured to execute the following steps in a loop until all rows have been queried: starting from row m*(i-1)+1 of each target column, storing the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache, querying the data stored in the Cache according to the query condition, storing the segmented query result in the temporary storage module, and releasing the storage space of the Cache, where m is a constant natural number and i is a variable indicating the number of loop iterations executed; and the temporary storage module is configured to store the segmented query results during execution of the loop and, when all data of the target columns related to the query condition has been queried, to combine the segmented query results it holds, use the combined query result as the final query result, and send the final query result.

With reference to the fourth aspect, in a first possible implementation manner of the fourth aspect, the database query server further includes a generating module, configured to generate a query end identifier when, while the data of a row of a target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained from the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained, so as to notify the temporary storage module to combine the segmented query results it holds.

In the embodiment of the present invention, the data is queried segment by segment, so the data being queried and the data generated during the query can be held entirely in the Cache and do not overflow into RAM. This reduces data overflow during the query and reduces the delay of data queries.

Brief description of the drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without any inventive effort.

FIG. 1 is a schematic flowchart of a database query method according to an embodiment of the present invention;

FIG. 2 is a schematic flowchart of another database query method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a query process of a database query method according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a database query server according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of still another database query server according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of still another database query server according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of still another database query server according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are some, but not all, of the embodiments of the invention.

FIG. 1 is a flowchart of an embodiment of a database query method according to the present invention. The method is applied to a columnar in-memory database, whose column data is stored in memory, and the method includes:

Step 101: Receive a query request.

An external device can send the query request directly to the database query server, in which case the requester is the external device. The external device can also send the query request to the database query server through a proxy device, in which case the requester is the proxy device. The proxy device may be located inside or outside the database query server. The query request contains the query condition.

Step 102: Parse the query request and obtain the query condition.

The database query server obtains the query condition from the query request. The query condition may be an SQL statement, such as: SELECT NAME FROM DETAIL_RECORD WHERE FEE > 100 AND AGE > 18.

Step 103: Determine, from the columnar in-memory database, a target column related to the query condition;

The columns involved in the query condition are the target columns. For example, "FEE > 100 AND AGE > 18" in the query condition involves two columns, the FEE column and the AGE column, so these two columns are the target columns related to the query condition.
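As a rough illustration of step 103, the target columns can be found by collecting the identifiers in the condition that match column names of the table. The sketch below is a deliberately naive stand-in for a real SQL parser: the function name target_columns and the regular-expression approach are assumptions for illustration only, and the patent does not specify how parsing is done.

```python
import re


def target_columns(condition, table_columns):
    """Return the table columns referenced in `condition`, in order of appearance."""
    # Drop quoted string literals so that values such as 'AGE' are not
    # mistaken for column names.
    without_literals = re.sub(r"'[^']*'", "", condition)
    known = {name.upper() for name in table_columns}
    found = []
    for identifier in re.findall(r"[A-Za-z_][A-Za-z_0-9]*", without_literals):
        name = identifier.upper()
        if name in known and name not in found:
            found.append(name)
    return found


print(target_columns("FEE > 100 AND AGE > 18", ["NAME", "FEE", "AGE"]))
```

Only the columns returned here need to be loaded into the Cache during filtering; other columns of the table, such as NAME, are touched only when the result is extracted.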

Step 104: Execute the following steps in a loop until all rows have been queried: starting from row m*(i-1)+1 of each target column, store the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; query the data stored in the Cache according to the query condition; send the segmented query result; and release the storage space of the Cache. Here m is a constant natural number denoting the number of rows of each target column queried per iteration; m may be between 1000 and 10000, in particular 8192. i is a variable indicating the number of loop iterations executed; for example, i is 1 in the first iteration and 2 in the second.

FIG. 2 is a flowchart of another embodiment of a database query method according to the present invention. The method is applied to a columnar in-memory database, whose column data is stored in memory, and the method includes:
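The row range covered by the i-th iteration follows directly from the expression m*(i-1)+1. A minimal sketch of that arithmetic is shown below; the function name segment_bounds and the total of 20000 rows are hypothetical choices for the example, while m = 8192 is the value named in the text.

```python
def segment_bounds(total_rows, m=8192):
    """Yield (i, first_row, last_row) using 1-based, inclusive row numbers."""
    i = 1
    while m * (i - 1) + 1 <= total_rows:
        first = m * (i - 1) + 1
        # The final segment may contain fewer than m rows.
        last = min(m * i, total_rows)
        yield i, first, last
        i += 1


bounds = list(segment_bounds(20000))
print(bounds)
```

For 20000 rows this produces two full segments of 8192 rows and a final segment of 3616 rows, matching the "last fewer than m rows" case in step 104.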

Steps 201-203 are the same as steps 101-103.

Step 204: Execute the following steps in a loop until all rows have been queried: starting from row m*(i-1)+1 of each target column, store the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; query the data stored in the Cache according to the query condition; store the segmented query result in the temporary storage module; and release the storage space of the Cache. Here m is a constant natural number denoting the number of rows of each target column queried per iteration, and i is a variable indicating the number of loop iterations executed;

Step 205: After all data of the target columns related to the query condition has been queried, combine the segmented query results in the temporary storage module, use the combined query result as the final query result, and send the final query result.

For example, the SQL statement in step 102 needs to query two columns, the FEE column and the AGE column, and according to the preset configuration 8192 rows of each target column are queried per iteration, that is, m is 8192. First, the data of the first 8192 consecutive rows of the FEE column in the columnar in-memory database is stored into the Cache, an operation is performed according to the query condition on the 8192 rows of data stored in the Cache to obtain a first intermediate operation result, and the first intermediate operation result is stored in the Cache. Then the data of the first 8192 consecutive rows of the AGE column in the columnar in-memory database is stored into the Cache, an operation is performed according to the query condition on the 8192 rows of data stored in the Cache to obtain a second intermediate operation result, and the second intermediate operation result is stored in the Cache. Each intermediate operation result can be a bitmap.

Then an operation is performed according to the query condition, the first intermediate operation result, and the second intermediate operation result to obtain the segmented query result of this iteration. The segmented query result is sent to the requester or to the temporary storage module, and the storage space of the Cache is released so that the Cache is free for the next segmented query.

When the i-th segmented query is executed, reading starts from row 8192*(i-1)+1, and the data of 8192 consecutive rows of each target column is read from the columnar in-memory database into the Cache; the intermediate operation results and the segmented query result are obtained in the same way in every loop iteration. There are several ways to read the data of a row from the columnar in-memory database. For example, reading a row may also yield the pointer or address of the next row; in that case, when the i-th segmented query is executed, the pointer or address of its first row is obtained from the last row of the (i-1)-th segmented query, and reading continues from that pointer or address. Alternatively, each row has a row number, and the required row is read by its row number.

When the number of unqueried rows in each target column is less than m (8192), the data of the remaining unqueried rows of the FEE column in the columnar in-memory database is stored into the Cache, an operation is performed according to the query condition on the data of these rows stored in the Cache to obtain the first intermediate operation result of this iteration, and that result is stored in the Cache. Then the data of the remaining unqueried rows of the AGE column in the columnar in-memory database is stored into the Cache, an operation is performed according to the query condition on the data of these rows stored in the Cache to obtain the second intermediate operation result of this iteration, and that result is stored in the Cache. An operation is then performed according to the query condition and the first and second intermediate operation results of this iteration to obtain the segmented query result, the segmented query result is sent to the requester or to the temporary storage module, and the storage space of the Cache is released.

When, while the data of a row of a target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained from the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained, a query end identifier is generated. During the query, there are several ways to determine whether the row being queried is the last row. For example, if the number of rows in the target column is known in advance, the rows already queried are counted during the query to determine whether the current row is the last one, at which point the query ends. Alternatively, the last row of each column carries a last-row identifier indicating that it is the last row of the column, so when the last-row identifier is obtained from the columnar in-memory database during the query, the row is known to be the last one and the query ends. Alternatively, querying each row yields the pointer or address of the next row; if no pointer or address of a next row can be obtained from the columnar in-memory database when a row is queried, that row is the last row and the query ends.
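The third variant, in which the end of a column is detected because no pointer to a next row can be obtained, might look like the sketch below. It assumes, purely for illustration, that a column is a singly linked chain of rows; RowNode, build_column, and read_segments are hypothetical names, and a next pointer of None stands in for "the pointer or address of the next row cannot be obtained".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowNode:
    value: int
    next: Optional["RowNode"] = None  # None: no next-row pointer is available


def build_column(values):
    """Build a linked chain of rows from a list of values (test helper)."""
    head = None
    for v in reversed(values):
        head = RowNode(v, head)
    return head


def read_segments(head, m):
    """Yield (rows, query_end) for each segment of at most m rows."""
    node = head
    while node is not None:
        rows = []
        while node is not None and len(rows) < m:
            rows.append(node.value)
            node = node.next  # follow the pointer to the next row
        # query_end is True once no further row can be reached.
        yield rows, node is None


segments = list(read_segments(build_column([1, 2, 3, 4, 5]), m=2))
print(segments)
```

The flag in the last tuple plays the role of the query end identifier; the other two variants (a known row count, or an explicit last-row marker) would only change how that flag is computed.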

The database query server can send each segmented query result directly to the external device or to the proxy device. If the results are sent to the proxy device, the proxy device can forward each segmented query result directly to the external device, or it can store the segmented query results and, upon receiving the query end identifier, combine all of them and send the combined result to the external device. The database query server can also store each segmented query result in the temporary storage module inside the database query server. Obtaining the query end identifier means that all segmented queries have finished, so all segmented query results are then combined and the combined result is sent to the requester.
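The buffering behaviour described above, whether it happens in a proxy device or in the temporary storage module, can be sketched as a small collector. The class name SegmentCollector and its receive method are hypothetical, and representing each segmented query result as a list of rows is an assumption made for the example.

```python
class SegmentCollector:
    """Buffer segmented query results until the query end identifier arrives."""

    def __init__(self):
        self._segments = []
        self.final_result = None

    def receive(self, segment_result, query_end=False):
        """Store one segment; return the combined result once the query ends."""
        self._segments.append(segment_result)
        if query_end:
            # Combine all buffered segments in arrival order.
            self.final_result = [row for seg in self._segments for row in seg]
            self._segments = []
        return self.final_result


collector = SegmentCollector()
first = collector.receive(["Ann"])
second = collector.receive([])
final = collector.receive(["Dee", "Eve"], query_end=True)
print(final)
```

A segment may legitimately be empty when no row in it satisfies the query condition, so the collector relies on the end flag rather than on the content of a segment to decide when the query is complete.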

Because a column of a columnar in-memory database contains many rows, querying all rows at once causes data to overflow from the Cache into RAM, which introduces query delay. In the embodiment of the present invention, the data is queried segment by segment, so the data being queried and the data generated during the query can be held entirely in the Cache and do not overflow into RAM, which reduces data overflow during the query and reduces the delay of the data query.

An example will be described below to explain the above embodiments of the present invention.

The database query server receives an SQL query statement sent by the requester. For example, the SQL query statement is: SELECT NAME FROM DETAIL_RECORD WHERE FEE > 100 AND AGE > 18. The processing performed by the database query server is shown in FIG. 3.

1. After receiving the SQL statement, the database query server parses it and determines that the statement involves the FEE column and the AGE column of the DETAIL_RECORD table.

2. The database query server stores the data of the first 8192 consecutive rows of the FEE column into the Cache, evaluates the expression FEE > 100 against the 8192 rows of data stored in the Cache, and generates a first bitmap of 8192 rows. In the first bitmap, a row that satisfies FEE > 100 is marked 1 and a row that does not is marked 0. The first bitmap is stored in the Cache. Specifically, the Cache here may be the L2 Cache.

In column data processing, the common data types are BOOLEAN, INT8, INT16, INT32, INT64, DOUBLE, and VARCHAR. The smallest data type occupies 1 bit, common numeric types occupy no more than 8 bytes, and character types are generally within 30 bytes. Even for 8-byte data, 8192 rows amount to only 64 KB, and the operation result, a bitmap, is 8K. The data and the intermediate processing results, together with the program code segment, therefore all fit within the L2 Cache, so the low latency of the L2 Cache can be fully exploited.
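The sizing argument above can be checked with simple arithmetic. The sketch below assumes the 512 KB L2 size from the example in the background section; the text gives the bitmap size only as "8K", so both a bit-packed bitmap (1 bit per row) and a byte-per-row bitmap are computed here as assumptions. All names are hypothetical, and the size of the program code segment is not modelled.

```python
ROWS_PER_SEGMENT = 8192      # m, as used in the example
VALUE_BYTES = 8              # widest common numeric type
L2_BYTES = 512 * 1024        # example L2 size from the background section

column_bytes = ROWS_PER_SEGMENT * VALUE_BYTES   # one column segment
bitmap_bytes_packed = ROWS_PER_SEGMENT // 8     # assumption: 1 bit per row
bitmap_bytes_unpacked = ROWS_PER_SEGMENT        # assumption: 1 byte per row


def working_set(columns, bitmaps, bitmap_bytes):
    """Bytes needed if all column segments and bitmaps were resident at once."""
    return columns * column_bytes + bitmaps * bitmap_bytes


# Pessimistic case for the FEE/AGE example: two column segments and three
# byte-per-row bitmaps held simultaneously.
worst_case = working_set(columns=2, bitmaps=3, bitmap_bytes=bitmap_bytes_unpacked)
print(column_bytes, worst_case, worst_case < L2_BYTES)
```

Even under this pessimistic assumption the working set is 152 KB, well below the 512 KB example L2 size, which is consistent with the claim that a segment of 8192 rows can be processed without leaving the L2 Cache.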

3. Then store the data of the first 8192 consecutive rows of the AGE column of the DETAIL_RECORD table into the Cache, evaluate the expression AGE > 18 against the 8192 rows of data stored in the Cache, and generate a second bitmap of 8192 rows. In the second bitmap, a row that satisfies AGE > 18 is marked 1 and a row that does not is marked 0. The second bitmap is stored in the Cache.

4. Next, according to the expression FEE > 100 AND AGE > 18, perform an AND operation on the first bitmap and the second bitmap to generate a third bitmap of 8192 rows, and store the third bitmap in the Cache.

5. According to the third bitmap and the expression SELECT NAME FROM DETAIL_RECORD, extract the corresponding NAME column data from the DETAIL_RECORD table to obtain the segmented query result; then send the segmented query result to the requester and release the storage space of the Cache, so that the Cache space is free for the next segmented query.

Steps 2 to 5 above are executed in a loop, one segment per iteration, until all rows have been processed. The database query server includes the processing end identifier in the last message returned to the external device.

When steps 2 to 5 above are executed in a loop: if it is the first segmented query, then in steps 2 and 3 the database query server stores the data of the first 8192 consecutive rows of the FEE column and of the AGE column, respectively, into the Cache; if it is the i-th segmented query, then in steps 2 and 3 the database query server stores the data of the i-th run of 8192 consecutive unqueried rows of the respective column into the Cache; and if it is the last segmented query, then in steps 2 and 3 the database query server stores the data of all remaining unqueried rows of the respective column into the Cache.
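Steps 2 to 5 and the loop around them can be sketched as follows. This is an illustrative model only: Python lists stand in for column segments and bitmaps, Python has no control over which CPU cache level holds them, and the function name segmented_query and the sample DETAIL_RECORD data are hypothetical.

```python
def segmented_query(table, m=8192):
    """Yield (names, query_end) per segment for
    SELECT NAME FROM DETAIL_RECORD WHERE FEE > 100 AND AGE > 18."""
    total = len(table["NAME"])
    for start in range(0, total, m):
        end = min(start + m, total)  # the last segment may be shorter than m
        # Step 2: load one FEE segment and build the first bitmap.
        fee = table["FEE"][start:end]
        bitmap1 = [1 if v > 100 else 0 for v in fee]
        # Step 3: load one AGE segment and build the second bitmap.
        age = table["AGE"][start:end]
        bitmap2 = [1 if v > 18 else 0 for v in age]
        # Step 4: AND the two bitmaps into the third bitmap.
        bitmap3 = [a & b for a, b in zip(bitmap1, bitmap2)]
        # Step 5: extract NAME for the rows marked 1 and emit the segment result.
        names = [table["NAME"][start + k] for k, bit in enumerate(bitmap3) if bit]
        yield names, end == total
        # The segment buffers are rebound on the next iteration, which models
        # releasing the Cache space for the next segmented query.


DETAIL_RECORD = {
    "NAME": ["Ann", "Bob", "Cid", "Dee", "Eve"],
    "FEE": [150, 90, 200, 120, 300],
    "AGE": [30, 40, 17, 19, 18],
}
results = list(segmented_query(DETAIL_RECORD, m=2))
print(results)
```

Every intermediate structure here has at most m entries regardless of the table size, which is the property that keeps the working set bounded; the second element of each tuple corresponds to the processing end identifier carried by the last message.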

For example, in this implementation scenario, take a query on a data table of 1 billion records; the amount of data to be evaluated is as follows:

The results of the comparative analysis of delays are as follows: (assuming the city is 300)

Of this, the delay caused by Cache overflow to RAM is 11.38 - 6.63 = 4.75 s, accounting for 41.7% of the query delay.

Because the table data is processed in segments, the Cache stores fixed-length bitmaps instead of the row-number sets or excessively long bitmaps that satisfy the filter conditions in the traditional scheme. All intermediate results of the query are computed and processed within the L2 Cache and do not overflow into RAM, which greatly shortens the processing delay of the system and improves query efficiency.

FIG. 4 is a schematic structural diagram of a database query server 400 according to an embodiment of the present invention. The server includes a processor 402, a memory 406, an input/output interface 408, a network interface 410, and a bus 412. The bus 412 can include a path for communicating information between the components of the database query server. The processor 402 is configured to process information and execute instructions or operations, and may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the program of the present invention. The database query server also includes one or more memories 406 for storing information and instructions, such as the data of the database query server. The memory may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, or disk storage. These memories are coupled to the processor 402 via the bus 412. The processor 402 further includes a cache (Cache) 404. The Cache 404 is configured to obtain one segment of data of the columnar in-memory database from the memory 406. The processor 402 can read that segment directly from the Cache 404 and perform the segmented query on it, and the data produced during processing is also stored in the Cache 404. After each segmented query ends, the content stored in the Cache 404 is cleared and the storage space of the Cache 404 is released.

Input and output interface 408 can include an input device or an output device. The input device is configured to receive data and information input by the user, such as a keyboard, a mouse, a camera, a scanner, a light pen, a voice input device, a touch screen, and the like. The output device is used to allow output or display of information to the user, including display screens, printers, speakers, and the like. The database query server also includes a network interface 410 that uses devices such as any transceiver to communicate with other devices or communication networks, such as Ethernet, Radio Access Network (RAN), Wireless Local Area Network (WLAN), and the like. The processor 402 can also be coupled to the input and output interface 408, the network interface 410 via the bus 412.

For the specific processing procedure of the database query server processor 402 and the cache 404, refer to the method embodiment shown in FIG.

As shown in FIG. 5, an embodiment of the present invention further discloses a database query server, which includes a receiving module 51, a parsing module 52, a determining module 53, and a query module 54, wherein:

The receiving module 51 is configured to receive a query request sent by the requester, where the query request includes a query condition, and send the query request to the parsing module 52;

The parsing module 52 is configured to receive the query request sent by the receiving module 51, parse the query request, obtain the query condition, and send the query condition to the determining module 53 and the query module 54;

The determining module 53 is configured to receive the query condition sent by the parsing module 52, and determine, from the columnar in-memory database, a target column related to the query condition;

The query module 54 is configured to cyclically execute the following steps until the query of all rows is completed: starting from row m*(i-1)+1 of each target column, store the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; query the data stored in the Cache according to the query condition; send the query result; and release the storage space of the Cache, where m is a natural constant and i is a variable indicating the number of times the loop has been executed.
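
The loop executed by the query module can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function name `segmented_query`, the dictionary-of-lists table layout, and the `predicate` callback are all assumptions made for illustration. A Python program cannot place data in a CPU cache, so the local `cache` dictionary only models the segment buffer that the embodiment keeps in the Cache.

```python
from typing import Callable, Dict, Iterator, List, Sequence

Row = Dict[str, object]


def segmented_query(
    columns: Dict[str, Sequence[object]],
    target_columns: List[str],
    predicate: Callable[[Row], bool],
    m: int,
) -> Iterator[List[int]]:
    """Yield, for each segment, the 1-based row numbers that satisfy predicate.

    Hypothetical helper: `columns` maps a column name to its values, and
    `predicate` stands in for the parsed query condition.
    """
    if m <= 0:
        raise ValueError("m must be a positive integer")
    total_rows = len(columns[target_columns[0]]) if target_columns else 0
    i = 1  # loop counter, as in the description
    while m * (i - 1) < total_rows:
        start = m * (i - 1)                # 0-based index of row m*(i-1)+1
        stop = min(start + m, total_rows)  # the last segment may be shorter than m
        # Model of the Cache: hold only the current segment of each target column.
        cache = {name: columns[name][start:stop] for name in target_columns}
        matches = []
        for offset in range(stop - start):
            row = {name: cache[name][offset] for name in target_columns}
            if predicate(row):
                matches.append(start + offset + 1)
        yield matches   # "send" the query result of this segment
        cache.clear()   # release the segment buffer before the next iteration
        i += 1
```

Because the function is a generator, each segment result is handed to the caller before the next segment is loaded, which mirrors sending the query result once per loop iteration rather than once per query.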

Further, as shown in FIG. 6, the database query server further includes a generating module 55, configured to generate a query end identifier when, while data of a row of the target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained in the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained. Further, the query module 54 is specifically configured to: when querying the last m rows or fewer than m rows of data, send a query result that includes the query end identifier, to notify the requester that the current query request has been completed.
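
The role of the query end identifier can be sketched as follows. This is an illustrative sketch under stated assumptions, not the embodiment's code: `SegmentResult`, `query_with_end_flag`, and the single-column layout are hypothetical, and the test `stop >= total` merely stands in for "the last-row identifier is obtained, or no address or pointer to a next row is available".

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Sequence


@dataclass
class SegmentResult:
    """One segment's query result; `query_end` models the query end identifier."""
    row_numbers: List[int]
    query_end: bool


def query_with_end_flag(
    column: Sequence[int],
    predicate: Callable[[int], bool],
    m: int,
) -> Iterator[SegmentResult]:
    """Query one column m rows at a time and flag the final segment."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    total = len(column)
    start = 0
    while True:
        stop = min(start + m, total)
        segment = column[start:stop]  # the rows loaded for this iteration
        matches = [start + k + 1 for k, value in enumerate(segment) if predicate(value)]
        # Assumed stand-in for detecting the last row of the target column.
        reached_last_row = stop >= total
        yield SegmentResult(matches, reached_last_row)
        if reached_last_row:
            return
        start = stop
```

In this sketch an empty column still produces one result with `query_end` set, so a requester always receives a completion notice.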

As shown in FIG. 7, another embodiment of the present invention further discloses a database query server, including a receiving module 61, a parsing module 62, a determining module 63, a query module 64, and a temporary storage module 65, wherein:

The receiving module 61 is configured to receive a query request sent by the requester, where the query request includes a query condition, and send the query request to the parsing module 62;

The parsing module 62 is configured to receive the query request sent by the receiving module, parse the query request, obtain the query condition, and send the query condition to the determining module 63 and the query module 64;

The determining module 63 is configured to receive the query condition sent by the parsing module 62, and determine, from the columnar in-memory database, a target column related to the query condition;

The query module 64 is configured to cyclically execute the following steps until the query of all rows is completed: starting from row m*(i-1)+1 of each target column, store the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; query the data stored in the Cache according to the query condition; store the query result in the temporary storage module 65; and release the storage space of the Cache, where m is a natural constant and i is a variable indicating the number of times the loop has been executed;

The temporary storage module 65 is configured to store the query results during execution of the loop and, when all the data of the target columns related to the query condition has been queried, combine the query results in the temporary storage module, take the combined query result as the final query result, and send the final query result.

Further, the database query server further includes a generating module 66, configured to generate a query end identifier when, while data of a row of the target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained in the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained, so as to notify the temporary storage module to combine the query results in the temporary storage module.
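
The variant with a temporary storage module can be sketched in the same hedged way. The class `TemporaryStore` and the function `query_and_combine` are hypothetical names, and the single-column layout is an assumption; the sketch only shows segment results being buffered and then combined into one final result once all rows have been queried.

```python
from typing import Callable, List, Sequence


class TemporaryStore:
    """Hypothetical stand-in for the temporary storage module."""

    def __init__(self) -> None:
        self._segments: List[List[int]] = []

    def store(self, segment_result: List[int]) -> None:
        # Keep each segment's result until the whole query has finished.
        self._segments.append(list(segment_result))

    def combine(self) -> List[int]:
        # Concatenate the segment results in loop order and empty the store.
        combined = [row for segment in self._segments for row in segment]
        self._segments.clear()
        return combined


def query_and_combine(
    column: Sequence[int],
    predicate: Callable[[int], bool],
    m: int,
) -> List[int]:
    """Run the segmented loop, buffer each result, and return the final result."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    store = TemporaryStore()
    total = len(column)
    i = 1
    while m * (i - 1) < total:
        start = m * (i - 1)
        segment = column[start:start + m]  # m rows, or fewer for the last segment
        store.store([start + k + 1 for k, value in enumerate(segment) if predicate(value)])
        i += 1
    # All rows have been queried: combine and return the final query result.
    return store.combine()
```

Compared with sending each segment result immediately, this variant trades a single final response for extra storage outside the segment buffer, which is why the combined result is kept in a separate store in the sketch.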

Because the database query server of the present invention performs a segmented query on the table data, the Cache can store a fixed-length bitmap instead of the set of row numbers satisfying the filter condition, or the excessively long bitmap, used in the traditional scheme. All intermediate results of the query process are computed and processed in the Cache without overflowing into RAM, so the processing delay of the system is greatly shortened and query efficiency is improved.
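
The fixed-length bitmap mentioned above can be illustrated with a short sketch. The encoding below (bit k of an integer set when the k-th row of the segment matches) is one possible representation chosen for illustration, not the one prescribed by the embodiment, and the function names are hypothetical. The point it shows is that the bitmap for a segment never needs more than m bits, whatever the size of the table.

```python
from typing import Callable, List, Sequence


def segment_bitmap(segment: Sequence[object], predicate: Callable[[object], bool]) -> int:
    """Return an integer whose bit k is set when row k of the segment matches."""
    bitmap = 0
    for offset, value in enumerate(segment):
        if predicate(value):
            bitmap |= 1 << offset
    return bitmap


def bitmap_to_rows(bitmap: int, first_row: int) -> List[int]:
    """Translate set bits back to table row numbers, given the segment's first row."""
    rows = []
    offset = 0
    while bitmap:
        if bitmap & 1:
            rows.append(first_row + offset)
        bitmap >>= 1
        offset += 1
    return rows


def bitmap_bytes(m: int) -> int:
    """Bytes needed for an m-bit bitmap, rounded up to whole bytes."""
    return (m + 7) // 8


# One segment of two target columns (illustrative data only).
ages = [25, 41, 38, 19]
cities = ["A", "B", "B", "B"]

# A conjunctive condition is evaluated as a bitwise AND of per-column bitmaps.
both = segment_bitmap(ages, lambda a: a > 30) & segment_bitmap(cities, lambda c: c == "B")
matching_rows = bitmap_to_rows(both, first_row=101)
```

With m = 10000, `bitmap_bytes(m)` is 1250 bytes per condition, which is small next to the 512K L2 capacity cited in the background section; the column data of the segment itself is what dominates the Cache footprint.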

One of ordinary skill in the art will appreciate that all or part of the steps of the method embodiments described above may be accomplished by hardware under the control of program instructions. The program can be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Finally, it should be noted that the above embodiments are merely illustrative of the technical solutions of the present invention and are not intended to be limiting. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

A method for querying a columnar in-memory database, the method comprising:

Receiving a query request;

Parsing the query request to obtain a query condition;

Determining, from the columnar in-memory database, a target column related to the query condition;

Looping through the following steps until the query of all rows is completed: starting from row m*(i-1)+1 of each target column, storing the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; querying the data stored in the Cache according to the query condition; sending the segment query result; and releasing the storage space of the Cache, where m is a natural constant and i is a variable indicating the number of times the loop has been executed.
The method according to claim 1, wherein the method further comprises: when the last m rows or fewer than m rows of data are queried, the segment query result further includes a query end identifier, to notify the requester that the current query request has been completed.
The method according to claim 2, wherein the method further comprises: the requester storing each segment query result after receiving it and, after receiving the query end identifier, combining the segment query results and taking the combined query result as the final query result.
The method of claim 2, wherein the method further comprises:

Generating a query end identifier when, while data of a row of the target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained in the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained.
The method according to claim 1 or 2, wherein, when the Cache is at the L2 level, m is in the range of 1000 to 10000.
A database query server, comprising: a receiving module, a parsing module, a determining module and a query module, wherein:

The receiving module is configured to receive a query request sent by a requester, where the query request includes a query condition, and send the query request to the parsing module;

The parsing module is configured to receive a query request sent by the receiving module, parse the query request, obtain a query condition, and send the query condition to the determining module and the query module;

The determining module is configured to receive the query condition sent by the parsing module, and determine, from a columnar in-memory database, a target column related to the query condition; and

The query module is configured to cyclically execute the following steps until the query of all rows is completed: starting from row m*(i-1)+1 of each target column, store the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; query the data stored in the Cache according to the query condition; send the segment query result; and release the storage space of the Cache, where m is a natural constant and i is a variable indicating the number of times the loop has been executed.
The database query server according to claim 6, wherein the query module is configured to: when querying the last m rows or fewer than m rows of data, send a segment query result that includes a query end identifier, to notify the requester that the current query request has been completed.
The database query server according to claim 6, further comprising: a generating module, configured to generate a query end identifier when, while data of a row of the target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained in the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained.
A method for querying a columnar in-memory database, the method comprising:

Receiving a query request;

Parsing the query request to obtain a query condition;

Determining, from the columnar in-memory database, a target column related to the query condition;

Looping through the following steps until the query of all rows is completed: starting from row m*(i-1)+1 of each target column, storing the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; querying the data stored in the Cache according to the query condition; storing the segment query result in a temporary storage module; and releasing the storage space of the Cache, where m is a natural constant and i is a variable indicating the number of times the loop has been executed;

After all the data of the target columns related to the query condition has been queried, combining the segment query results in the temporary storage module, taking the combined query result as the final query result, and sending the final query result.
The method of claim 9 wherein the method further comprises:

Generating a query end identifier when, while data of a row of the target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained in the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained, so as to notify the temporary storage module to combine the segment query results in the temporary storage module.
A database query server, comprising: a receiving module, a parsing module, a determining module, a query module and a temporary storage module, wherein:

The receiving module is configured to receive a query request sent by a requester, where the query request includes a query condition, and send the query request to the parsing module;

The parsing module is configured to receive a query request sent by the receiving module, parse the query request, obtain a query condition, and send the query condition to the determining module and the query module;

The determining module is configured to receive the query condition sent by the parsing module, and determine, from a columnar in-memory database, a target column related to the query condition;

The query module is configured to cyclically execute the following steps until the query of all rows is completed: starting from row m*(i-1)+1 of each target column, store the data of m consecutive rows, or of the last fewer than m rows, of each target column in the columnar in-memory database into the Cache; query the data stored in the Cache according to the query condition; store the segment query result in the temporary storage module; and release the storage space of the Cache, where m is a natural constant and i is a variable indicating the number of times the loop has been executed; and

The temporary storage module is configured to store the segment query results during execution of the loop and, after all the data of the target columns related to the query condition has been queried, combine the segment query results in the temporary storage module, take the combined query result as the final query result, and send the final query result.
The database query server according to claim 11, further comprising: a generating module, configured to generate a query end identifier when, while data of a row of the target column in the columnar in-memory database is being stored into the Cache, the last-row identifier of the target column is obtained in the columnar in-memory database, or the address or pointer of the next row of the target column cannot be obtained, so as to notify the temporary storage module to combine the segment query results in the temporary storage module.