CN109299101B - Data retrieval method, device, server and storage medium - Google Patents

Info

Publication number
CN109299101B
Authority
CN
China
Prior art keywords
partition
data retrieval
data
retrieved
condition
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811196264.3A
Other languages
Chinese (zh)
Other versions
CN109299101A (en)
Inventor
朱仲颖
万伟
韩朱忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Dameng Database Co Ltd
Original Assignee
Shanghai Dameng Database Co Ltd

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data retrieval method, a data retrieval device, a server and a storage medium. The method comprises: acquiring a data retrieval condition; determining a partition to be retrieved, wherein the partition to be retrieved is a partition whose partition characteristic intersects the data retrieval condition; and determining a data retrieval result, wherein the data retrieval result is the data set in the partition to be retrieved that satisfies the data retrieval condition. Because the partition to be retrieved is first determined from the data retrieval condition and the partition characteristics, and the data retrieval result is then determined within that partition from the data retrieval condition and the partition characteristics, the prior-art need to match every piece of data in the partition to be retrieved against the retrieval condition is avoided, which reduces the time cost of data retrieval and improves query efficiency.

Description

Data retrieval method, device, server and storage medium
Technical Field
The embodiment of the invention relates to big data technology, in particular to a data retrieval method, a data retrieval device, a server and a storage medium.
Background
As database information systems are applied in more fields and remain in service longer, data volume rises sharply. To improve the efficiency of querying and operating on large tables, partition tables are widely used in database information systems, and mainstream database management systems support them.
A partition table divides the large amount of data in one large table into different partitions, according to rules specified by the user, for storage and management, which greatly improves the efficiency of operating on and querying the table. Even so, different ways of searching a partition table still lead to large differences in query efficiency.
Generally, the search method used for a partition table depends on the search condition. Table 1 below shows the correspondence between search conditions and search methods.
TABLE 1
[Table 1 appears only as images in the source: Figure GDA0002719189760000011 and Figure GDA0002719189760000021]
As can be seen from Table 1, only the full-partition search must scan all partitions; the other methods reduce the number of partitions searched according to the range of the search condition, which improves query efficiency.
However, although the range search method reduces the number of partitions scanned, every piece of data in each partition to be searched must still be compared against the search condition. When a partition holds a large amount of data, the search takes a long time, which significantly degrades query performance.
Disclosure of Invention
The invention provides a data retrieval method, a data retrieval device, a server and a storage medium, which solve the problem that each piece of data of a partition to be retrieved needs to be matched with a retrieval condition in the prior art.
In a first aspect, an embodiment of the present invention provides a data retrieval method, including:
acquiring a data retrieval condition;
determining a partition to be retrieved, wherein the partition to be retrieved is a partition with intersection of partition characteristics and the data retrieval conditions;
and determining a data retrieval result, wherein the data retrieval result is a data set which accords with the data retrieval condition in the partition to be retrieved.
In a second aspect, an embodiment of the present invention further provides a data retrieval apparatus, including:
the acquisition module is used for acquiring data retrieval conditions;
the first determining module is used for determining a partition to be retrieved, wherein the partition to be retrieved is a partition with intersection of partition characteristics and the data retrieval conditions;
and the second determining module is used for determining a data retrieval result, wherein the data retrieval result is a data set which accords with the data retrieval condition in the partition to be retrieved.
In a third aspect, an embodiment of the present invention further provides a server, where the server includes:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data retrieval method described in the first aspect.
In a fourth aspect, embodiments of the present invention also provide a storage medium containing computer-executable instructions for performing the data retrieval method as described in the first aspect when executed by a computer processor.
The data retrieval method, device, server and storage medium provided by the embodiments acquire a data retrieval condition; determine a partition to be retrieved, wherein the partition to be retrieved is a partition whose partition characteristic intersects the data retrieval condition; and determine a data retrieval result, wherein the data retrieval result is the data set in the partition to be retrieved that satisfies the data retrieval condition. The partition to be retrieved is first determined from the data retrieval condition and the partition characteristics, and the data retrieval result is then determined within that partition from the data retrieval condition. This avoids the prior-art need to match every piece of data in the partition to be retrieved against the retrieval condition, reduces the time cost of data retrieval, and improves query efficiency.
Drawings
Fig. 1 is a flowchart of a data retrieval method according to a first embodiment of the present invention;
Fig. 2 is a flowchart of a data retrieval method according to a second embodiment of the present invention;
Fig. 3 is a flowchart of a data retrieval method according to a third embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a data retrieval device according to a fourth embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a server according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Embodiment One
Fig. 1 is a flowchart of a data retrieval method according to a first embodiment of the present invention. This embodiment is applicable to retrieving data from a partition table, and the method can be executed by a data retrieval device.
In this embodiment, a database is an organized, sharable collection of data stored long-term in a computer. Tables are the basic units of data storage in a database and are the logical entities on which data is manipulated. A table consists of columns and rows; each row represents a separate record, and the table contains a fixed set of columns, also referred to as fields. The table partitioning commonly used at present is horizontal partitioning, in which the data rows of a table are stored in different partitions according to specified rules. There are three partition types: range partitioning, list partitioning and hash partitioning. For ease of understanding, this embodiment briefly describes each of them.
Range partitioning partitions a table by ranges of values of one or more columns, and the partition in which a row is stored is determined by the range into which its column values fall. Examples include partitioning by sequence number or by the creation date of a service record.
Illustratively:
create table t1(c1 int, c2 int) partition by range (c1)
(partition p1 values less than (10),
partition p2 values less than (20));
This creates a range partition table t1 with partition column c1 and two partitions, p1 and p2. The range of p1 is c1 < 10: all data with c1 < 10 is stored in the p1 partition, and c1 < 10 is the partition characteristic of p1. The range of p2 is 10 ≤ c1 < 20: all data with 10 ≤ c1 < 20 is stored in the p2 partition, and 10 ≤ c1 < 20 is the partition characteristic of p2. Note that the data ranges of the partition characteristics of adjacent partitions are contiguous.
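The routing rule for the t1 example can be sketched in a few lines of Python. This is only an illustration of how a value maps to a range partition, not the database's actual implementation; the names `UPPER_BOUNDS`, `PARTITION_NAMES` and `range_partition_for` are hypothetical.

```python
from bisect import bisect_right

# Exclusive upper bounds taken from the t1 example:
# p1 holds c1 < 10, p2 holds 10 <= c1 < 20.
UPPER_BOUNDS = [10, 20]
PARTITION_NAMES = ["p1", "p2"]


def range_partition_for(value):
    """Return the name of the range partition that stores `value`, or None."""
    # bisect_right counts the bounds that are <= value, which is the index of
    # the first partition whose exclusive upper bound is greater than value.
    index = bisect_right(UPPER_BOUNDS, value)
    if index == len(UPPER_BOUNDS):
        return None  # value lies beyond the last partition's bound
    return PARTITION_NAMES[index]
```

Because the bounds of adjacent partitions are contiguous, a single sorted list of upper bounds is enough to describe every partition characteristic.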
Hash partitioning distributes rows across partitions according to the hash value of the partition column, so that the partitions hold as nearly equal amounts of data as possible.
Illustratively:
create table t2(c3 int, c4 int) partition by hash (c3)
(partition p3, partition p4);
This creates a hash partition table t2 with partition column c3 and two partitions, p3 and p4. A hash function inside the database computes a hash value from the data, and the hash value modulo the number of partitions gives the partition to which the data belongs.
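The hash-then-modulo rule can be illustrated as follows. The text does not say which hash function the database uses, so CRC32 from Python's standard library stands in for it here purely as an assumption; `hash_partition_for` is a hypothetical name.

```python
import zlib

PARTITIONS = ["p3", "p4"]


def hash_partition_for(value):
    """Pick a partition by hashing the partition column value."""
    # Assumption: CRC32 stands in for the database's internal hash function.
    hash_value = zlib.crc32(str(value).encode("utf-8"))
    # The hash value modulo the number of partitions selects the partition.
    return PARTITIONS[hash_value % len(PARTITIONS)]
```

The same value always lands in the same partition, which is what lets an equality condition on the partition column be routed to a single partition.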
List partitioning partitions a table by explicitly specified values of a column, rather than by ranges of column values as in range partitioning. Examples include partitioning by location or by color.
Illustratively:
create table t3(c5 varchar(10), c6 int) partition by list (c5)
(partition p5 values ('Shanghai', 'Beijing'),
partition p6 values ('Guangzhou', 'Shenzhen'));
This creates a list partition table t3 with partition column c5 and two partitions, p5 and p6. The c5 value of every row in the p5 partition is 'Shanghai' or 'Beijing', and the c5 value of every row in the p6 partition is 'Guangzhou' or 'Shenzhen'. Accordingly, c5 being 'Shanghai' or 'Beijing' is the partition characteristic of p5, and c5 being 'Guangzhou' or 'Shenzhen' is the partition characteristic of p6.
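For the t3 example, routing a row amounts to finding the partition whose value list contains the row's c5 value. The sketch below assumes the value lists are held in a plain dictionary; `LIST_PARTITIONS` and `list_partition_for` are hypothetical names.

```python
# Value lists taken from the t3 example.
LIST_PARTITIONS = {
    "p5": {"Shanghai", "Beijing"},
    "p6": {"Guangzhou", "Shenzhen"},
}


def list_partition_for(value):
    """Return the list partition whose value list contains `value`, or None."""
    for name, values in LIST_PARTITIONS.items():
        if value in values:
            return name
    return None  # no partition is declared for this value
```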
Specifically, as shown in Fig. 1, the data retrieval method provided by this embodiment mainly includes the following steps:
S110, acquiring a data retrieval condition.
In this embodiment, the data retrieval condition describes the data that the user needs to obtain. The data retrieval condition may be a single numerical value, one data range, several data ranges, a single discrete value, several discrete values, and so on. Illustratively, the data retrieval condition may be x < 20 or x = 20, where x may be, for example, the traffic of Guangzhou. The specific data retrieval condition may be set according to the actual partition table; the conditions given here are examples only and are not limiting.
Further, the manner of acquiring the data retrieval condition may be set according to the actual situation. For example, when a data retrieval statement sent by an external device is received, the statement is parsed into a data retrieval condition that the database can recognize. External devices include, but are not limited to, computers and smartphones. As another example, the data retrieval condition may be generated and acquired automatically after the database executes a specific function.
S120, determining a partition to be retrieved, wherein the partition to be retrieved is a partition whose partition characteristic intersects the data retrieval condition.
In this embodiment, a partition characteristic is the condition or range by which the table is partitioned. When a table is partitioned, the data that conforms to a given partition characteristic is stored in one partition. The partition to be retrieved is a partition whose partition characteristic intersects the data retrieval condition.
Further, the partition to be retrieved is determined according to the data retrieval condition and the partition characteristics. Specifically, the data retrieval condition is divided into a partition data retrieval condition and a non-partition data retrieval condition. A partition data retrieval condition is a retrieval condition on a partition column of the table, and a non-partition data retrieval condition is a retrieval condition on a column that is not a partition column. Illustratively, if the data retrieval condition is c1 > 12 and c1 < 35 and c2 > 10 and the partition column of the table is c1, then c1 > 12 and c1 < 35 is the partition data retrieval condition and c2 > 10 is the non-partition data retrieval condition.
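The split into partition and non-partition conditions can be sketched as below. The representation of a condition as a list of `(column, operator, value)` tuples joined by AND is an assumption made for illustration; the patent does not specify how parsed conditions are stored, and `split_conditions` is a hypothetical name.

```python
def split_conditions(predicates, partition_column):
    """Separate predicates on the partition column from all other predicates."""
    partition_conds = [p for p in predicates if p[0] == partition_column]
    non_partition_conds = [p for p in predicates if p[0] != partition_column]
    return partition_conds, non_partition_conds


# The example from the text: c1 > 12 and c1 < 35 and c2 > 10, partition column c1.
predicates = [("c1", ">", 12), ("c1", "<", 35), ("c2", ">", 10)]
partition_conds, non_partition_conds = split_conditions(predicates, "c1")
```

Only `partition_conds` is compared with the partition characteristics when choosing the partitions to be retrieved; `non_partition_conds` still has to be checked against individual rows.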
S130, determining a data retrieval result, wherein the data retrieval result is a data set which accords with the data retrieval condition in the partition to be retrieved.
In this embodiment, the data retrieval result is the data set in the partition to be retrieved that satisfies the data retrieval condition. Further, when the data retrieval condition contains a non-partition data retrieval condition, the data retrieval result is the data set that satisfies both the partition data retrieval condition and the non-partition data retrieval condition. The data retrieval result may be a single piece of data or a set of several pieces of data.
In this embodiment, the form of the partition feature includes a numerical range or a feature list. The partition mode in which the partition characteristic is a numerical range is called range partition, and the partition mode in which the partition characteristic is a characteristic list is called list partition. In the present embodiment, the form of the partition feature is mainly described as two cases, namely, a numerical range and a feature list.
This embodiment provides one way of determining the data retrieval result. Any partition to be retrieved is taken as the current partition to be retrieved, and the form of its partition characteristic is obtained. The partition characteristics and their forms are stored in the database.
It is then judged whether the form of the partition characteristic of the current partition to be retrieved is a numerical range or a feature list. If it is a numerical range, it is judged from the partition characteristic whether the current partition to be retrieved is the starting partition or the ending partition of the partitions to be retrieved. If the current partition to be retrieved is neither the starting partition nor the ending partition, all data in it is determined to be in the data retrieval result.
Note that the starting partition is the first of all the current partitions to be retrieved, and the ending partition is the last of them.
Further, if the form of the partition characteristic of the current partition to be retrieved is a feature list, it is judged whether the data retrieval condition includes all partition characteristics of the current partition to be retrieved. If it does, all data in the current partition to be retrieved is determined to be in the data retrieval result. If it does not, all data in the current partition to be retrieved is matched against the data retrieval condition in turn to determine the data set in that partition that satisfies the data retrieval condition.
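The per-partition decision described above can be summarized as a small function. This is a sketch under stated assumptions: the passage does not say what happens to a starting or ending range partition, so the code conservatively falls back to row-by-row matching for them (the second embodiment refines this case). The name `decide` and its return labels are hypothetical.

```python
def decide(form, is_start=False, is_end=False,
           condition_includes_all_features=False):
    """Return 'take_all' if the whole partition belongs to the result,
    or 'match_rows' if each row must be matched against the condition."""
    if form == "range":
        # A partition strictly between the starting and ending partitions
        # lies entirely inside the retrieved range.
        if not is_start and not is_end:
            return "take_all"
        # Assumption: boundary partitions are matched row by row here.
        return "match_rows"
    if form == "list":
        if condition_includes_all_features:
            return "take_all"
        return "match_rows"
    raise ValueError(f"unsupported partition characteristic form: {form}")
```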
The data retrieval method provided by this embodiment comprises: acquiring a data retrieval condition; determining a partition to be retrieved, wherein the partition to be retrieved is a partition whose partition characteristic intersects the data retrieval condition; and determining a data retrieval result, wherein the data retrieval result is the data set in the partition to be retrieved that satisfies the data retrieval condition. In this embodiment, after the partition to be retrieved is determined from the data retrieval condition and the partition characteristics, the data retrieval result is determined within that partition from the data retrieval condition. This avoids the prior-art need to match every piece of data in the partition to be retrieved against the retrieval condition, reduces the time cost of data retrieval, and improves query efficiency.
Embodiment Two
Fig. 2 is a flowchart of a data retrieval method according to a second embodiment of the present invention. This embodiment further optimizes the data retrieval method on the basis of the foregoing embodiment. As shown in Fig. 2, the optimized data retrieval method includes the following steps:
S210, acquiring a data retrieval condition.
S220, determining a partition to be retrieved, wherein the partition to be retrieved is a partition whose partition characteristic intersects the data retrieval condition.
S230, if a partition characteristic is a subset of the data retrieval condition, all data in the partition to be retrieved that corresponds to that partition characteristic is in the data retrieval result.
Further, a partition characteristic being a subset of the data retrieval condition includes the following case: when the partition characteristic and the data retrieval condition are both in the form of numerical ranges, a partition characteristic whose numerical range is covered by the numerical range of the data retrieval condition is a subset of the data retrieval condition.
In this embodiment, the partitions to be retrieved that are determined according to the data retrieval condition are partitions whose partition characteristics are contiguous.
Illustratively, create table t4(x1 int, x2 int) partition by range (x1)
(partition r1 values less than(10),
partition r2 values less than(20),
partition r3 values less than(30),
partition r4 values less than(40));
This creates a partition table t4 with four partitions: r1, r2, r3 and r4. The partition characteristic of r1 is x1 < 10, that of r2 is 10 ≤ x1 < 20, that of r3 is 20 ≤ x1 < 30, and that of r4 is 30 ≤ x1 < 40.
Illustratively, if the data retrieval condition is 15 ≤ x1 < 35, the partitions to be retrieved are r2, r3 and r4, where r2 is the starting partition, r4 is the ending partition, and r3 is neither. All data in r3 is therefore taken directly as part of the data retrieval result, and it is no longer necessary to match each piece of data in r3 against the data retrieval condition, which reduces the retrieval workload and saves time. Because r2 is the starting partition, r4 is the ending partition, and the partition characteristics of r2 and r4 are not fully covered by the data retrieval condition, each piece of data in r2 and r4 must be matched against the data retrieval condition in turn, and the successfully matched data is added to the data retrieval result.
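The selection of r2, r3 and r4 for the condition 15 ≤ x1 < 35 can be reproduced with a short sketch. It assumes each partition characteristic of t4 is held as a half-open interval `[low, high)` and that the condition is also half-open; the names `PARTITIONS`, `partitions_to_retrieve` and `label_roles` are hypothetical.

```python
import math

# Partition characteristics of t4 as half-open intervals [low, high) on x1.
PARTITIONS = [
    ("r1", -math.inf, 10),
    ("r2", 10, 20),
    ("r3", 20, 30),
    ("r4", 30, 40),
]


def partitions_to_retrieve(low, high):
    """Names of partitions whose interval intersects the query [low, high)."""
    return [name for name, p_low, p_high in PARTITIONS
            if p_low < high and low < p_high]


def label_roles(names):
    """Label the first partition 'start', the last 'end', the rest 'middle'."""
    roles = {}
    last = len(names) - 1
    for i, name in enumerate(names):
        if i == 0:
            roles[name] = "start"  # a lone partition is reported as "start"
        elif i == last:
            roles[name] = "end"
        else:
            roles[name] = "middle"
    return roles


selected = partitions_to_retrieve(15, 35)
roles = label_roles(selected)
```

Partitions labeled "middle" can be added to the result wholesale; only "start" and "end" need a closer look.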
Further, if the data retrieval condition is 10 ≤ x1 < 35, the partitions to be retrieved are r2, r3 and r4, where r2 is the starting partition and r4 is the ending partition. The partition characteristic 10 ≤ x1 < 20 of r2 includes the starting value x1 = 10 of the data retrieval condition, so it is covered by the numerical range 10 ≤ x1 < 35 of the data retrieval condition. All data in r2 is therefore taken directly as part of the data retrieval result, and it is no longer necessary to match each piece of data in r2 against the data retrieval condition, which reduces the retrieval workload and saves time. r3 is neither the starting partition nor the ending partition, so all data in r3 is likewise taken directly as part of the data retrieval result. Because r4 is the ending partition and its partition characteristic is not fully covered by the data retrieval condition, each piece of data in r4 must be matched against the data retrieval condition in turn, and the successfully matched data is added to the data retrieval result.
Further, if the data retrieval condition is 15 ≤ x1 < 40, the partitions to be retrieved are r2, r3 and r4, where r2 is the starting partition, r4 is the ending partition, and r3 is neither, so all data in r3 is taken directly as part of the data retrieval result without matching each piece of data against the data retrieval condition. The partition characteristic 30 ≤ x1 < 40 of r4 excludes the end value x1 = 40, and the data retrieval condition 15 ≤ x1 < 40 also excludes x1 = 40, so the partition characteristic of r4 is covered by the numerical range of the data retrieval condition. All data in r4 is therefore taken directly as part of the data retrieval result, which reduces the retrieval workload and saves time. Because r2 is the starting partition and its partition characteristic is not fully covered by the data retrieval condition, each piece of data in r2 must be matched against the data retrieval condition in turn, and the successfully matched data is added to the data retrieval result.
Further, if the data retrieval condition is x1 < 25, the partitions to be retrieved are r1, r2 and r3, where r3 is the ending partition. The data retrieval condition shows that this is a less-than search, so the partition characteristics of all partitions to be retrieved other than the ending partition are necessarily covered by the numerical range x1 < 25. All data in r1 and r2 is therefore taken directly as part of the data retrieval result, and it is no longer necessary to match each piece of data in r1 and r2 against the data retrieval condition, which reduces the retrieval workload and saves time. Because r3 is the ending partition and its partition characteristic is not fully covered by the data retrieval condition, each piece of data in r3 must be matched against the data retrieval condition in turn, and the successfully matched data is added to the data retrieval result.
Further, if the data retrieval condition is x1 < 30, the partitions to be retrieved are r1, r2 and r3, where r3 is the ending partition. The data retrieval condition shows that this is a less-than search, so the partition characteristics of all partitions to be retrieved other than the ending partition are necessarily covered by the numerical range x1 < 30, and all data in r1 and r2 is taken directly as part of the data retrieval result. Further, the partition characteristic 20 ≤ x1 < 30 of the ending partition r3 is fully covered by the data retrieval condition x1 < 30, so all data in r3 is also taken directly as part of the data retrieval result.
Further, if the data retrieval condition is x1 ≥ 10, the partitions to be retrieved are r2, r3 and r4, where r2 is the starting partition. The data retrieval condition shows that this is a greater-than search, so the partition characteristics of all partitions to be retrieved other than the starting partition are necessarily covered by the numerical range x1 ≥ 10, and all data in r3 and r4 is taken directly as part of the data retrieval result. Furthermore, the partition characteristic 10 ≤ x1 < 20 of the starting partition r2 is fully covered by the data retrieval condition x1 ≥ 10, so all data in r2 is also taken directly as part of the data retrieval result.
Further, if the data retrieval condition is x1 > 10, the partitions to be retrieved are r2, r3 and r4, where r2 is the starting partition. The data retrieval condition shows that this is a greater-than search, so the partition characteristics of all partitions to be retrieved other than the starting partition are necessarily covered by the numerical range x1 > 10, and all data in r3 and r4 is taken directly as part of the data retrieval result. Because the partition characteristic of the starting partition r2 includes x1 = 10, which the data retrieval condition excludes, each piece of data in r2 must be matched against the data retrieval condition in turn, and the successfully matched data is added to the data retrieval result.
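The boundary cases worked through above (10 ≤ x1 versus x1 > 10, and the shared exclusive end value 40) all reduce to an interval-containment test that respects open and closed endpoints. The following is a minimal sketch of such a test, not the patent's implementation; the `Interval` type and the `covers` function are hypothetical.

```python
import math
from typing import NamedTuple


class Interval(NamedTuple):
    low: float
    low_closed: bool   # True if the lower bound itself is included
    high: float
    high_closed: bool  # True if the upper bound itself is included


def covers(cond, part):
    """True if every value in `part` also lies in `cond`."""
    low_ok = cond.low < part.low or (
        cond.low == part.low and (cond.low_closed or not part.low_closed))
    high_ok = cond.high > part.high or (
        cond.high == part.high and (cond.high_closed or not part.high_closed))
    return low_ok and high_ok


# Partition characteristics of r2, r3 and r4 from the t4 example.
R2 = Interval(10, True, 20, False)
R3 = Interval(20, True, 30, False)
R4 = Interval(30, True, 40, False)
```

With this test, `covers(Interval(10, False, math.inf, False), R2)` is false because x1 > 10 leaves out the value 10 that r2 may contain, while the closed variant x1 ≥ 10 covers r2 entirely.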
Further, if the data retrieval condition is 15 ≤ x1 < 35 and y < 12, then 15 ≤ x1 < 35 is the partition data retrieval condition and y < 12 is the non-partition data retrieval condition. The partitions to be retrieved are r2, r3 and r4. Because of the non-partition data retrieval condition y < 12, the data obtained under the partition data retrieval condition must additionally be matched against the non-partition data retrieval condition to determine the data retrieval result.
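The interaction between the two kinds of condition can be sketched as follows: rows from a fully covered partition skip the partition check but are still filtered by the non-partition condition. The row data, the `fully_covered` map and the `retrieve` function are all made up for illustration; the column name y follows the text.

```python
def retrieve(partition_rows, fully_covered, partition_pred, non_partition_pred):
    """Collect rows satisfying both the partition and non-partition conditions."""
    result = []
    for name, rows in partition_rows.items():
        for row in rows:
            # Rows of a fully covered partition already satisfy the partition
            # condition, so that check is skipped for them.
            if not fully_covered[name] and not partition_pred(row):
                continue
            # The non-partition condition must always be checked per row.
            if non_partition_pred(row):
                result.append(row)
    return result


# Hypothetical sample rows for the partitions to be retrieved.
partition_rows = {
    "r2": [{"x1": 12, "y": 5}, {"x1": 17, "y": 5}, {"x1": 18, "y": 20}],
    "r3": [{"x1": 25, "y": 1}, {"x1": 26, "y": 30}],
    "r4": [{"x1": 33, "y": 2}, {"x1": 37, "y": 3}],
}
# For 15 <= x1 < 35 only the middle partition r3 is fully covered.
fully_covered = {"r2": False, "r3": True, "r4": False}

rows = retrieve(
    partition_rows,
    fully_covered,
    partition_pred=lambda r: 15 <= r["x1"] < 35,
    non_partition_pred=lambda r: r["y"] < 12,
)
```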
Further, a partition characteristic being a subset of the data retrieval condition also includes the following case: when the partition characteristic and the data retrieval condition are both in the form of feature lists, a partition characteristic all of whose features appear in the data retrieval condition is a subset of the data retrieval condition.
In this embodiment, the partition characteristic and the data retrieval condition being in the form of feature lists means that both are specific values of a certain column.
Illustratively, create table t5(x3 varchar (10), x4 int) partition by list (x3)
(partition r5 values ('Shanghai', 'Beijing'),
partition r6 values ('Guangzhou', 'Shenzhen'));
This creates a partition table t5 with two partitions, r5 and r6. The partition characteristic of r5 is 'Shanghai' and 'Beijing', and the partition characteristic of r6 is 'Guangzhou' and 'Shenzhen'.
If the data retrieval condition is 'Shanghai' and 'Beijing', all data in the partition to be retrieved r5 is taken directly as the data retrieval result, and it is no longer necessary to match each piece of data in r5 against the data retrieval condition, which reduces the retrieval workload and saves time.
If the data retrieval condition is 'Shanghai' only, each piece of data in the partition to be retrieved r5 must be matched against the data retrieval condition in turn, and the successfully matched data set is taken as the data retrieval result.
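For list partitions, the intersection test and the subset test map directly onto set operations. The sketch below reproduces the r5 and r6 cases from the text; the function `plan_list_partition` and its return labels are hypothetical.

```python
def plan_list_partition(partition_values, condition_values):
    """Decide how a list partition is handled for a list-form condition."""
    part = set(partition_values)
    cond = set(condition_values)
    if not part & cond:
        return "skip"        # no intersection: not a partition to be retrieved
    if part <= cond:
        return "take_all"    # partition characteristic is a subset of the condition
    return "match_rows"      # partial overlap: rows must be matched one by one


R5 = ["Shanghai", "Beijing"]
R6 = ["Guangzhou", "Shenzhen"]
```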
The data retrieval method provided by this embodiment comprises: acquiring a data retrieval condition; and determining a partition to be retrieved, wherein the partition to be retrieved is a partition whose partition characteristic intersects the data retrieval condition. In this embodiment, after the partition to be retrieved is determined from the data retrieval condition and the partition characteristics, all data in any partition to be retrieved whose partition characteristic is a subset of the data retrieval condition is determined to be in the data retrieval result. This avoids the prior-art need to match every piece of data in the partition to be retrieved against the retrieval condition, reduces the time cost of data retrieval, and improves query efficiency.
EXAMPLE III
Fig. 3 is a flowchart of a data retrieval method according to a third embodiment of the present invention. The present embodiment may provide a preferable example based on the above-described embodiments. As shown in fig. 3, the data retrieval method provided in this embodiment mainly includes the following steps:
S301, parsing the statement, and dividing the data retrieval conditions into partition data retrieval conditions and non-partition data retrieval conditions. In this embodiment, a query statement for a partition table input by a user is received; the statement is usually written in SQL. The data retrieval conditions in the statement are divided into partition data retrieval conditions and non-partition data retrieval conditions. Note that the data retrieval conditions may contain no partition data retrieval condition or no non-partition data retrieval condition; which kinds are present depends on the user's query statement, and neither kind is mandatory.
S302, judging whether the data retrieval conditions include a partition data retrieval condition; if so, executing S303; if not, executing S307.
S303, judging whether the data retrieval conditions include a non-partition data retrieval condition; if so, executing S304; if not, executing S306.
S304, generating a partition retriever according to the partition data retrieval condition, and setting an optimization mark.
In this embodiment, a retriever is another expression of a retrieval condition on the partition table. A partition retriever corresponding to the partition data retrieval condition is generated, and an optimization mark is set in the partition retriever. The optimization mark is set so that the retrieval conditions requiring the optimization processing can be identified during data matching.
S305, taking the newly generated partition retriever as a child node of the non-partition retriever.
In this embodiment, it should be noted that retrieving from the partition table requires selecting an access path and an execution procedure for the partition table object, and a corresponding execution plan tree is generated. The node corresponding to the partition retriever is used as a child node of the node corresponding to the non-partition retriever, which means that during execution the partition retriever node is executed first, and the non-partition retriever node is then executed on the data result so obtained.
S306, if a partition data retrieval condition exists and no non-partition data retrieval condition exists, generating an ordinary partition retriever and setting an optimization mark.
S307, determining all the partitions to be retrieved according to the partition retriever.
In this embodiment, the partitions to be retrieved are determined according to the partition retriever; one or more partitions to be retrieved may be determined. Note that a retriever is generated for every data retrieval condition: a non-partition data retrieval condition generates a corresponding non-partition retriever, but no optimization mark is set on a non-partition retriever.
S308, acquiring a partition to be retrieved, and recording it as the current partition to be retrieved.
In this embodiment, the partitions to be retrieved are acquired in sequence, and each in turn is recorded as the current partition to be retrieved.
S309, judging whether the current partition to be retrieved is a range partition and the retriever has an optimization mark.
In this embodiment, it is determined whether the partition type of the current partition to be retrieved is a range partition and whether the retriever has an optimization mark. If both hold, S310 is executed; otherwise, S314 is executed.
S310, judging whether the current partition to be retrieved is the starting partition or the ending partition of all the partitions to be retrieved; if so, executing S311; if not, executing S312.
In this embodiment, the starting partition refers to the first partition to be retrieved among all the partitions to be retrieved; the ending partition refers to the last partition to be retrieved among all the partitions to be retrieved.
S311, judging whether the boundary values of the partition characteristic of the current partition to be retrieved are completely covered by the retrieval condition of the retriever; if so, executing S312; if not, executing S316.
In this embodiment, the boundary values of the partition characteristic of the current partition to be retrieved include a start value and an end value.
S312, skipping the execution of the retrieval condition, and directly taking all data in the current partition to be retrieved as a retrieval result.
In this embodiment, skipping the execution of the retrieval condition means that the data of the current partition to be retrieved need not be compared and matched one by one against the data retrieval condition.
S313, judging whether all the partitions to be retrieved have been processed; if so, executing S317; if not, executing S308.
One specific way to judge whether all the partitions to be retrieved have been processed is to check whether the current partition to be retrieved is the ending partition to be retrieved.
S314, judging whether the current partition to be retrieved is a list partition and the retriever has an optimization mark; if so, executing S315; otherwise, executing S316.
S315, judging whether the partition characteristics of the current partition to be retrieved are completely covered by the retrieval condition of the retriever; if so, executing S312; if not, executing S316.
S316, retrieving the current partition to be retrieved normally, that is, executing the retrieval condition. After the normal retrieval, S313 is executed.
In this embodiment, retrieving the current partition to be retrieved normally refers to comparing and matching each piece of data of the current partition to be retrieved against the data retrieval condition.
S317, finishing the retrieval.
In the preferred example of data retrieval provided by this embodiment, after the partitions to be retrieved are determined according to the data retrieval condition and the partition characteristics, all data in any partition to be retrieved whose partition characteristics are a subset of the data retrieval condition are determined as the data retrieval result. This avoids the prior-art need to match each piece of data in the partition to be retrieved against the retrieval condition, reduces the time cost of data retrieval, and improves query efficiency.
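The per-partition loop of steps S308 to S317 can be sketched as follows. This is a hedged illustration rather than the implementation of the invention. The classes Partition and Retriever and the function run are hypothetical names, each row is reduced to its partition-key value, range partitions use half-open bounds, and the range condition is assumed to be a single contiguous range (which is why a middle partition can be taken whole, as S310 implies).

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    kind: str                        # "range" or "list"
    low: int = 0                     # range partition: inclusive start value
    high: int = 0                    # range partition: exclusive end value
    values: frozenset = frozenset()  # list partition: its characteristic list
    rows: list = field(default_factory=list)  # partition-key values only


@dataclass
class Retriever:
    optimized: bool                  # the optimization mark of S304/S306
    low: int = 0                     # range condition: low <= key < high
    high: int = 0
    values: frozenset = frozenset()  # list condition: key in values

    def matches(self, key, kind):
        if kind == "range":
            return self.low <= key < self.high
        return key in self.values


def run(partitions, retriever):
    """partitions: ordered partitions to be retrieved (the output of S307)."""
    result, compared = [], 0
    last = len(partitions) - 1
    for i, p in enumerate(partitions):                   # S308, S313
        skip = False
        if p.kind == "range" and retriever.optimized:    # S309
            if i in (0, last):                           # S310
                # S311: are both boundary values covered by the condition?
                skip = retriever.low <= p.low and p.high <= retriever.high
            else:
                skip = True                              # middle partition
        elif p.kind == "list" and retriever.optimized:   # S314
            skip = p.values <= retriever.values          # S315
        if skip:
            result.extend(p.rows)                        # S312
        else:
            for key in p.rows:                           # S316
                compared += 1
                if retriever.matches(key, p.kind):
                    result.append(key)
    return result, compared                              # S317
```

For three range partitions [0, 10), [10, 20) and [20, 30) and the condition 5 <= key < 30, only the rows of the starting partition are compared in this sketch; the middle and ending partitions are taken whole.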
Example four
Fig. 4 is a schematic structural diagram of a data retrieval device according to a fourth embodiment of the present invention. This embodiment is applicable to retrieving data in a partition table. As shown in fig. 4, the data retrieval device mainly includes:
an obtaining module 410, configured to obtain a data retrieval condition;
a first determining module 420, configured to determine a partition to be retrieved, where the partition to be retrieved is a partition whose partition characteristics intersect the data retrieval condition;
a second determining module 430, configured to determine a data retrieval result, where the data retrieval result is the data set in the partition to be retrieved that meets the data retrieval condition.
The data retrieval device provided by this embodiment acquires a data retrieval condition; determines a partition to be retrieved, the partition to be retrieved being a partition whose partition characteristics intersect the data retrieval condition; and determines a data retrieval result, the data retrieval result being the data set in the partition to be retrieved that meets the data retrieval condition.
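As an illustration of what the first determining module 420 computes for range partitions, the following sketch selects the partitions whose numerical range intersects the range of the data retrieval condition. It is written under stated assumptions: the function name partitions_to_retrieve, the dictionary layout, and the half-open bounds are hypothetical and are not taken from the embodiment.

```python
def partitions_to_retrieve(bounds, low, high):
    """Return the names of the partitions that intersect the condition.

    bounds: {name: (p_low, p_high)}, each a half-open range p_low <= key < p_high.
    The condition is low <= key < high.
    """
    hits = [
        (p_low, name)
        for name, (p_low, p_high) in bounds.items()
        # Two half-open ranges intersect when each starts before the other ends.
        if p_low < high and low < p_high
    ]
    # Sort by start value so the first and last entries are the starting
    # and ending partitions to be retrieved.
    return [name for _, name in sorted(hits)]
```

Partitions that do not appear in the returned list are never read, so the second determining module 430 only has to work on the returned partitions.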
On the basis of the foregoing technical solution, the second determining module 430 is specifically configured to: if the partition characteristic is a subset of the data retrieval condition, include all the data in the partition to be retrieved corresponding to that partition characteristic in the data retrieval result.
Further, the partition characteristic being a subset of the data retrieval condition includes:
the partition characteristic and the data retrieval condition are in the form of numerical ranges, and a partition characteristic whose numerical range is covered by the numerical range of the data retrieval condition is a subset of the data retrieval condition.
Further, the partition characteristic being a subset of the data retrieval condition includes:
the partition characteristic and the data retrieval condition are in the form of characteristic lists, and a partition characteristic all of whose characteristics are contained in the data retrieval condition is a subset of the data retrieval condition.
The data retrieval device provided by the embodiment of the invention can execute the data retrieval method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Example five
Fig. 5 is a schematic structural diagram of a server according to a fifth embodiment of the present invention. As shown in fig. 5, the server includes a processor 510, a memory 520, an input device 530, and an output device 540. The number of processors 510 in the server may be one or more, and one processor 510 is taken as an example in fig. 5. The processor 510, the memory 520, the input device 530 and the output device 540 in the server may be connected by a bus or by other means; connection by a bus is taken as an example in fig. 5.
The memory 520, which is a computer-readable storage medium, may be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the data retrieval method in the embodiment of the present invention (for example, the obtaining module 410, the first determining module 420, and the second determining module 430 in the data retrieval device). The processor 510 executes various functional applications of the server and data processing by executing software programs, instructions, and modules stored in the memory 520, thereby implementing the data retrieval method described above.
The memory 520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the server, and the like. Further, the memory 520 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some examples, the memory 520 may further include memory located remotely from the processor 510, which may be connected to the server via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the server. The output device 540 may include a display device such as a display screen.
Example six
An embodiment of the present invention further provides a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a data retrieval method, including:
acquiring a data retrieval condition;
determining a partition to be retrieved, wherein the partition to be retrieved is a partition with intersection of partition characteristics and data retrieval conditions;
and determining a data retrieval result, wherein the data retrieval result is a data set which accords with the data retrieval condition in the partition to be retrieved.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also perform related operations in the data retrieval method provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the data retrieval device, the included units and modules are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A method of data retrieval, comprising:
acquiring a data retrieval condition;
determining a partition to be retrieved, wherein the partition to be retrieved is a partition with intersection of partition characteristics and the data retrieval conditions;
and determining a data retrieval result according to the data retrieval condition and the partition characteristics of the partition to be retrieved, wherein the data retrieval result is a data set which accords with the data retrieval condition in the partition to be retrieved.
2. The data retrieval method of claim 1, wherein the determining the data retrieval result comprises:
and if the partition characteristics are the subset of the data retrieval conditions, all data in the partition to be retrieved corresponding to the partition characteristics are in the data retrieval result.
3. The data retrieval method of claim 2, wherein the partition characteristic being a subset of the data retrieval condition comprises:
the partition characteristic and the data retrieval condition are in the form of numerical ranges, and a partition characteristic whose numerical range is covered by the numerical range of the data retrieval condition is a subset of the data retrieval condition.
4. The data retrieval method of claim 2, wherein the partition characteristic being a subset of the data retrieval condition comprises:
the partition characteristic and the data retrieval condition are in the form of characteristic lists, and a partition characteristic all of whose characteristics are contained in the data retrieval condition is a subset of the data retrieval condition.
5. A data retrieval device, comprising:
the acquisition module is used for acquiring data retrieval conditions;
the first determining module is used for determining a partition to be retrieved, wherein the partition to be retrieved is a partition with intersection of partition characteristics and the data retrieval conditions;
and the second determining module is used for determining a data retrieval result according to the data retrieval condition and the partition characteristics of the partition to be retrieved, wherein the data retrieval result is a data set which accords with the data retrieval condition in the partition to be retrieved.
6. The data retrieval device of claim 5, wherein the second determination module is specifically configured to:
and if the partition characteristics are the subset of the data retrieval conditions, all data in the partition to be retrieved corresponding to the partition characteristics are in the data retrieval result.
7. The data retrieval device of claim 6, wherein the partition characteristic being a subset of the data retrieval condition comprises:
the partition characteristic and the data retrieval condition are in the form of numerical ranges, and a partition characteristic whose numerical range is covered by the numerical range of the data retrieval condition is a subset of the data retrieval condition.
8. The data retrieval device of claim 6, wherein the partition characteristic being a subset of the data retrieval condition comprises:
the partition characteristic and the data retrieval condition are in the form of characteristic lists, and a partition characteristic all of whose characteristics are contained in the data retrieval condition is a subset of the data retrieval condition.
9. A server, characterized in that the server comprises:
one or more processors;
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data retrieval method as recited in any one of claims 1-4.
10. A storage medium containing computer-executable instructions for performing the data retrieval method of any one of claims 1-4 when executed by a computer processor.
CN201811196264.3A 2018-10-15 2018-10-15 Data retrieval method, device, server and storage medium Active CN109299101B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811196264.3A CN109299101B (en) 2018-10-15 2018-10-15 Data retrieval method, device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811196264.3A CN109299101B (en) 2018-10-15 2018-10-15 Data retrieval method, device, server and storage medium

Publications (2)

Publication Number Publication Date
CN109299101A CN109299101A (en) 2019-02-01
CN109299101B true CN109299101B (en) 2020-12-01

Family

ID=65162782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811196264.3A Active CN109299101B (en) 2018-10-15 2018-10-15 Data retrieval method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN109299101B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569257B (en) * 2019-09-16 2022-04-01 上海达梦数据库有限公司 Data processing method, corresponding device, equipment and storage medium
CN110825943B (en) * 2019-10-23 2023-10-10 支付宝(杭州)信息技术有限公司 Method, system and equipment for generating user access path tree data
CN113377989A (en) * 2021-06-04 2021-09-10 上海云从汇临人工智能科技有限公司 Data retrieval method, system, medium and device based on GPU
CN117573944B (en) * 2024-01-17 2024-04-02 深圳十沣科技有限公司 Data retrieval method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107633094A (en) * 2017-10-11 2018-01-26 江苏神州信源系统工程有限公司 The method and apparatus of data retrieval in a kind of cluster environment
CN108170719A (en) * 2017-12-05 2018-06-15 深圳市金立通信设备有限公司 A kind of search method, server and computer readable storage medium
CN108427736A (en) * 2018-02-28 2018-08-21 华为技术有限公司 A method of for inquiring data

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101551826B (en) * 2009-05-19 2011-10-05 成都市华为赛门铁克科技有限公司 Data retrieval process, set and system
CN103714096B (en) * 2012-10-09 2018-02-13 阿里巴巴集团控股有限公司 Inverted index system constructing, data processing method and device based on Lucene

Also Published As

Publication number Publication date
CN109299101A (en) 2019-02-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant