WO2018040722A1 - Table data query method and apparatus (表数据查询方法及装置) - Google Patents

Publication number
WO2018040722A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
data
partition
data tables
partitions
Prior art date
Application number
PCT/CN2017/091217
Other languages
English (en)
French (fr)
Inventor
秦传瑜
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2018040722A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Definitions

  • the present invention relates to the field of information technology, and in particular, to a table data query method and apparatus.
  • To increase the table data access rate, an application's data table can be split into multiple sub-tables according to the partition key value, with each sub-table stored in a different database.
  • The partition key value may be a field value, a hash value computed from the field value, and so on.
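  • As a minimal sketch of the two forms of partition key value mentioned above (the raw field value, or a hash of it), the following Python fragment may help. The function name, the use of CRC32, and the bucket count of 16 are illustrative assumptions and do not come from the application.

```python
import zlib

def partition_key_value(field_value, use_hash=False, buckets=16):
    """Return a partition key value for one row.

    Either the raw field value is used directly, or it is hashed and
    reduced to a bucket number. CRC32 and the bucket count are
    assumptions made for illustration only.
    """
    if not use_hash:
        return field_value
    # A stable hash is needed so the same field value always maps to the same bucket.
    return zlib.crc32(str(field_value).encode("utf-8")) % buckets

print(partition_key_value("user-42"))                 # raw field value
print(partition_key_value("user-42", use_hash=True))  # bucket number below 16
```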
  • When a running application needs to query table data that satisfies certain conditions, it can send a table data query request to the DDS (Distributed Data Service). The request carries multiple data table identifiers and query parameters.
  • For each data table identifier, the DDS obtains the corresponding sub-tables from the corresponding databases. During this process the application also sends the DDS the parameter values of the query parameters. The DDS filters the table data that satisfies the conditions out of the multiple data tables according to those parameter values, and then sends the filtered table data to the application.
  • the existing query mode requires the DDS to perform secondary processing on multiple data tables, which reduces the query efficiency of the table data, and the amount of data transmitted in the entire query process is large, resulting in poor service performance.
  • an embodiment of the present invention provides a data table query method and apparatus.
  • the technical solution is as follows:
  • a data table query method which is applied to a query node in which a DDS is installed, and the method includes:
  • During operation, the application may trigger the terminal to send a table data query request. On receiving the request, the query node parses it to obtain query information, which includes multiple data table identifiers and query parameters; the query information includes the query partition keys of the multiple data tables and the query conditions of the multiple data tables.
  • The application also triggers the terminal to send the parameter values of the query parameters, which include at least the query partition key values of the multiple data tables.
  • The query node determines the query partitions of the multiple data tables according to their query partition key values, and then determines whether those query partitions are the same. If they are the same, the query node sends the received table data query request to the target database corresponding to that query partition; the target database performs the query and returns the final query result. On receiving the final query result, the query node sends it to the terminal, which passes it to the application. When the query partitions of the multiple data tables are the same, sending the table data query request to a single database lets the query node obtain the required table data without secondary processing, which improves the query efficiency of the table data and improves service performance.
  • When determining whether the query partitions of the multiple data tables are the same, the query node first determines whether all the query partition keys of the multiple data tables have an association relationship. If they all do, the query partitions of the multiple data tables can be determined to be the same.
  • If the query partition keys of the multiple data tables are not all associated, but the query partition key values of the multiple data tables are all equal, the query partitions can still be determined to be the same. If the query partition keys are not all associated and the query partition key values are not all equal, the query partitions of the multiple data tables are determined to be different.
  • In this way the query node can quickly determine whether the query partitions of the data tables involved are the same, so that in a subsequent step the received table data query request can be pushed down as a whole to the database. This not only improves query efficiency but also reduces the amount of data stored on the query node, saving storage space.
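  • The decision rule can be sketched roughly as follows. This is an illustrative reading of the rule, not the claimed implementation: the function name is hypothetical, and representing a table with no determined query partition key value as None is an assumption.

```python
def same_query_partition(all_keys_associated, key_values):
    """Decide whether every table in the request shares one query partition.

    all_keys_associated: True if all query partition keys are associated.
    key_values: one query partition key value per table; None (an
    assumption of this sketch) marks a table with no determined value.
    """
    if all_keys_associated:
        return True
    # Keys are not all associated: fall back to comparing the key values.
    if not key_values or None in key_values:
        return False
    return len(set(key_values)) == 1

print(same_query_partition(True, [1, None]))
print(same_query_partition(False, [1, 1]))
print(same_query_partition(False, [1, 2]))
```

  • A True result corresponds to sending the whole request to one target database; a False result corresponds to the fallback in which each data table is queried separately.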
  • After parsing the table data query request, and before receiving the parameter values of the query parameters, the query node further generates, according to the query information and the table data query request, as many sub-data table query requests as there are data table identifiers. Each sub-data table query request corresponds to one data table and can be used to query the table data in that data table.
  • the query when it is determined that the query partitions of the plurality of data tables are different, the query is performed according to the generated plurality of sub-data table query requests, so as to ensure that the table data satisfying the requirements can be obtained under any circumstances.
  • When determining that the query partitions of the multiple data tables are different, the query node may add the query partition key values to the corresponding sub-data table query requests and send the resulting sub-data table query requests to multiple partition databases. Each partition database performs the query and returns an intermediate query result. A partition database is the database corresponding to a query partition key value, and an intermediate query result includes either the sub-table stored by the partition database or the table data queried from that sub-table.
  • On receiving the intermediate query results sent by the multiple partition databases, the query node queries them according to the parameter values of the query parameters to obtain the final query result, and then sends the final query result to the terminal, which passes it to the application. In this way, table data that meets the requirements can still be queried when the query partitions of the multiple data tables are different, which improves the reliability of the query.
  • a data table querying apparatus for performing the data table query method of the first aspect described above.
  • a computer device for performing the data table query method of the first aspect described above.
  • the received table data query request is sent to the database corresponding to the query partition, and the final query result returned by the database is further obtained.
  • the process does not need to perform secondary processing on multiple data tables, which not only improves the query efficiency of the table data, but also reduces the amount of data transmitted and improves the business performance.
  • FIG. 1 is an architectural diagram of a distributed database system according to an embodiment of the present invention
  • FIG. 2 is an illustrative computer architecture of a computer device according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of a data table query method according to another embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a data table query apparatus according to another embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a data table query apparatus according to another embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a data table query apparatus according to another embodiment of the present invention.
  • Referring to FIG. 1, an architectural diagram of a distributed database system is shown; the system includes a terminal 101, a query node 102, and a data node 103.
  • the terminal 101 can be a smart phone, a tablet computer, a desktop computer, etc., and the product type of the terminal 101 is not specifically limited in this embodiment.
  • various applications such as a shopping application, a navigation application, an instant messaging application, and the like are installed in the terminal 101.
  • During operation, an application may trigger the terminal 101 to send a table data query request to the query node 102; the query node 102 performs the query according to the table data query request, and the terminal 101 receives the query result sent by the query node 102.
  • the DDS is installed on the query node 102.
  • the DDS is used to provide a distributed data access service, and can receive a table data query request sent by the terminal 101, and query data from the database of the data node 103.
  • the query node 102 can be a single computing device or a computer cluster composed of multiple computing devices.
  • Data node 103 maintains a database that can be used to store data tables for applications.
  • the data node 103 can be a single computing device or a computer cluster composed of multiple computing devices.
  • the terminal 101 and the query node 102 can communicate with each other through a wired network or a wireless network, and the query node 102 and the data node 103 can communicate through a wired network or a wireless network.
  • the computing device 200 is a conventional desktop or laptop computer, and one or more computing devices 200 may constitute a physical platform.
  • the computing device 200 includes a processor 201, a memory 202, a communication interface 203, and a bus 204.
  • the processor 201, the memory 202, and the communication interface 203 are directly connected via a bus 204.
  • the computing device 200 can be used to execute a data table query method. Specifically:
  • the memory 202 is configured to store computer instructions;
  • the processor 201 calls the computer instructions stored in the memory 202 via the bus 204 for performing the following operations:
  • parsing the table data query request to obtain query information, where the query information includes multiple data table identifiers and query parameters, and the query parameters include at least the query partition keys of the multiple data tables;
  • when the query partitions of the multiple data tables are the same, sending the table data query request to the target database by calling the communication interface 203, the target database returning the final query result, where the target database is the database corresponding to the same query partition;
  • the final query result is sent to the application by invoking communication interface 203.
  • the processor 201 calls the computer instructions stored in the memory 202 via the bus 204, and is also used to perform the following operations:
  • if the query partition keys of the multiple data tables are not all associated with each other, and the query partition key values of the multiple data tables are all equal, it is determined that the query partitions of the multiple data tables are the same;
  • if the query partition keys of the multiple data tables are not all associated, and the query partition key values of the multiple data tables are not all equal, it is determined that the query partitions of the multiple data tables are different.
  • the processor 201 calls the computer instructions stored in the memory 202 via the bus 204, and is also used to perform the following operations:
  • a plurality of sub-data table query requests are generated, the number of the sub-data table query requests is the same as the number of the data table identifiers, and each sub-data table query request is used to query a data table.
  • the processor 201 calls the computer instructions stored in the memory 202 via the bus 204, and is also used to perform the following operations:
  • the multiple sub-data table query requests to which the query partition key values have been added are sent to multiple partition databases by calling the communication interface 203, and the partition databases return intermediate query results, where a partition database is the database corresponding to a query partition key value, and an intermediate query result includes the sub-table stored by the partition database or the table data queried from the sub-table;
  • the final query result is sent to the application by invoking communication interface 203.
  • the memory 202 includes a computer storage medium.
  • Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
  • Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, magnetic disk storage or other magnetic storage devices.
  • the computing device 200 can also be operated by a remote computer connected to the network via a network such as the Internet. That is, computing device 200 can be connected to the network via network interface unit 205 coupled to said bus 204, or network interface unit 205 can be used to connect to other types of networks or remote computer systems (not shown).
  • After determining, according to the query partition key values of the multiple data tables, that the query partitions of the multiple data tables are the same, the computer device sends the received table data query request to the database corresponding to that query partition, and then sends the final query result returned by the database to the application. This process does not require secondary processing of the multiple data tables, which not only improves the query efficiency of the table data but also reduces the amount of data transmitted and improves service performance.
  • An embodiment of the present invention provides a table data query method based on the architecture diagram of the distributed database system shown in FIG. 1. Referring to FIG. 3, the method process provided by the embodiment of the present invention includes:
  • the terminal sends a table data query request to the query node.
  • During operation, the application may trigger the terminal to generate a table data query request and send the table data query request to the query node.
  • the table data query request may be a SQL (Structured Query Language) request.
  • SQL is a database query and programming language for storing and retrieving data from a database and querying, updating, and managing the database.
  • the query node When receiving the table data query request, the query node parses the table data query request to obtain query information, where the query information includes multiple data table identifiers and query parameters.
  • The table data query request generally includes multiple key fields; based on these key fields, the query node can parse the table data query request to obtain the query information.
  • the query information includes the operation type, the number of data tables involved, the data table identifier, and the query parameters. Operation types generally include operations such as adding, deleting, changing, and querying.
  • the query parameter is a basis for querying the data table, and includes at least a query partition key of the plurality of data tables, an association condition between the plurality of data tables, and the like, and the query partition key is a partition key obtained by parsing the data table query request.
  • T1.PARTITION_KEY is the partition key of the data table T1
  • T2.PARTITION_KEY is the partition key of the data table T2.
  • the query node generates a plurality of sub-data table query requests according to the query information and the table data query request.
  • The query node may generate multiple sub-data table query requests according to the multiple data table identifiers and the table data query request.
  • The number of generated sub-data table query requests is the same as the number of data table identifiers, and each sub-data table query request corresponds to one data table and can be used to query the table data of that data table.
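  • A minimal sketch of this step, assuming each sub-data table query request is a plain single-table SELECT statement (the statement form follows the SELECT * FROM T2 request that appears in Example 1; the function name is hypothetical):

```python
def build_sub_requests(table_ids):
    """Build one sub-data table query request per data table identifier.

    Returns a dict keyed by table identifier so that a query partition
    key value can later be attached to the matching request.
    """
    return {table_id: f"SELECT * FROM {table_id}" for table_id in table_ids}

sub_requests = build_sub_requests(["T1", "T2"])
for table_id, sql in sub_requests.items():
    print(table_id, "->", sql)
```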
  • the terminal sends a parameter value of the query parameter to the query node, where the parameter value of the query parameter includes at least a query partition key value of the multiple data tables.
  • The table data query request sent by the terminal to the query node carries only the query parameters; from that request alone, the query node cannot query the table data that meets the requirement.
  • Therefore, during the data table query process the application also triggers the terminal to send the parameter values of the query parameters to the query node; the parameter values include the query partition key values of the multiple data tables, and so on.
  • The query node determines the query partitions of the multiple data tables according to the query partition key values of the multiple data tables.
  • Partition key values take various forms, including field values, hash values obtained by hashing field values, and so on.
  • The query node can store the correspondence between partition key values and partitions. Therefore, when receiving the parameter values of the query parameters, the query node may determine the query partitions of the multiple data tables from the correspondence between partition key values and partitions, according to the query partition key values of the multiple data tables.
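  • The lookup against the stored correspondence might look like the following sketch. The dictionary contents (key value 1 for DB1, key value 2 for DB2) and all names are hypothetical; the text says only that the query node can store such a correspondence.

```python
def query_partitions(key_values_by_table, partition_map):
    """Map each table's query partition key value to its partition.

    A table whose key value is unknown to the stored correspondence
    maps to None.
    """
    return {table: partition_map.get(value)
            for table, value in key_values_by_table.items()}

# Hypothetical correspondence between partition key values and partitions.
partition_map = {1: "DB1", 2: "DB2"}
partitions = query_partitions({"T1": 1, "T2": 1}, partition_map)
print(partitions)
```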
  • the query node determines whether the query partitions of the multiple data tables are the same. If yes, go to step 307. If no, go to step 310.
  • the query node determines whether all of the query partition keys of the plurality of data tables have an association relationship.
  • The association relationship between the query partition keys of the multiple data tables refers to whether those query partition keys are connected by an equal sign or a similar symbol. If the query partition keys of the multiple data tables are all connected by an equal sign or a similar symbol, it is determined that the query partition keys of the multiple data tables all have an association relationship.
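  • One way to check whether all query partition keys are connected by equal signs is to treat each equality condition as a link and test whether the keys end up in a single group. The union-find approach and the pair-based input format below are assumptions chosen for illustration; the application does not prescribe a particular algorithm.

```python
def all_keys_associated(partition_keys, equalities):
    """Return True if every query partition key is linked to every other
    one through a chain of equality conditions.

    partition_keys: e.g. ["T1.PARTITION_KEY", "T2.PARTITION_KEY"]
    equalities: pairs of column names joined by an equal sign.
    """
    parent = {key: key for key in partition_keys}

    def find(key):
        # Follow parent links to the group representative, halving the path.
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    for left, right in equalities:
        # Only equalities between two partition keys create an association.
        if left in parent and right in parent:
            parent[find(left)] = find(right)

    return len({find(key) for key in partition_keys}) == 1

keys = ["T1.PARTITION_KEY", "T2.PARTITION_KEY"]
print(all_keys_associated(keys, [("T1.PARTITION_KEY", "T2.PARTITION_KEY")]))
print(all_keys_associated(keys, []))
```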
  • If all the query partition keys of the multiple data tables have an association relationship, the query node determines that the query partitions of the multiple data tables are the same.
  • When it is determined from the query parameters that all the query partition keys of the multiple data tables are connected by an association symbol, for example when the query partition keys of the multiple data tables are all connected by an equal sign, the query node can determine that the query partitions of the multiple data tables are the same.
  • If the query partition keys of the multiple data tables are not all associated but the query partition key values are all equal, the query node determines that the query partitions of the multiple data tables are the same.
  • The query partition keys of the multiple data tables are not all associated when, for example, none of the query partition keys has an association relationship, or when some of the query partition keys have an association relationship and others do not.
  • In that case the query node determines whether the query partition key values of the multiple data tables are equal. If the query partition key values of the multiple data tables are equal and point to a unique database, the query node can determine that the query partitions of the multiple data tables are the same.
  • If the query partition keys of the multiple data tables are not all associated and the query partition key values are not all equal, the query node determines that the query partitions of the multiple data tables are different.
  • The query partition keys are not all associated when none of them has an association relationship, or when some have an association relationship and others do not. The query partition key values are not all equal when they are all unequal, or when some are equal and others are not. When both conditions hold, it can be determined that the query partitions of the multiple data tables are different.
  • Determining that the query partitions of the multiple data tables are different therefore covers the following cases: (1) none of the query partition keys has an association relationship and the query partition key values are all unequal; (2) none of the query partition keys has an association relationship, and some query partition key values are equal while others are not; (3) some query partition keys have an association relationship while others do not, and the query partition key values are all unequal; (4) some query partition keys have an association relationship while others do not, and some query partition key values are equal while others are not. In each case it can be determined that the query partitions of the multiple data tables are different.
  • the query node sends the table data query request to the target database.
  • the target database is the database corresponding to the same query partition.
  • the query node may send a table data query request to the target data node where the target database is located through the wired network or the wireless network.
  • the target data node performs a query according to the table data query request, and obtains a final query result.
  • the target data node When receiving the table data query request, the target data node queries the target database for the final query result according to the parameter value of the query parameter, and the final query result is the table data to be acquired by the application.
  • the target data node sends a final query result to the query node.
  • the target data node sends the queried final query result to the query node through a wired network or a wireless network.
  • The query node sends multiple sub-data table query requests, to which the query partition key values have been added, to multiple partition databases.
  • When it is determined in step 306 above that the query partitions of the multiple data tables are different, the query node adds each query partition key value to the sub-data table query request corresponding to the same data table identifier and, according to the query partition key value, sends each sub-data table query request to the partition data node where the partition database corresponding to that key value is located.
  • the partition data node performs a query according to the sub-data table query request, and obtains an intermediate query result.
  • the partition data node of each partition database queries the sub-table stored in the local storage according to the sub-data table query request, and obtains an intermediate query result.
  • the intermediate query result includes a sub-table stored by the partition database or table data queried from the sub-table.
  • The specific content of the intermediate query result varies with the sub-data table query request. If the sub-data table query request carries a query parameter for the sub-table, the intermediate query result returned by the partition data node is the table data queried from the sub-table; if the sub-data table query request does not carry a query parameter for the sub-table, the intermediate query result returned by the partition data node is the sub-table stored in the partition database.
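  • The two possible shapes of an intermediate query result can be sketched as follows. Representing a sub-table as a list of dictionaries and the query parameter as an optional predicate function is an assumption of this sketch, as are the column names.

```python
def intermediate_result(sub_table, predicate=None):
    """Return the intermediate query result of one partition data node.

    With no predicate (the request carries no query parameter for the
    sub-table), the whole stored sub-table is returned. Otherwise only
    the rows that satisfy the predicate are returned.
    """
    if predicate is None:
        return list(sub_table)
    return [row for row in sub_table if predicate(row)]

# Hypothetical sub-table of T2 stored in one partition database.
t2_part = [{"ID": 1, "PRICE": 5}, {"ID": 2, "PRICE": 9}]
print(intermediate_result(t2_part))
print(intermediate_result(t2_part, lambda row: row["PRICE"] > 6))
```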
  • the partitioned data node sends the intermediate query result to the query node.
  • the partitioned data node may send the intermediate query result to the query node through a wired network or a wireless network.
  • the query node When receiving the intermediate query result sent by the multiple partition data nodes, the query node queries multiple intermediate query results according to the parameter values of the query parameters to obtain a final query result.
  • The intermediate query result returned by each partition data node is the query result for a single data table, whereas the terminal needs the table data that satisfies certain association conditions across the multiple data tables. Therefore, on receiving the intermediate query results sent by the multiple partition data nodes, the query node queries the intermediate query results according to the parameter values of the query parameters (in practice, according to the association conditions between the multiple data tables in the parameter values) and obtains the final query result.
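  • The secondary processing on the query node amounts to combining per-table intermediate results under an association condition. The sketch below assumes a single equality condition between two tables and uses a hash join; the column names and the join strategy are illustrative assumptions only.

```python
from collections import defaultdict

def join_intermediate(rows_a, rows_b, col_a, col_b):
    """Equi-join two intermediate query results on rows_a[col_a] == rows_b[col_b].

    Returns a list of (row_a, row_b) pairs in the order of rows_a.
    """
    index = defaultdict(list)
    for row in rows_b:
        index[row[col_b]].append(row)
    return [(a, b) for a in rows_a for b in index.get(a[col_a], [])]

# Hypothetical intermediate results for data tables T1 and T2.
t1_rows = [{"ID": 1, "NAME": "x"}, {"ID": 2, "NAME": "y"}]
t2_rows = [{"ID": 2, "PRICE": 5}]
final_result = join_intermediate(t1_rows, t2_rows, "ID", "ID")
print(final_result)
```

  • This is the work that is avoided when the whole request can be sent to a single target database.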
  • the query node sends the final query result to the terminal.
  • The query node sends the final query result to the terminal through the wired network or the wireless network, and the terminal sends it to the application.
  • Example 1: In the distributed database system, the data of data table T1 and data table T2 is stored in database DB1 and database DB2, and during operation the application needs to obtain data from data table T1 and data table T2 because of business requirements.
  • The query node can determine whether the query partition key values of data table T1 and data table T2 are equal. Since the query partition key value of data table T1 corresponds to DB1 while data table T2 does not have a determined query partition key value, it can be determined that the query partition key values of data table T1 and data table T2 are different.
  • The query node also sends SELECT * FROM T2 to the partition data nodes where databases DB1 and DB2 are located, and those partition data nodes return the stored sub-tables of data table T2; the query node receives the sub-tables returned from database DB1 and database DB2.
  • T12.PARTITION_KEY AND T1.PARTITION_KEY ⁇ DB1 ⁇
  • The query node sends the query request SQL to the target data node where the database DB1 is located, and the target data node returns the final query result. On receiving the final query result, the query node sends it to the terminal, which passes it to the application.
  • Example 3: In the distributed database system, the data of data table T1 and data table T2 is stored in database DB1 and database DB2, and during operation the application needs to obtain data from data table T1 and data table T2 because of business requirements.
  • The query partition key value received by the query node is 1.
  • The query partition key value of data table T1 corresponds to DB1 and the query partition key value of data table T2 corresponds to DB1; the two are equal, so it can be determined that the query partitions of data table T1 and data table T2 are the same. The query node sends the query request SQL to the target data node where the database DB1 is located, and that target data node returns the final query result. On receiving the final query result, the query node sends it to the terminal, which passes it to the application.
  • In the method provided by the embodiment of the present invention, after it is determined according to the query partition key values of the multiple data tables that the query partitions of the multiple data tables are the same, the received table data query request is sent to the database corresponding to that query partition, and the final query result returned by the database is then sent to the application. This process does not require secondary processing of the multiple data tables, which not only improves the query efficiency of the table data but also reduces the amount of data transmitted and improves service performance.
  • Referring to FIG. 4, an embodiment of the present invention provides a table data query device, where the device includes:
  • a receiving module 401, configured to receive a table data query request sent by an application;
  • a parsing module 402, configured to parse the table data query request to obtain query information, where the query information includes a plurality of data table identifiers and query parameters, and the query parameters include at least the query partition keys of the plurality of data tables;
  • the receiving module 401 is further configured to receive parameter values of the query parameters sent by the application, where the parameter values include at least the query partition key values of the plurality of data tables;
  • a determining module 403, configured to determine the query partitions of the plurality of data tables according to the query partition key values of the plurality of data tables;
  • a judging module 404, configured to judge whether the query partitions of the plurality of data tables are the same;
  • a sending module 405, configured to send the table data query request to a target database if the query partitions of the plurality of data tables are the same, where the target database returns the final query result and is the database corresponding to the shared query partition;
  • the sending module 405 is further configured to send the final query result to the application if the final query result is received.
  • In another embodiment, the judging module 404 is further configured to judge whether the query partition keys of the plurality of data tables are all associated with one another; if they are all associated, determine that the query partitions of the plurality of data tables are the same; if they are not all associated and the query partition key values of the plurality of data tables are all equal, determine that the query partitions are the same; and if they are not all associated and the query partition key values are not all equal, determine that the query partitions are different.
  • Referring to FIG. 5, in another embodiment the apparatus further includes:
  • a query request generating module 406, configured to generate a plurality of sub-data-table query requests according to the query information and the table data query request, where the number of sub-data-table query requests equals the number of data table identifiers and each sub-data-table query request is used to query one data table.
  • Referring to FIG. 6, in another embodiment the sending module 405 is further configured to, if the query partitions of the plurality of data tables are different, send the plurality of sub-data-table query requests, with the query partition key values added, to a plurality of partition databases, where each partition database returns an intermediate query result, the partition database is the database corresponding to a query partition key value, and the intermediate query result includes a sub-table stored in the partition database or table data queried from the sub-table;
  • the receiving module 401 is further configured to receive the intermediate query results sent by the plurality of partition databases;
  • a query module 407, configured to query the plurality of intermediate query results according to the parameter values of the query parameters to obtain the final query result;
  • the sending module 405 is further configured to send the final query result to the application.
  • In summary, after determining, according to the query partition key values of the plurality of data tables, that the query partitions of the plurality of data tables are the same, the device sends the received table data query request to the database corresponding to the query partition and then sends the final query result returned by that database to the application. This requires no secondary processing of the plurality of data tables, which improves table data query efficiency, reduces the amount of data transmitted, and improves service performance.
  • It should be noted that when the table data query device provided by the foregoing embodiment queries table data, the division into the functional modules described above is used only as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the table data query device and the computer device may be divided into different functional modules to complete all or part of the functions described above.
  • In addition, the table data query device, the computer device, and the table data query method provided by the foregoing embodiments belong to the same concept; the specific implementation process is described in detail in the method embodiment and is not repeated here.
  • A person of ordinary skill in the art may understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware, and that the program may be stored in a computer-readable storage medium.
  • The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.

Abstract

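The present invention discloses a table data query method and apparatus, and belongs to the field of information technology. The method includes: parsing a table data query request sent by an application; receiving parameter values of query parameters sent by the application; determining the query partitions of a plurality of data tables according to the query partition key values of the plurality of data tables; when the query partitions of the plurality of data tables are the same, sending the table data query request to a target database; and sending the final query result returned by the target database to the application. After determining, according to the query partition key values of the plurality of data tables, that their query partitions are the same, the present invention sends the received table data query request to the database corresponding to the query partition and then sends the final query result returned by that database to the application. This process requires no secondary processing of the plurality of data tables, which improves table data query efficiency, reduces the amount of data transmitted, and improves service performance.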

Description

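Table data query method and apparatus
This application claims priority to Chinese Patent Application No. 201610799750.9, filed with the Chinese Patent Office on August 31, 2016 and entitled "表数据查询方法及装置" (Table data query method and apparatus), which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of information technology, and in particular to a table data query method and apparatus.
Background
In a distributed database system, to increase the table data access rate, an application's data table may be split into multiple sub-tables according to partition key values, with each sub-table stored in a different database. Partition key values include field values and hash values computed from field values by a hash algorithm.
Given this storage layout, when a running application needs to query table data that satisfies certain conditions because of a service requirement, the application may send a table data query request to a DDS (Distributed Data Service). The request carries multiple data table identifiers and query parameters. For each data table identifier, the DDS obtains the corresponding sub-tables from the corresponding databases. During this process the application also sends the parameter values of the query parameters to the DDS; the DDS uses these values to filter, from the multiple data tables, the table data that satisfies the conditions, and then sends the filtered table data to the application.
In implementing the present invention, the inventor found that the prior art has at least the following problem:
The existing query approach requires the DDS to perform secondary processing on multiple data tables, which lowers table data query efficiency, and the whole query process transmits a large amount of data, which results in poor service performance.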
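Summary
To solve the problems of the prior art, embodiments of the present invention provide a table data query method and apparatus. The technical solutions are as follows:
In a first aspect, a table data query method is provided. The method is applied to a query node on which a DDS is installed, and includes:
When a running application needs to obtain table data of multiple data tables from databases because of a service requirement, the application may trigger the terminal to send a table data query request. On receiving the request, the query node parses it to obtain query information that includes data table identifiers and query parameters; the query information includes the query partition keys of the multiple data tables, the query conditions of the multiple data tables, and so on. The application then triggers the terminal to send the parameter values of the query parameters, which include at least the query partition key values of the multiple data tables. The query node determines the query partitions of the multiple data tables according to their query partition key values and then judges whether those query partitions are the same. If they are the same, it sends the received table data query request to the target database corresponding to the shared query partition; the target database performs the query and returns the final query result. On receiving the final query result, the query node sends it to the terminal, which forwards it to the application. When the query partitions of the multiple data tables are the same, sending the table data query request to a single database lets the query node obtain the required table data without secondary processing, which improves table data query efficiency and service performance.
With reference to the first aspect, in a first possible implementation of the first aspect, when judging whether the query partitions of the multiple data tables are the same, the query node may first judge whether the query partition keys of the multiple data tables are all associated with one another. If they are all associated, the query partitions of the multiple data tables are determined to be the same. If they are not all associated but the query partition key values of the multiple data tables are all equal, the query partitions are likewise determined to be the same. If the query partition keys are not all associated and the query partition key values are not all equal, the query partitions are determined to be different. With this judgment, the query node can quickly determine during a table data query whether the query partitions of the data tables involved are the same, so that in subsequent steps the received table data query request can be pushed down as a whole to a single database, which improves query efficiency, reduces the amount of data stored on the query node, and saves storage space.
With reference to the first aspect, in a second possible implementation of the first aspect, after parsing the table data query request and before receiving the parameter values of the query parameters, the query node also generates, according to the query information and the table data query request, as many sub-data-table query requests as there are data table identifiers. Each sub-data-table query request corresponds to one data table and can be used to query the table data in that data table. In this embodiment, when the query partitions of the multiple data tables are determined to be different, the query is performed with the generated sub-data-table query requests, which ensures that the required table data can be obtained in every case.
With reference to the first aspect through the second possible implementation of the first aspect, in a third possible implementation of the first aspect, when the query node determines that the query partitions of the multiple data tables are different, it may add the query partition key values to the corresponding sub-data-table query requests and send the sub-data-table query requests, with the query partition key values added, to multiple partition databases. Each partition database performs the query and returns an intermediate query result. A partition database is the database corresponding to a query partition key value, and an intermediate query result includes a sub-table stored in the partition database or table data queried from the sub-table. On receiving the intermediate query results sent by the multiple partition databases, the query node queries them according to the parameter values of the query parameters to obtain the final query result, and then sends the final query result to the terminal, which forwards it to the application. In this way the required table data can also be found when the query partitions of the multiple data tables are different, which improves query reliability.
In a second aspect, a table data query apparatus is provided, which is configured to perform the table data query method of the first aspect.
In a third aspect, a computing device is provided, which is configured to perform the table data query method of the first aspect.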
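The beneficial effects of the technical solutions provided by the embodiments of the present invention are as follows:
After the query partitions of the multiple data tables are determined to be the same according to their query partition key values, the received table data query request is sent to the database corresponding to the query partition, and the final query result returned by that database is then sent to the application. This process requires no secondary processing of the multiple data tables, which improves table data query efficiency, reduces the amount of data transmitted, and improves service performance.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is an architecture diagram of a distributed database system according to an embodiment of the present invention;
FIG. 2 is an illustrative computer architecture of a computer device according to an embodiment of the present invention;
FIG. 3 is a flowchart of a table data query method according to another embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a table data query apparatus according to another embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a table data query apparatus according to another embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a table data query apparatus according to another embodiment of the present invention.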
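Detailed Description
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to FIG. 1, it shows an architecture diagram of a distributed database system. The distributed database system includes a terminal 101, a query node 102, and a data node 103.
The terminal 101 may be a smartphone, a tablet computer, a desktop computer, or the like; this embodiment does not limit the product type of the terminal 101. To meet users' needs, various applications are installed on the terminal 101, such as shopping, navigation, and instant messaging applications. For any application installed on the terminal 101, when the running application needs to obtain table data that meets its requirements, it may trigger the terminal 101 to send a table data query request to the query node 102. The query node 102 performs the query according to the request, and the terminal 101 receives the query result sent by the query node 102.
A DDS is installed on the query node 102. The DDS provides a distributed data access service: it can receive the table data query request sent by the terminal 101 and query data from the database of the data node 103. In practice, the query node 102 may be a single computing device or a computer cluster made up of multiple computing devices.
The data node 103 maintains a database that can be used to store an application's data tables. In practice, the data node 103 may be a single computing device or a computer cluster made up of multiple computing devices.
The terminal 101 and the query node 102 may communicate over a wired or wireless network, as may the query node 102 and the data node 103.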
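Referring to FIG. 2, it shows an illustrative computer architecture of a computing device 200 used in an embodiment of the present invention. The computing device 200 is a conventional desktop or laptop computer, and one or more computing devices 200 may form a physical platform. The computing device 200 includes a processor 201, a memory 202, a communication interface 203, and a bus 204. The processor 201, the memory 202, and the communication interface 203 are directly connected through the bus 204. The computing device 200 may be used to execute the table data query method. Specifically:
the memory 202 is configured to store computer instructions;
the processor 201 invokes, through the bus 204, the computer instructions stored in the memory 202 to perform the following operations:
parsing a table data query request to obtain query information, where the query information includes multiple data table identifiers and query parameters, and the query parameters include at least the query partition keys of multiple data tables;
receiving, by invoking the communication interface 203, the parameter values of the query parameters sent by the application, where the parameter values include at least the query partition key values of the multiple data tables;
determining the query partitions of the multiple data tables according to their query partition key values;
judging whether the query partitions of the multiple data tables are the same;
if the query partitions of the multiple data tables are the same, sending, by invoking the communication interface 203, the table data query request to a target database, where the target database returns the final query result and is the database corresponding to the shared query partition;
if the final query result is received, sending, by invoking the communication interface 203, the final query result to the application.
In another embodiment of the present invention, the processor 201 invokes, through the bus 204, the computer instructions stored in the memory 202 to further perform the following operations:
if the query partition keys of the multiple data tables are all associated with one another, determining that the query partitions of the multiple data tables are the same;
if the query partition keys of the multiple data tables are not all associated and the query partition key values of the multiple data tables are all equal, determining that the query partitions of the multiple data tables are the same;
if the query partition keys of the multiple data tables are not all associated and the query partition key values of the multiple data tables are not all equal, determining that the query partitions of the multiple data tables are different.
In another embodiment of the present invention, the processor 201 invokes, through the bus 204, the computer instructions stored in the memory 202 to further perform the following operation:
generating multiple sub-data-table query requests according to the query information and the table data query request, where the number of sub-data-table query requests equals the number of data table identifiers and each sub-data-table query request is used to query one data table.
In another embodiment of the present invention, the processor 201 invokes, through the bus 204, the computer instructions stored in the memory 202 to further perform the following operations:
if the query partitions of the multiple data tables are different, sending, by invoking the communication interface 203, the multiple sub-data-table query requests with the query partition key values added to multiple partition databases, where each partition database returns an intermediate query result, the partition database is the database corresponding to a query partition key value, and the intermediate query result includes a sub-table stored in the partition database or table data queried from the sub-table;
receiving, by invoking the communication interface 203, the intermediate query results sent by the multiple partition databases;
querying the multiple intermediate query results according to the parameter values of the query parameters to obtain the final query result;
sending, by invoking the communication interface 203, the final query result to the application.
Without loss of generality, the memory 202 includes computer storage media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. A person skilled in the art will appreciate that computer storage media are not limited to those listed above.
According to various embodiments of the present invention, the computing device 200 may also operate through a remote computer connected to a network such as the Internet. That is, the computing device 200 may connect to the network through a network interface unit 205 connected to the bus 204; the network interface unit 205 may also be used to connect to other types of networks or remote computer systems (not shown).
With the computer device provided by this embodiment of the present invention, after the query partitions of the multiple data tables are determined to be the same according to their query partition key values, the received table data query request is sent to the database corresponding to the query partition, and the final query result returned by that database is then sent to the application. This process requires no secondary processing of the multiple data tables, which improves table data query efficiency, reduces the amount of data transmitted, and improves service performance.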
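To improve table data query efficiency, reduce the amount of data transmitted during a table data query, and improve service performance, an embodiment of the present invention provides a table data query method based on the distributed database system architecture of FIG. 1. Referring to FIG. 3, the method flow provided by this embodiment includes:
301. The terminal sends a table data query request to the query node.
In one possible implementation, as a person skilled in the art will understand, when a running application needs to obtain table data that meets its requirements from multiple data tables because of a service requirement, the application may trigger the terminal to generate a table data query request and send it to the query node. The table data query request may be an SQL (Structured Query Language) request. SQL is a database query and programming language used to store and retrieve data in a database and to query, update, and manage the database.
302. On receiving the table data query request, the query node parses it to obtain query information, where the query information includes multiple data table identifiers and query parameters.
Because a distributed database system stores a large number of data tables, a table data query request generally includes several key fields so that the required table data can be located. Using these key fields, the query node parses the table data query request to obtain the query information. The query information includes the operation type, the number of data tables involved, the data table identifiers, the query parameters, and so on. Operation types generally include insert, delete, update, and query. The query parameters are the basis for querying the data tables and include at least the query partition keys of the multiple data tables and the association conditions between the multiple data tables; a query partition key is a partition key obtained by parsing the table data query request.
For example, suppose the table data query request received by the query node is the SQL statement SELECT * FROM T1 WHERE T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=`?` AND T2.PARTITION_KEY=`?`. Parsing this request yields the following query information: the operation type is query; two data tables are involved; the two data table identifiers are T1 and T2; and the query parameters are T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=`?` AND T2.PARTITION_KEY=`?`. Here T1.ITEM_ID=T2.ITEM_ID is the association condition between data tables T1 and T2, T1.PARTITION_KEY is the partition key of data table T1, and T2.PARTITION_KEY is the partition key of data table T2.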
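The parsing in step 302 can be sketched as follows. This is only a minimal illustration in Python, not the parser used by the DDS: the function name `parse_query_info`, the regular expressions, the single-quoted `'?'` placeholder, and the `FROM T1, T2` form of the statement are all assumptions made for the example.

```python
import re

def parse_query_info(sql):
    """Extract a simplified form of the query information from one SQL string."""
    # Column references look like T1.ITEM_ID; the prefix is taken as the table identifier.
    tables = sorted(set(re.findall(r"\b(\w+)\.\w+", sql)))
    where = sql.split("WHERE", 1)[1] if "WHERE" in sql else ""
    conditions = [c.strip() for c in re.split(r"\bAND\b", where) if c.strip()]
    # A condition of the form Tn.PARTITION_KEY='?' names a query partition key.
    partition_keys = [c.split("=")[0].strip() for c in conditions
                      if re.fullmatch(r"\w+\.PARTITION_KEY\s*=\s*'\?'", c)]
    # A condition that compares two columns is an association condition.
    join_conditions = [c for c in conditions
                       if re.fullmatch(r"\w+\.\w+\s*=\s*\w+\.\w+", c)]
    return {
        "operation": sql.split()[0].upper(),
        "tables": tables,
        "partition_keys": partition_keys,
        "join_conditions": join_conditions,
    }

info = parse_query_info(
    "SELECT * FROM T1, T2 WHERE T1.ITEM_ID=T2.ITEM_ID "
    "AND T1.PARTITION_KEY='?' AND T2.PARTITION_KEY='?'"
)
```

A production parser would work on a real SQL grammar; the regular expressions here only cover statements shaped like the example.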
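303. The query node generates multiple sub-data-table query requests according to the query information and the table data query request.
Based on the parsed query information, the query node may generate multiple sub-data-table query requests from the data table identifiers and the table data query request. The number of generated sub-data-table query requests equals the number of data table identifiers, and each sub-data-table query request corresponds to one data table and can be used to query the table data of that data table.
For example, suppose the table data query request received by the query node is the SQL statement SELECT * FROM T1 WHERE T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=`?` AND T2.PARTITION_KEY=`?`, and the query node parses it to obtain the query information. From the received table data query request and the data table identifiers T1 and T2, the query node generates two sub-data-table query requests: SELECT * FROM T1 WHERE T1.PARTITION_KEY=`?` and SELECT * FROM T2 WHERE T2.PARTITION_KEY=`?`. If data table T1 is split by partition key value into three sub-tables stored in databases DB1, DB2, and DB3 respectively, then during the query SELECT * FROM T1 WHERE T1.PARTITION_KEY=`?` can be used to query the three sub-tables of T1 from DB1, DB2, and DB3 respectively.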
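Step 303 can be illustrated with a small Python sketch. The helper name `build_sub_queries` and the single-quoted `'?'` placeholder are assumptions for the example; the text only requires one sub-data-table query request per data table identifier.

```python
def build_sub_queries(tables, partition_keys):
    """Return one single-table query per data table identifier."""
    # Map each table identifier to its partition key, if the request names one.
    keyed = {key.split(".")[0]: key for key in partition_keys}
    sub_queries = {}
    for table in tables:
        query = f"SELECT * FROM {table}"
        if table in keyed:
            # Keep the placeholder; the key value arrives later (step 304).
            query += f" WHERE {keyed[table]}='?'"
        sub_queries[table] = query
    return sub_queries

# A request in which only T1 carries a partition key condition.
subs = build_sub_queries(["T1", "T2"], ["T1.PARTITION_KEY"])
```

A table without a partition key condition gets an unfiltered sub-query, which corresponds to fetching its sub-tables from every partition database.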
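304. The terminal sends the parameter values of the query parameters to the query node, where the parameter values include at least the query partition key values of the multiple data tables.
In this embodiment, the table data query request that the terminal sends to the query node carries only the query parameters, so the query node cannot find the required table data from the request alone. To target the query at the required table data, the application also triggers the terminal, during the query, to send the query node the parameter values of the query parameters, which include the query partition key values of the multiple data tables.
305. On receiving the parameter values of the query parameters, the query node determines the query partitions of the multiple data tables according to their query partition key values.
In a distributed database system, partition key values take several forms, including field values and hash values computed from field values. To simplify later queries, the query node may store the correspondence between partition key values and partitions when the data tables are partitioned for storage. On receiving the parameter values of the query parameters, the query node can therefore look up the query partitions of the multiple data tables in that correspondence using their query partition key values.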
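A minimal Python sketch of such a lookup is shown below. The database names, the explicit mapping table, and the choice of SHA-256 as the hash are assumptions for illustration; the text only states that a key value may be a field value or a hash of a field value and that the query node stores the correspondence between key values and partitions.

```python
import hashlib

# Hypothetical partition databases.
DATABASES = ["DB1", "DB2", "DB3"]

def partition_for(key_value, explicit_map=None):
    """Return the partition database for a partition key value."""
    # Stored correspondence between key values and partitions, if one exists.
    if explicit_map and key_value in explicit_map:
        return explicit_map[key_value]
    # Otherwise fall back to hashing the field value (an assumed scheme).
    digest = hashlib.sha256(str(key_value).encode("utf-8")).hexdigest()
    return DATABASES[int(digest, 16) % len(DATABASES)]
```

Because the hash is deterministic, the same key value always maps to the same database, which is what lets the query node compare the partitions of several tables.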
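306. The query node judges whether the query partitions of the multiple data tables are the same. If so, step 307 is performed; if not, step 310 is performed.
To judge whether the query partitions of the multiple data tables are the same, the query node may use the following steps 3061 to 3064:
3061. The query node judges whether the query partition keys of the multiple data tables are all associated with one another.
In this embodiment, an association relationship between the query partition keys of multiple data tables means that the query partition keys are connected by an association symbol such as the equals sign. If the query partition keys of the multiple data tables are all connected by association symbols such as the equals sign, the query partition keys are determined to be associated.
3062. If the query partition keys of the multiple data tables are all associated with one another, the query node determines that the query partitions of the multiple data tables are the same.
When the query partition key values in the query parameters show that the query partition keys of the multiple data tables are all connected by association symbols, for example all connected by equals signs, the query node can determine that the query partitions of the multiple data tables are the same.
For example, the query node parses the received table data query request to obtain the partition key of data table T1 and the partition key of data table T2, and the query partition key value in the parameter values sent by the application is T1.PARTITION_KEY=T2.PARTITION_KEY. Because the query partition keys of T1 and T2 are all associated, the query partitions of T1 and T2 can be determined to be the same.
3063. If the query partition keys of the multiple data tables are not all associated and the query partition key values of the multiple data tables are equal, the query node determines that the query partitions of the multiple data tables are the same.
When the query partition keys of the multiple data tables are not all associated (for example, none of them are associated, or some are associated and some are not), the query node judges whether the query partition key values of the multiple data tables are equal. If they are equal and point to a single database, the query node can determine that the query partitions of the multiple data tables are the same.
For example, the query node parses the received table data query request and finds that the query partition key of data table T1 is T1.PARTITION_KEY=`?` and that of data table T2 is T2.PARTITION_KEY=`?`. In the parameter values sent by the application, the partition key value of T1 is DB1 and the partition key value of T2 is DB1. Although the query partition keys of T1 and T2 are not associated, the query partition key values of both tables are DB1, so the query partition key values are equal and the query partitions of the two data tables can be determined to be the same.
3064. If the query partition keys of the multiple data tables are not all associated and the query partition key values of the multiple data tables are not all equal, the query node determines that the query partitions of the multiple data tables are different.
If the query partition keys of the multiple data tables are not all associated (that is, none of them are associated, or some are associated and some are not), and the query partition key values are not all equal (that is, none are equal, or some are equal and some are not), the query partitions of the multiple data tables can be determined to be different. In summary, the query partitions are determined to be different in the following cases: none of the query partition keys are associated and none of the query partition key values are equal; none of the query partition keys are associated, and some query partition key values are equal while others are not; some query partition keys are associated while others are not, and none of the query partition key values are equal; some query partition keys are associated while others are not, and some query partition key values are equal while others are not.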
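The decision in steps 3061 to 3064 can be sketched in Python as follows. The function names and the data shapes (a table-to-key dictionary, a list of key pairs joined by "=", and a key-to-value dictionary) are assumptions chosen for the illustration, and "all associated" is read here as "every key is reachable from every other key through equality links".

```python
def all_keys_linked(keys, links):
    """True if every key is connected to the others through '=' links."""
    if not keys:
        return False
    seen, frontier = {keys[0]}, [keys[0]]
    while frontier:
        current = frontier.pop()
        for a, b in links:
            for src, dst in ((a, b), (b, a)):
                if src == current and dst not in seen:
                    seen.add(dst)
                    frontier.append(dst)
    return set(keys).issubset(seen)

def partitions_same(tables, key_of, links, value_of):
    """Apply steps 3061-3064 to decide whether all tables share one query partition."""
    keys = [key_of.get(table) for table in tables]
    # Step 3062: all partition keys are associated with one another.
    if None not in keys and all_keys_linked(keys, links):
        return True
    # Step 3063: keys not all associated, but every key value is known and equal.
    values = [value_of.get(key) for key in keys]
    if None not in values and len(set(values)) == 1:
        return True
    # Step 3064: otherwise the query partitions differ.
    return False

K1, K2 = "T1.PARTITION_KEY", "T2.PARTITION_KEY"
both = {"T1": K1, "T2": K2}
```

A table with no partition key in the request, or a key with no received value, falls through to the "different" branch, which matches the treatment of T2 in Example 1 of the description.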
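307. The query node sends the table data query request to the target database.
The target database is the database corresponding to the shared query partition. When the query partitions of the multiple data tables are determined to be the same, the query node may send the table data query request over a wired or wireless network to the target data node hosting the target database.
308. The target data node performs the query according to the table data query request to obtain the final query result.
On receiving the table data query request, the target data node queries the target database for the table data that meets the requirements, according to the parameter values of the query parameters, to obtain the final query result. The final query result is the table data that the application wants to obtain.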
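The following Python sketch illustrates why no secondary processing is needed in this case: when both sub-tables live in the same database, that database can evaluate the whole multi-table query itself. An in-memory SQLite database stands in for the target database DB1; the columns NAME and PRICE, the sample rows, and the `FROM T1, T2` form are assumptions for the example.

```python
import sqlite3

def make_db1():
    """Build a stand-in for target database DB1 holding sub-tables of T1 and T2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T1 (ITEM_ID INTEGER, NAME TEXT, PARTITION_KEY TEXT)")
    conn.execute("CREATE TABLE T2 (ITEM_ID INTEGER, PRICE INTEGER, PARTITION_KEY TEXT)")
    conn.executemany("INSERT INTO T1 VALUES (?, ?, ?)",
                     [(1, "pen", "DB1"), (2, "ink", "DB1")])
    conn.executemany("INSERT INTO T2 VALUES (?, ?, ?)",
                     [(2, 5, "DB1"), (3, 7, "DB1")])
    return conn

def run_on_target(conn, key_value):
    """Run the complete multi-table query inside the target database."""
    sql = ("SELECT T1.ITEM_ID, T1.NAME, T2.PRICE FROM T1, T2 "
           "WHERE T1.ITEM_ID = T2.ITEM_ID "
           "AND T1.PARTITION_KEY = ? AND T2.PARTITION_KEY = ?")
    # Parameter values are bound rather than concatenated into the SQL text.
    return conn.execute(sql, (key_value, key_value)).fetchall()

rows = run_on_target(make_db1(), "DB1")
```

Only the joined rows leave the database, so the query node forwards the result as is.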
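309. The target data node sends the final query result to the query node.
The target data node sends the final query result to the query node over a wired or wireless network.
310. The query node sends the multiple sub-data-table query requests, with the query partition key values added, to multiple partition databases.
When step 306 determines that the query partitions of the multiple data tables are different, the query node adds each query partition key value to the sub-data-table query request that corresponds to the same data table identifier and, according to the query partition key value, sends each sub-data-table query request to the partition data node hosting the partition database that corresponds to that partition key value.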
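A possible shape for the fan-out in step 310 is sketched below in Python. The function name `route_sub_queries`, the dictionaries used to describe key values and their databases, and the rule that a sub-query without a key value goes to every partition database (as in Example 1 of the description) are assumptions for the illustration.

```python
def route_sub_queries(sub_queries, key_values, value_to_db, all_dbs):
    """Return (database, sql) pairs describing where each sub-query is sent."""
    plan = []
    for table, sql in sub_queries.items():
        value = key_values.get(table)
        if value is None:
            # No partition key value for this table: ask every partition database.
            plan.extend((db, sql) for db in all_dbs)
        else:
            # Plain string substitution keeps the sketch short; a real system
            # would bind the value as a parameter instead.
            plan.append((value_to_db[value], sql.replace("'?'", f"'{value}'")))
    return plan

plan = route_sub_queries(
    {"T1": "SELECT * FROM T1 WHERE T1.PARTITION_KEY='?'", "T2": "SELECT * FROM T2"},
    {"T1": "DB1"},
    {"DB1": "DB1"},
    ["DB1", "DB2"],
)
```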
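311. The partition data node performs the query according to the sub-data-table query request to obtain an intermediate query result.
After the partition data node hosting a partition database receives a sub-data-table query request, it queries the sub-table stored in its local storage according to that request to obtain an intermediate query result. The intermediate query result includes the sub-table stored in the partition database or table data queried from the sub-table. Its exact content depends on the sub-data-table query request: if the request carries query parameters for the sub-table, the intermediate query result is the table data queried from the sub-table; if the request carries no query parameters for the sub-table, the intermediate query result is the sub-table stored in the partition database.
312. The partition data node sends the intermediate query result to the query node.
After obtaining the intermediate query result, the partition data node may send it to the query node over a wired or wireless network.
313. On receiving the intermediate query results sent by the multiple partition data nodes, the query node queries the multiple intermediate query results according to the parameter values of the query parameters to obtain the final query result.
Each intermediate query result returned by a partition data node is the query result for a single data table, whereas the terminal wants table data from multiple data tables that satisfies certain association conditions. Therefore, on receiving the intermediate query results sent by the multiple partition data nodes, the query node queries them according to the parameter values of the query parameters (in practice, mainly according to the association conditions between the multiple data tables) to obtain the final query result.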
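The secondary query of step 313 can be pictured as a join performed on the query node. The Python sketch below uses a simple hash join on one shared column; the function name `join_intermediate`, the row format (dictionaries), and the T1./T2. prefixes on output columns are assumptions, since the text does not specify how the query node combines the intermediate results.

```python
def join_intermediate(rows_t1, rows_t2, column):
    """Join two intermediate results on a shared column (association condition)."""
    # Index the second result by the join column.
    index = {}
    for row in rows_t2:
        index.setdefault(row[column], []).append(row)
    joined = []
    for left in rows_t1:
        for right in index.get(left[column], []):
            # Prefix column names so values from both tables stay distinguishable.
            joined.append({**{f"T1.{k}": v for k, v in left.items()},
                           **{f"T2.{k}": v for k, v in right.items()}})
    return joined

final = join_intermediate(
    [{"ITEM_ID": 1, "NAME": "a"}, {"ITEM_ID": 2, "NAME": "b"}],
    [{"ITEM_ID": 2, "PRICE": 5}, {"ITEM_ID": 3, "PRICE": 7}],
    "ITEM_ID",
)
```

This is the work, and the data transfer, that the single-database path of steps 307 to 309 avoids.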
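314. The query node sends the final query result to the terminal.
Whether the final query result was sent by the target data node or obtained by querying multiple intermediate query results, the query node sends it to the terminal over a wired or wireless network, and the terminal forwards it to the application so that the application can provide the corresponding service to the user. With this approach, the query node does not need to perform secondary processing on the received data tables, which saves storage space on the query node and greatly improves query efficiency.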
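To make the table data query method above easier to understand, several concrete examples are given below.
Example 1. In the distributed database system, the data of data tables T1 and T2 is stored in both database DB1 and database DB2. While running, the application needs table data from T1 and T2 that meets certain requirements because of a service requirement, so it triggers the terminal to send the query node the SQL request: SELECT * FROM T1 WHERE T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=`?`. The query node parses the SQL request and finds that two data tables are involved, that their identifiers are T1 and T2, and that the query parameters are T1.ITEM_ID, T2.ITEM_ID, and T1.PARTITION_KEY=`?`. From the data table identifiers and the table data query request, the query node generates two sub-data-table query requests: SELECT * FROM T1 WHERE T1.PARTITION_KEY=`?` and SELECT * FROM T2. During the query, the query partition key value received by the query node is T1.PARTITION_KEY=`DB1`, and the query node determines the query partitions of T1 and T2 from the query partition key values. Because the query partition keys of T1 and T2 are not associated, the query node cannot tell from the keys alone whether the query partitions of T1 and T2 are the same, so it compares their query partition key values. The query partition key value of T1 is `DB1`, whereas T2 has no definite query partition key value, so the query partition key values of T1 and T2 are determined to be different. The query node sends SELECT * FROM T1 WHERE T1.PARTITION_KEY=`DB1` to the partition data node hosting DB1, which returns its stored sub-table of T1. The query node also sends SELECT * FROM T2 to the partition data nodes hosting DB1 and DB2, which return their stored sub-tables of T2. The query node receives the sub-tables returned by the partition data nodes hosting DB1 and DB2, queries all the received sub-tables according to T1.ITEM_ID=T2.ITEM_ID to obtain the final query result, and then sends the final query result to the terminal, which forwards it to the application.
Example 2. In the distributed database system, the data of data tables T1 and T2 is stored in both database DB1 and database DB2. While running, the application needs table data from T1 and T2 that satisfies certain conditions because of a service requirement, so it triggers the terminal to send the query node the SQL request: SELECT * FROM T1 WHERE T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=T2.PARTITION_KEY AND T1.PARTITION_KEY=`DB1`. The query node parses the SQL request and finds that two data tables are involved, that their identifiers are T1 and T2, and that the query parameters are T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=T2.PARTITION_KEY AND T1.PARTITION_KEY=`?`. From the data table identifiers and the table data query request, the query node generates two sub-data-table query requests: SELECT * FROM T1 WHERE T1.PARTITION_KEY=`?` and SELECT * FROM T2. During the query, the query partition key value received by the query node is T1.PARTITION_KEY=T2.PARTITION_KEY. Because the query partition keys of T1 and T2 are associated by T1.PARTITION_KEY=T2.PARTITION_KEY, the query partitions of T1 and T2 can be determined to be the same. The query node then sends the SQL query request to the target data node hosting DB1, which returns the final query result. On receiving the final query result, the query node sends it to the terminal, which forwards it to the application.
Example 3. In the distributed database system, the data of data tables T1 and T2 is stored in both database DB1 and database DB2. While running, the application needs table data from T1 and T2 that satisfies certain conditions because of a service requirement, so it triggers the terminal to send the query node the SQL request: SELECT * FROM T1 WHERE T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=`?` AND T2.PARTITION_KEY=`?`. The query node parses the SQL request and finds that two data tables are involved, that their identifiers are T1 and T2, and that the query parameters are T1.ITEM_ID=T2.ITEM_ID AND T1.PARTITION_KEY=`?` AND T2.PARTITION_KEY=`?`. From the data table identifiers and the table data query request, the query node generates two sub-data-table query requests: SELECT * FROM T1 WHERE T1.PARTITION_KEY=`?` and SELECT * FROM T2 WHERE T2.PARTITION_KEY=`?`. During the query, the query partition key values received by the query node are T1.PARTITION_KEY=`DB1` and T2.PARTITION_KEY=`DB1`. The query partition key of T1 is T1.PARTITION_KEY and that of T2 is T2.PARTITION_KEY; because these keys are not associated, the query node cannot tell from the keys alone whether the query partitions of T1 and T2 are the same, so it goes on to compare their query partition key values. The query partition key value of T1 is `DB1` and that of T2 is `DB1`; the two are equal, so the query partitions of T1 and T2 can be determined to be the same. The query node sends the SQL query request to the target data node hosting DB1, which returns the final query result. On receiving the final query result, the query node sends it to the terminal, which forwards it to the application.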
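With the method provided by this embodiment of the present invention, after the query partitions of the multiple data tables are determined to be the same according to their query partition key values, the received table data query request is sent to the database corresponding to the query partition, and the final query result returned by that database is then sent to the application. This process requires no secondary processing of the multiple data tables, which improves table data query efficiency, reduces the amount of data transmitted, and improves service performance.
Referring to FIG. 4, an embodiment of the present invention provides a table data query apparatus, which includes:
a receiving module 401, configured to receive a table data query request sent by an application;
a parsing module 402, configured to parse the table data query request to obtain query information, where the query information includes multiple data table identifiers and query parameters, and the query parameters include at least the query partition keys of multiple data tables;
the receiving module 401 is further configured to receive the parameter values of the query parameters sent by the application, where the parameter values include at least the query partition key values of the multiple data tables;
a determining module 403, configured to determine the query partitions of the multiple data tables according to their query partition key values;
a judging module 404, configured to judge whether the query partitions of the multiple data tables are the same;
a sending module 405, configured to send the table data query request to a target database if the query partitions of the multiple data tables are the same, where the target database returns the final query result and is the database corresponding to the shared query partition;
the sending module 405 is further configured to send the final query result to the application if the final query result is received.
In another embodiment of the present invention, the judging module 404 is further configured to judge whether the query partition keys of the multiple data tables are all associated with one another; if they are all associated, determine that the query partitions of the multiple data tables are the same; if they are not all associated and the query partition key values of the multiple data tables are all equal, determine that the query partitions are the same; and if they are not all associated and the query partition key values are not all equal, determine that the query partitions are different.
Referring to FIG. 5, in another embodiment of the present invention, the apparatus further includes:
a query request generating module 406, configured to generate multiple sub-data-table query requests according to the query information and the table data query request, where the number of sub-data-table query requests equals the number of data table identifiers and each sub-data-table query request is used to query one data table.
Referring to FIG. 6, in another embodiment of the present invention, the sending module 405 is further configured to, if the query partitions of the multiple data tables are different, send the multiple sub-data-table query requests with the query partition key values added to multiple partition databases, where each partition database returns an intermediate query result, the partition database is the database corresponding to a query partition key value, and the intermediate query result includes a sub-table stored in the partition database or table data queried from the sub-table;
the receiving module 401 is further configured to receive the intermediate query results sent by the multiple partition databases;
a query module 407, configured to query the multiple intermediate query results according to the parameter values of the query parameters to obtain the final query result;
the sending module 405 is further configured to send the final query result to the application.
In summary, with the apparatus provided by this embodiment of the present invention, after the query partitions of the multiple data tables are determined to be the same according to their query partition key values, the received table data query request is sent to the database corresponding to the query partition, and the final query result returned by that database is then sent to the application. This process requires no secondary processing of the multiple data tables, which improves table data query efficiency, reduces the amount of data transmitted, and improves service performance.
It should be noted that when the table data query apparatus provided by the foregoing embodiment queries table data, the division into the functional modules described above is used only as an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the table data query apparatus and the computer device may be divided into different functional modules to complete all or part of the functions described above. In addition, the table data query apparatus, the computer device, and the table data query method provided by the foregoing embodiments belong to the same concept; the specific implementation process is described in detail in the method embodiment and is not repeated here.
A person of ordinary skill in the art may understand that all or some of the steps of the above embodiments may be implemented by hardware, or by a program instructing related hardware. The program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.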

Claims (12)

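  1. A table data query method, characterized in that the method comprises:
    receiving a table data query request sent by an application;
    parsing the table data query request to obtain query information, wherein the query information comprises a plurality of data table identifiers and query parameters, and the query parameters comprise at least query partition keys of a plurality of data tables;
    receiving parameter values of the query parameters sent by the application, wherein the parameter values comprise at least query partition key values of the plurality of data tables;
    determining query partitions of the plurality of data tables according to the query partition key values of the plurality of data tables;
    judging whether the query partitions of the plurality of data tables are the same;
    if the query partitions of the plurality of data tables are the same, sending the table data query request to a target database, wherein the target database returns a final query result and is the database corresponding to the same query partition;
    if the final query result is received, sending the final query result to the application.
  2. The method according to claim 1, characterized in that the judging whether the query partitions of the plurality of data tables are the same comprises:
    judging whether the query partition keys of the plurality of data tables are all associated with one another;
    if the query partition keys of the plurality of data tables are all associated with one another, determining that the query partitions of the plurality of data tables are the same;
    if the query partition keys of the plurality of data tables are not all associated with one another and the query partition key values of the plurality of data tables are all equal, determining that the query partitions of the plurality of data tables are the same;
    if the query partition keys of the plurality of data tables are not all associated with one another and the query partition key values of the plurality of data tables are not all equal, determining that the query partitions of the plurality of data tables are different.
  3. The method according to claim 1, characterized in that, before the receiving parameter values of the query parameters sent by the application, the method further comprises:
    generating a plurality of sub-data-table query requests according to the query information and the table data query request, wherein the number of the sub-data-table query requests is the same as the number of the data table identifiers, and each sub-data-table query request is used to query one data table.
  4. The method according to any one of claims 1 to 3, characterized in that, if the query partitions of the plurality of data tables are different, the method further comprises:
    sending the plurality of sub-data-table query requests, with the query partition key values added, to a plurality of partition databases, wherein each partition database returns an intermediate query result, the partition database is the database corresponding to a query partition key value, and the intermediate query result comprises a sub-table stored in the partition database or table data queried from the sub-table;
    receiving the intermediate query results sent by the plurality of partition databases;
    querying the plurality of intermediate query results according to the parameter values of the query parameters to obtain the final query result;
    sending the final query result to the application.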
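  5. A table data query apparatus, characterized in that the apparatus comprises:
    a receiving module, configured to receive a table data query request sent by an application;
    a parsing module, configured to parse the table data query request to obtain query information, wherein the query information comprises a plurality of data table identifiers and query parameters, and the query parameters comprise at least query partition keys of a plurality of data tables;
    the receiving module being further configured to receive parameter values of the query parameters sent by the application, wherein the parameter values comprise at least query partition key values of the plurality of data tables;
    a determining module, configured to determine query partitions of the plurality of data tables according to the query partition key values of the plurality of data tables;
    a judging module, configured to judge whether the query partitions of the plurality of data tables are the same;
    a sending module, configured to send the table data query request to a target database if the query partitions of the plurality of data tables are the same, wherein the target database returns a final query result and is the database corresponding to the same query partition;
    the sending module being further configured to send the final query result to the application if the final query result is received.
  6. The apparatus according to claim 5, characterized in that the judging module is configured to judge whether the query partition keys of the plurality of data tables are all associated with one another; if the query partition keys of the plurality of data tables are all associated with one another, determine that the query partitions of the plurality of data tables are the same;
    if the query partition keys of the plurality of data tables are not all associated with one another and the query partition key values of the plurality of data tables are all equal, determine that the query partitions of the plurality of data tables are the same;
    if the query partition keys of the plurality of data tables are not all associated with one another and the query partition key values of the plurality of data tables are not all equal, determine that the query partitions of the plurality of data tables are different.
  7. The apparatus according to claim 5, characterized in that the apparatus further comprises:
    a query request generating module, configured to generate a plurality of sub-data-table query requests according to the query information and the table data query request, wherein the number of the sub-data-table query requests is the same as the number of the data table identifiers, and each sub-data-table query request is used to query one data table.
  8. The apparatus according to any one of claims 5 to 7, characterized in that the sending module is further configured to, if the query partitions of the plurality of data tables are different, send the plurality of sub-data-table query requests, with the query partition key values added, to a plurality of partition databases, wherein each partition database returns an intermediate query result, the partition database is the database corresponding to a query partition key value, and the intermediate query result comprises a sub-table stored in the partition database or table data queried from the sub-table;
    the receiving module is further configured to receive the intermediate query results sent by the plurality of partition databases;
    a query module is configured to query the plurality of intermediate query results according to the parameter values of the query parameters to obtain the final query result;
    the sending module is further configured to send the final query result to the application.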
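  9. A computing device, characterized by comprising: a processor, a memory, a communication interface, and a bus, wherein the memory, the processor, and the communication interface are connected through the bus;
    the memory is configured to store computer instructions;
    the processor invokes, through the bus, the computer instructions stored in the memory to perform the following operations:
    receiving, by invoking the communication interface, a table data query request sent by an application;
    parsing the table data query request to obtain query information, wherein the query information comprises a plurality of data table identifiers and query parameters, and the query parameters comprise at least query partition keys of a plurality of data tables;
    receiving, by invoking the communication interface, parameter values of the query parameters sent by the application, wherein the parameter values comprise at least query partition key values of the plurality of data tables;
    determining query partitions of the plurality of data tables according to the query partition key values of the plurality of data tables;
    judging whether the query partitions of the plurality of data tables are the same;
    if the query partitions of the plurality of data tables are the same, sending, by invoking the communication interface, the table data query request to a target database, wherein the target database returns a final query result and is the database corresponding to the same query partition;
    if the final query result is received, sending, by invoking the communication interface, the final query result to the application.
  10. The computing device according to claim 9, characterized in that the processor invokes, through the bus, the computer instructions stored in the memory to further perform the following operations:
    if the query partition keys of the plurality of data tables are all associated with one another, determining that the query partitions of the plurality of data tables are the same;
    if the query partition keys of the plurality of data tables are not all associated with one another and the query partition key values of the plurality of data tables are all equal, determining that the query partitions of the plurality of data tables are the same;
    if the query partition keys of the plurality of data tables are not all associated with one another and the query partition key values of the plurality of data tables are not all equal, determining that the query partitions of the plurality of data tables are different.
  11. The computing device according to claim 9, characterized in that the processor invokes, through the bus, the computer instructions stored in the memory to further perform the following operation:
    generating a plurality of sub-data-table query requests according to the query information and the table data query request, wherein the number of the sub-data-table query requests is the same as the number of the data table identifiers, and each sub-data-table query request is used to query one data table.
  12. The computing device according to any one of claims 9 to 11, characterized in that the processor invokes, through the bus, the computer instructions stored in the memory to further perform the following operations:
    if the query partitions of the plurality of data tables are different, sending, by invoking the communication interface, the plurality of sub-data-table query requests, with the query partition key values added, to a plurality of partition databases, wherein each partition database returns an intermediate query result, the partition database is the database corresponding to a query partition key value, and the intermediate query result comprises a sub-table stored in the partition database or table data queried from the sub-table;
    receiving, by invoking the communication interface, the intermediate query results sent by the plurality of partition databases;
    querying the plurality of intermediate query results according to the parameter values of the query parameters to obtain the final query result;
    sending, by invoking the communication interface, the final query result to the application.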
PCT/CN2017/091217 2016-08-31 2017-06-30 Table data query method and apparatus WO2018040722A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610799750.9 2016-08-31
CN201610799750.9A CN107784044B (zh) 2016-08-31 2016-08-31 Table data query method and apparatus

Publications (1)

Publication Number Publication Date
WO2018040722A1 true WO2018040722A1 (zh) 2018-03-08

Family

ID=61300139

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/091217 WO2018040722A1 (zh) 2016-08-31 2017-06-30 Table data query method and apparatus

Country Status (2)

Country Link
CN (1) CN107784044B (zh)
WO (1) WO2018040722A1 (zh)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108959510B (zh) * 2018-06-27 2022-04-19 北京奥星贝斯科技有限公司 一种分布式数据库的分区级连接方法和装置
CN110874383B (zh) * 2018-08-30 2023-05-05 阿里云计算有限公司 数据处理方法、装置及电子设备
CN109582696B (zh) * 2018-10-09 2023-07-04 北京奥星贝斯科技有限公司 扫描任务的生成方法及装置、电子设备
CN110287213B (zh) * 2019-07-03 2023-02-17 中通智新(武汉)技术研发有限公司 基于olap系统的数据查询方法、装置及系统
WO2022006794A1 (en) * 2020-07-08 2022-01-13 Alibaba Group Holding Limited Routing directives for partitioned databases
CN112948382A (zh) * 2021-02-26 2021-06-11 平安科技(深圳)有限公司 基于大数据的信息处理方法、装置及相关设备

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831120A (zh) * 2011-06-15 2012-12-19 腾讯科技(深圳)有限公司 一种数据处理方法及系统
US20150302035A1 (en) * 2014-04-17 2015-10-22 Oracle International Corporation Partial indexes for partitioned tables
CN103995879A (zh) * 2014-05-27 2014-08-20 华为技术有限公司 基于olap系统的数据查询方法、装置及系统
CN105512200A (zh) * 2015-11-26 2016-04-20 华为技术有限公司 一种分布式数据库处理的方法和设备

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109254966A (zh) * 2018-08-23 2019-01-22 平安科技(深圳)有限公司 数据表查询方法、装置、计算机设备及存储介质
CN112541057A (zh) * 2019-09-04 2021-03-23 上海晶赞融宣科技有限公司 分布式新词发现方法、装置、计算机设备和存储介质
CN111639140A (zh) * 2020-06-08 2020-09-08 杭州复杂美科技有限公司 分布式数据存储方法、设备和存储介质
CN111651424A (zh) * 2020-06-10 2020-09-11 中国科学院深圳先进技术研究院 一种数据处理方法、装置、数据节点及存储介质
CN111651424B (zh) * 2020-06-10 2024-05-03 中国科学院深圳先进技术研究院 一种数据处理方法、装置、数据节点及存储介质
CN111708848B (zh) * 2020-06-12 2024-02-23 北京思特奇信息技术股份有限公司 一种数据查询方法、系统及电子设备
CN111708848A (zh) * 2020-06-12 2020-09-25 北京思特奇信息技术股份有限公司 一种数据查询方法、系统及电子设备
CN112182028A (zh) * 2020-09-29 2021-01-05 北京人大金仓信息技术股份有限公司 基于分布式数据库的表的数据行数查询方法和装置
CN113760981A (zh) * 2021-01-13 2021-12-07 北京京东乾石科技有限公司 一种数据查询方法和装置
CN113568924A (zh) * 2021-07-23 2021-10-29 北京达佳互联信息技术有限公司 一种数据处理方法、装置、电子设备及存储介质
CN113568924B (zh) * 2021-07-23 2024-05-14 北京达佳互联信息技术有限公司 一种数据处理方法、装置、电子设备及存储介质
CN115292356B (zh) * 2022-07-21 2023-06-16 中电金信软件有限公司 数据查询方法、装置及电子设备
CN115292356A (zh) * 2022-07-21 2022-11-04 中电金信软件有限公司 数据查询方法、装置及电子设备

Also Published As

Publication number Publication date
CN107784044A (zh) 2018-03-09
CN107784044B (zh) 2020-02-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17845025

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17845025

Country of ref document: EP

Kind code of ref document: A1