WO2021031407A1 - Method, apparatus, electronic device, and storage medium for interactive data query between heterogeneous databases - Google Patents

Method, apparatus, electronic device, and storage medium for interactive data query between heterogeneous databases

Info

Publication number
WO2021031407A1
WO2021031407A1 PCT/CN2019/118024 CN2019118024W WO2021031407A1 WO 2021031407 A1 WO2021031407 A1 WO 2021031407A1 CN 2019118024 W CN2019118024 W CN 2019118024W WO 2021031407 A1 WO2021031407 A1 WO 2021031407A1
Authority
WO
WIPO (PCT)
Prior art keywords
query
execution plan
data
heterogeneous database
logical
Prior art date
Application number
PCT/CN2019/118024
Other languages
English (en)
French (fr)
Inventor
倪程伟
汪涛
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021031407A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Definitions

  • This application relates to the field of big data technology, and in particular to a method and device, electronic equipment, and computer-readable storage medium for realizing interactive query of data between heterogeneous databases.
  • a heterogeneous database system is a collection of multiple related databases. Although the structures of these databases are different from each other, they can realize data sharing and transparent access between the databases.
  • the technical department needs to provide data exchange support, which, coupled with other processes such as information security approval, leaves the business department unable to perform business data analysis efficiently.
  • an object of the present application is to provide a method and device, electronic equipment, and computer-readable storage medium for realizing interactive query of data between heterogeneous databases.
  • a method for realizing interactive query of data between heterogeneous databases includes: a heterogeneous database system receives, through a set scheduling node, a data query request initiated by a query client, the heterogeneous database system being a collection of several heterogeneous databases; through the scheduling node, the query statement in the data query request is parsed into a logical execution plan, and the logical execution plan is processed in a distributed manner to obtain several sub-logic execution plans for the heterogeneous databases, the logical execution plan being a general execution logic for executing data queries on the heterogeneous databases; by performing state detection on the working nodes set by the heterogeneous database system, the sub-logic execution plans are assigned to the working nodes in the idle state, so that each working node executes the data query of the heterogeneous database according to its assigned sub-logic execution plan; and after the working nodes return the result data of the data query to the scheduling node for summary, the summarized result data is returned to the query client through the scheduling node.
  • an apparatus for realizing interactive data query between heterogeneous databases includes: a query request receiver configured to control the heterogeneous database system to receive, through the set scheduling node, the data query request initiated by the query client, the heterogeneous database system being a collection of several heterogeneous databases; a logical execution plan converter configured to parse the query statement in the data query request into a logical execution plan through the scheduling node, and to perform distributed processing on the logical execution plan to obtain several sub-logic execution plans for the heterogeneous databases, the logical execution plan being a general execution logic for executing data queries on the heterogeneous databases; a logical execution plan executor configured to perform state detection on the working nodes set by the heterogeneous database system and to assign the sub-logic execution plans to the working nodes in the idle state, so that each working node executes the data query of the target heterogeneous database according to its assigned sub-logic execution plan; and a query result obtainer configured to, after the working nodes return the result data of the data query to the scheduling node for summary, return the summarized result data to the query client through the scheduling node.
  • an electronic device includes a processor and a memory, and computer-readable instructions are stored on the memory. When the computer-readable instructions are executed by the processor, the above-mentioned method for realizing interactive query of data between heterogeneous databases is implemented.
  • a computer-readable storage medium has a computer program stored thereon, and when the computer program is executed by a processor, the method for realizing interactive data query between heterogeneous databases as described above is realized.
  • the heterogeneous database system parses, through the scheduling node, the query statement in the data query request into a logical execution plan that is common to the different heterogeneous databases, and converts the logical execution plan into several sub-logic execution plans for the individual heterogeneous databases, so that the working nodes execute the data queries of the corresponding heterogeneous databases according to the sub-logic execution plans.
  • the heterogeneous database system also detects the status of each working node, and allocates each sub-logic execution plan to idle working nodes, thereby realizing the reasonable configuration of heterogeneous database system resources.
  • the heterogeneous database system disclosed in this application performs the conversion and allocation of logical execution plans through the scheduling node, and performs the data queries of the heterogeneous databases through the working nodes, without data exchange between heterogeneous databases, thereby realizing efficient query of data between heterogeneous databases.
  • Fig. 1 is a schematic diagram showing an implementation environment involved in this application according to an exemplary embodiment
  • Fig. 2 is a schematic diagram showing a heterogeneous database system according to an exemplary embodiment
  • Fig. 3 is a flow chart showing a method for implementing interactive data query between heterogeneous databases according to an exemplary embodiment
  • FIG. 4 is a flowchart illustrating step 220 according to the embodiment corresponding to FIG. 3;
  • FIG. 5 is a flowchart illustrating step 230 according to the embodiment corresponding to FIG. 3;
  • Fig. 6 is a flow chart showing a method for realizing interactive query of data between heterogeneous databases according to an exemplary embodiment
  • Fig. 7 is a block diagram of a device for implementing interactive data query between heterogeneous databases according to an exemplary embodiment
  • Fig. 8 is a hardware block diagram of an electronic device according to an exemplary embodiment.
  • Fig. 1 is a schematic diagram showing an implementation environment involved in this application according to an exemplary embodiment. As shown in FIG. 1, the implementation environment includes a query client 100 and a query server 200.
  • a wired or wireless network connection is established in advance between the query client 100 and the query server 200 to realize the interaction between the query client 100 and the query server 200.
  • the query client 100 is used to provide a user interaction interface for the user to query data from the query server 200 and display the query result.
  • the user interaction interface provided by the query client 100 is provided with an entry for inputting query instructions, and the user inputs query information such as query keywords, and the result data obtained by the query can be correspondingly displayed on the user interaction interface.
  • the query client 100 may be an electronic device such as a smart phone, a tablet computer, a notebook computer, or a computer, and the number thereof is not limited (only two are shown in FIG. 1).
  • the user interaction interface provided by the query client 100 may be a browser page or an APP (application) page, which is not limited here.
  • the query server 200 is deployed with a heterogeneous database system, which is a collection of multiple related databases with different architectures.
  • the heterogeneous database system may include common databases such as Oracle, MySQL, and PostgreSQL.
  • although the architecture of each database is different, the data stored in the databases are associated with each other, so it is usually necessary to perform an interactive query across the databases and return the queried data set to the query client 100 as the query result. As a result, data sharing and transparent access between heterogeneous databases can be realized.
  • the query server 200 can be a server with several related databases, or the query server 200 can also be a server cluster composed of several servers.
  • the database structures set by different servers are different, and there is no restriction here.
  • Fig. 2 is a schematic diagram showing a heterogeneous database system according to an exemplary embodiment.
  • the heterogeneous database system is set on the query server 200 to realize data sharing and transparent access between the underlying databases.
  • the heterogeneous database system exposes the API (Application Programming Interface) of the scheduling node, so that the query client can send a data query request to the query server by calling the API of the scheduling node.
  • HDFS Distributed File System
  • there are a scheduling node and several working nodes in the heterogeneous database system.
  • the scheduling node is used to analyze and process the received data query request, obtain the logical execution plan common to the heterogeneous databases, convert the logical execution plan into several sub-logic execution plans, and allocate the sub-logic execution plans to the working nodes so that the working nodes execute the data queries of the heterogeneous databases. It should be understood that scheduling nodes and working nodes refer to applications deployed by the heterogeneous database system that can independently perform designated tasks.
  • a backup scheduling node and backup working nodes are set accordingly, and a data transfer module is set to ensure the consistency of data between each node and its backup node.
  • when a scheduling node or working node fails, the backup scheduling node or the backup working node is automatically enabled to continue to perform the data query of the heterogeneous databases, so that the overall performance of the heterogeneous database system is not affected.
  • the heterogeneous database system performs the conversion and distribution of the logical execution plan through the scheduling node, and executes the data queries of the heterogeneous databases through the working nodes, without the need for data exchange between heterogeneous databases, thereby realizing efficient query of data between heterogeneous databases.
  • Fig. 3 is a flowchart of a method for realizing interactive query of data between heterogeneous databases according to an exemplary embodiment, and the method is suitable for the heterogeneous database system shown in Fig. 2. As shown in Figure 3, the method at least includes the following steps:
  • Step 210 The heterogeneous database system receives the data query request initiated by the query client through the set scheduling node.
  • the heterogeneous database system interacts with external devices through the scheduling node. Since the API corresponding to the scheduling node is exposed, external applications (such as query clients) can call the API to query data that is stored in association across the different databases of the heterogeneous database system.
  • after the query client obtains query information such as query keywords entered by the user, it processes the query information into a data query request and sends the data query request to the heterogeneous database system by calling the API exposed by the heterogeneous database system, and the data query request is received by the scheduling node in the heterogeneous database system.
  • the data query request is composed of the API name and parameters corresponding to the scheduling node, and specific query statements.
  • the query statement is an SQL (Structured Query Language) statement.
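  • As a minimal sketch of the request structure described above (API name, parameters, and a SQL query statement), the following Python snippet builds and serializes such a request. The class name DataQueryRequest, the API name "submit_query", the parameter key, and the table name are hypothetical and do not come from this application.

```python
from dataclasses import dataclass, field
import json


@dataclass
class DataQueryRequest:
    """Hypothetical request shape: API name, API parameters, and a SQL statement."""
    api_name: str
    sql: str
    params: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Serialize the request so it could be sent to the scheduling node's API.
        return json.dumps(
            {"api": self.api_name, "params": self.params, "sql": self.sql},
            sort_keys=True,
        )


request = DataQueryRequest(
    api_name="submit_query",            # assumed API name
    params={"timeout_seconds": 30},     # assumed parameter
    sql="SELECT id, name FROM a.customers WHERE id = 1",
)
payload = request.to_json()
```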
  • Step 220 Through the scheduling node, the query statement in the data query request is parsed into a logical execution plan, and the logical execution plan is processed in a distributed manner to obtain several sub-logic execution plans for heterogeneous databases.
  • the logical execution plan is the general execution logic for executing data queries on the underlying database.
  • the problem is solved by parsing the query statement in the data query request into a common logical execution plan between heterogeneous databases.
  • the query statement is a standard ANSI SQL query statement. This is a relatively basic and standard structured query language, which can be well transformed into a common logical execution plan for underlying heterogeneous databases.
  • the parsing of the query statement in the data query request at least includes the process of semantic analysis and logical plan analysis of the query statement, and logical plan optimization.
  • semantic analysis includes the process of performing syntax checking and semantic checking on query sentences respectively.
  • if the scheduling node finds a syntax error or semantic error in the query statement, it returns error information to the query client; after the query statement passes the semantic analysis, logical plan analysis can be performed on the query statement to obtain the initial logical execution plan.
  • because the initial logical execution plan contains some redundant information, it can be optimized by the plan optimizer to finally obtain a logical execution plan common to the various heterogeneous databases.
  • the semantic analysis of the query statement can be realized by a syntax analyzer, the conversion of the query statement into the initial logical execution plan can be realized by a logic analyzer, and the optimization of the initial logical execution plan can be realized by a plan optimizer.
  • the syntax analyzer, logic analyzer, and plan optimizer should each be understood as an application program that can independently perform specified processing tasks on query statements.
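  • The three stages above (semantic analysis, initial plan generation, plan optimization) can be pictured with the toy Python pipeline below. It is only a sketch under strong assumptions: it accepts nothing but "SELECT <column> FROM <table>", the plan is a plain list of tuples, and all function names are hypothetical rather than taken from this application.

```python
class QueryError(Exception):
    """Raised when the toy checker rejects a query statement."""


def semantic_analysis(sql: str, known_tables: set) -> list:
    # Syntax check: this toy accepts only "SELECT <column> FROM <table>".
    tokens = sql.strip().rstrip(";").split()
    if len(tokens) != 4 or tokens[0].upper() != "SELECT" or tokens[2].upper() != "FROM":
        raise QueryError("syntax error")
    # Semantic check: the referenced table must be known.
    if tokens[3] not in known_tables:
        raise QueryError(f"unknown table: {tokens[3]}")
    return tokens


def build_initial_plan(tokens: list) -> list:
    # The initial plan deliberately carries a redundant no-op step.
    return [("scan", tokens[3]), ("noop",), ("project", tokens[1])]


def optimize(plan: list) -> list:
    # Remove redundant steps from the initial plan.
    return [step for step in plan if step != ("noop",)]


def parse(sql: str, known_tables: set) -> list:
    # Semantic analysis -> initial logical plan -> optimized logical plan.
    return optimize(build_initial_plan(semantic_analysis(sql, known_tables)))
```

  • In this sketch, an error raised by semantic_analysis plays the role of the error information returned to the query client.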
  • Distributed processing of the logical execution plan is the process of converting the logical execution plan into several sub-logic execution plans.
  • the logical execution plan can be converted into a corresponding number of sub-logic execution plans according to the number of heterogeneous databases queried by the logical execution plan, and each sub-logic execution plan corresponds to executing the data query of one heterogeneous database. For example, suppose there are four heterogeneous databases A, B, C, and D in a heterogeneous database system.
  • if the logical execution plan contains only data queries for heterogeneous databases A and C, the logical execution plan can be converted into a sub-logic execution plan that executes the data query in heterogeneous database A and a sub-logic execution plan that executes the data query in heterogeneous database C.
  • each sub-logic execution plan can execute at least one type of data query on the same heterogeneous database.
  • the logical execution plan is split into several sub-logical execution plans. For example, a certain sub-logic execution plan executes related data queries on heterogeneous databases A, B, and C.
  • since the sub-logic execution plans are obtained through the distributed processing of the logical execution plan, the sub-logic execution plans are likewise common to the various underlying heterogeneous databases in the heterogeneous database system.
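  • The per-database split described above can be sketched in Python as follows. The plan representation (a list of dictionaries with a "database" key) and the function name split_by_database are illustrative assumptions; the application does not specify a data structure.

```python
from collections import defaultdict


def split_by_database(logical_plan: list) -> dict:
    """Group plan operations by target database; each group is one sub-plan."""
    sub_plans = defaultdict(list)
    for operation in logical_plan:
        sub_plans[operation["database"]].append(operation)
    return dict(sub_plans)


# A logical plan that touches only databases A and C (out of A, B, C, D).
logical_plan = [
    {"database": "A", "op": "scan", "table": "orders"},
    {"database": "C", "op": "scan", "table": "customers"},
    {"database": "A", "op": "filter", "predicate": "amount > 100"},
]
sub_plans = split_by_database(logical_plan)
```

  • Here the plan yields two sub-plans, one for database A and one for database C, matching the example in the text.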
  • Step 230 By performing state detection on the working nodes set by the heterogeneous database system, the sub-logic execution plan is correspondingly allocated to each working node in the idle state, so that the working node executes the data query of the heterogeneous database according to the allocated sub-logic execution plan.
  • the working node is used to perform data query of the underlying heterogeneous database in the heterogeneous database system. If the working node is executing data query of the heterogeneous database, it means that the working node is in the working state, otherwise the working node is in the idle state.
  • the currently idle working nodes of the heterogeneous database system can be obtained, and these idle working nodes can be used to execute the sub-logic execution plan to be allocated.
  • these working nodes can execute the data query corresponding to the heterogeneous database according to the allocated sub-logic execution plan.
  • for example, if a sub-logic execution plan described in step 220 performs related data queries on heterogeneous databases A, B, and C, the assigned working node executes the related data queries on heterogeneous databases A, B, and C, respectively.
  • Step 240 After the working node returns the result data of the data query to the scheduling node for summary, the scheduling node returns the summarized result data to the query client.
  • after the working node executes the assigned sub-logic execution plan and obtains the result data, it returns the result data to the scheduling node.
  • the scheduling node summarizes the result data according to the order of the received result data, and the result data obtained after the aggregation is the query result set corresponding to the data query request sent by the query client.
  • the scheduling node returns the aggregated result data to the query client according to the original path of the data query request, so that the query client can display the aggregated result data accordingly, thereby facilitating users to obtain query results.
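  • A minimal Python sketch of the summary step, assuming each working node returns a list of rows and the scheduling node simply concatenates them in arrival order. The function name summarize and the row format are hypothetical.

```python
def summarize(results_in_arrival_order: list) -> list:
    """Concatenate per-worker result rows in the order the results arrived."""
    summary = []
    for _worker_id, rows in results_in_arrival_order:
        summary.extend(rows)
    return summary


# worker-2 happened to finish first, so its rows come first in the result set.
arrivals = [
    ("worker-2", [{"id": 7, "source": "C"}]),
    ("worker-1", [{"id": 1, "source": "A"}, {"id": 2, "source": "A"}]),
]
result_set = summarize(arrivals)
```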
  • the heterogeneous database system, by setting a scheduling node, parses the query statement in the data query request into a logical execution plan that is common to the different heterogeneous databases, and converts the logical execution plan into several sub-logic execution plans for the different heterogeneous databases, so that the working nodes execute the data queries of the heterogeneous databases according to the sub-logic execution plans. Because there is no need to exchange data between heterogeneous databases in the heterogeneous database system, efficient data query between heterogeneous databases can be realized.
  • the heterogeneous database system also detects the status of each working node and allocates each sub-logic execution plan to idle working nodes, thereby realizing the reasonable configuration of heterogeneous database system resources.
  • the interactive query process of data between heterogeneous databases in this embodiment is all executed in memory, which can avoid redundant disk reads and writes and delays, and improve data query performance.
  • FIG. 4 is a flowchart of step 220 in an exemplary embodiment in the embodiment corresponding to FIG. 3. As shown in Figure 4, the process of optimizing the initial logic execution plan includes at least the following steps:
  • Step 221 For each logical clause in the initial logical execution plan, a locally optimized logical clause is obtained by rewriting the equivalent predicate and simplifying the specified conditions.
  • the predicates contained in the logical clauses of the initial logical execution plan can include common predicates such as LIKE, BETWEEN ... AND, IN, and OR.
  • Rewriting the equivalent predicate of a logical clause refers to converting a predicate contained in the logical clause into another predicate expression, provided the preset predicate conversion rules are satisfied; the new predicate obtained by the conversion is called an equivalent predicate.
  • the simplification of the specified conditions of a logical clause means that, when there is no aggregate function in the logical clause, the HAVING condition and the WHERE condition in the logical clause are combined, and redundant brackets and other redundant information in the logical clause are removed.
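  • To make the local optimization concrete, the Python sketch below applies two well-known equivalences (BETWEEN to a pair of comparisons, IN to a chain of equalities joined by OR) and merges HAVING into WHERE when no aggregate function is present. The tuple-based predicate encoding and the function names are assumptions for illustration; the application does not list its conversion rules.

```python
def rewrite_predicate(pred: tuple) -> tuple:
    """Rewrite BETWEEN and IN into equivalent comparison predicates."""
    kind = pred[0]
    if kind == "between":
        _, column, low, high = pred
        # x BETWEEN low AND high  ==  x >= low AND x <= high
        return ("and", (">=", column, low), ("<=", column, high))
    if kind == "in":
        _, column, values = pred
        # x IN (v1, v2)  ==  x = v1 OR x = v2
        return ("or",) + tuple(("=", column, value) for value in values)
    return pred  # no rule applies; keep the predicate unchanged


def merge_having(where: list, having: list, has_aggregate: bool) -> tuple:
    """Fold HAVING conditions into WHERE when the clause has no aggregate function."""
    if has_aggregate:
        return where, having
    return where + having, []
```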
  • Step 222 Eliminate outer joins and nested joins between locally optimized logical clauses, and obtain an association-optimized initial logical execution plan.
  • the outer joins between logical clauses include at least one of left outer join, right outer join, and full outer join. Eliminating an outer join refers to converting the outer join between logical clauses into an inner join. It should be noted that converting the join relationship between logical clauses into inner joins can effectively improve the speed of the query operations corresponding to the logical clauses.
  • Step 223 Obtain a logical execution plan by performing semantic optimization on the association-optimized initial logical execution plan.
  • the semantic optimization of the association-optimized initial logical execution plan obtained in step 222 may include moving the grouping operation up or down.
  • Moving the grouping operation up means that the grouping operation in the initial logical execution plan is executed later. If the join operation can filter out most of the tuples, performing the join first and the grouping afterwards improves the efficiency of the grouping operation.
  • Moving the grouping operation down means that the grouping operation in the initial logical execution plan is executed first.
  • the grouping operation can greatly reduce the number of relational tuples. If the grouping operation can be performed before the join, the join efficiency will be improved.
  • performing semantic optimization on the association-optimized initial logical execution plan may also include eliminating unnecessary sorting operations in the initial logical execution plan, to avoid the sorting itself and the extra operations it causes.
  • in this way, the initial logical execution plan can be optimized, for example by eliminating redundant information in the initial logical execution plan and optimizing the join relationships between the logical clauses, to obtain a better logical execution plan.
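  • The effect of moving the grouping operation down can be illustrated with the small Python example below. It assumes a join on a unique key (customer_id) and a SUM aggregate, a case where grouping before the join gives the same result as grouping after it; the tables and function names are made up for the illustration.

```python
from collections import defaultdict

orders = [(1, 10), (1, 20), (2, 5), (2, 15), (2, 30)]   # (customer_id, amount)
customers = {1: "Ann", 2: "Bo"}                          # customer_id -> name (unique key)


def join_then_group(orders, customers):
    # Join every order row with its customer, then sum per customer.
    joined = [(cid, customers[cid], amount) for cid, amount in orders if cid in customers]
    totals = {}
    for cid, name, amount in joined:
        previous = totals.get(cid, (name, 0))
        totals[cid] = (name, previous[1] + amount)
    return totals, len(orders)          # order-side tuples entering the join


def group_then_join(orders, customers):
    # Sum per customer first, then join the (much smaller) grouped relation.
    sums = defaultdict(int)
    for cid, amount in orders:
        sums[cid] += amount
    result = {cid: (customers[cid], total) for cid, total in sums.items() if cid in customers}
    return result, len(sums)            # order-side tuples entering the join


late_result, late_join_input = join_then_group(orders, customers)
early_result, early_join_input = group_then_join(orders, customers)
```

  • Both plans return the same totals, but the grouped plan feeds fewer order-side tuples into the join.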
  • FIG. 5 is a flowchart of step 230 in an exemplary embodiment in the embodiment corresponding to FIG. 3. As shown in FIG. 5, step 230 includes at least the following steps:
  • Step 231 Add all the working nodes in the heterogeneous database system to the thread pool, so that the threads enabled in the thread pool respectively detect the status of each working node.
  • the thread pool is a form of multi-threaded processing in which multiple threads execute the processing tasks added to a queue.
  • the threads in the thread pool respectively perform the status detection of each working node, so that the status of each working node is obtained in real time.
  • the number of threads can be set according to the number of working nodes set in the heterogeneous database system, so as to meet the working nodes' need for multithreading.
  • Step 232 Assign the sub-logic execution plan to each working node in the idle state.
  • the sub-logic execution plans may be allocated to the idle working nodes.
  • the sub-logic execution plan can be allocated until the last sub-logic execution plan is allocated to the working node.
  • the sub-logic execution plans can also be assigned to the detected idle work nodes in sequence according to the execution sequence.
  • the threads enabled in the thread pool detect the status of the working nodes in real time, which makes it convenient to allocate the sub-logic execution plans to the working nodes.
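  • Steps 231 and 232 can be sketched with Python's standard concurrent.futures.ThreadPoolExecutor as shown below. The Worker class, its is_idle probe, and the assign function are hypothetical stand-ins; a real system would probe the working nodes over the network, and this sketch assumes at least one working node exists.

```python
from concurrent.futures import ThreadPoolExecutor


class Worker:
    """Hypothetical working node with a busy flag."""

    def __init__(self, name: str, busy: bool):
        self.name = name
        self.busy = busy

    def is_idle(self) -> bool:
        # Stand-in for a real status probe of the working node.
        return not self.busy


def assign(sub_plans: list, workers: list) -> dict:
    # One thread per working node probes its status concurrently;
    # pool.map returns the results in the same order as `workers`.
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        idle_flags = list(pool.map(Worker.is_idle, workers))
    idle = [worker for worker, flag in zip(workers, idle_flags) if flag]
    # Hand out sub-plans to idle working nodes in execution order.
    assignment = {}
    for plan, worker in zip(sub_plans, idle):
        worker.busy = True
        assignment[plan] = worker.name
    return assignment
```

  • If there are more sub-plans than idle working nodes, the remaining sub-plans stay unassigned in this sketch and would need another allocation round.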
  • Fig. 6 is a flowchart of a method for realizing interactive query of data between heterogeneous databases according to another exemplary embodiment. As shown in FIG. 6, before step 210, the method further includes the following steps:
  • Step 310 Receive the account information sent by the query client through the scheduling node, and verify the account information.
  • the account information sent by the query client may include the user name and password for logging in to the interface of the query client, and the account information sent by the query client is verified to realize the access authority control of the heterogeneous database system.
  • the verification of the account information sent by the client is implemented through LDAP (Lightweight Directory Access Protocol).
  • the query client sends the logged-in user name and password to the scheduling node. After the scheduling node receives the user name and password, the user name and password are verified through the configured LDAP service.
  • the account information allowed to access the heterogeneous database system is stored in the directory tree in advance, and each node in the directory tree is a piece of account information. Since the LDAP service is dynamic, it can dynamically update the account information allowed to access the heterogeneous database system.
  • when the scheduling node receives the account information sent by the query client, it traverses the directory tree to check whether that account information is stored in the directory tree. If it is, the account information passes the verification; otherwise, it is deemed to have failed the verification.
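  • The directory-tree lookup can be pictured with the toy Python traversal below. This is not an LDAP client: the DirectoryNode class and verify function are hypothetical, and the plain-text password comparison is for illustration only, since the application delegates verification to a configured LDAP service.

```python
class DirectoryNode:
    """Hypothetical directory-tree node holding at most one account entry."""

    def __init__(self, username=None, password=None, children=None):
        self.username = username
        self.password = password
        self.children = children or []


def verify(root: DirectoryNode, username: str, password: str) -> bool:
    # Depth-first traversal of the directory tree.
    stack = [root]
    while stack:
        node = stack.pop()
        if node.username is not None and node.username == username and node.password == password:
            return True   # matching account entry found: verification passes
        stack.extend(node.children)
    return False          # no matching entry: verification fails
```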
  • Step 320 After the account information is verified, the scheduling node opens permission for the query client to access the heterogeneous database system.
  • opening the query client's permission to access the heterogeneous database system means that the scheduling node responds to the query client's access and executes the data queries of the underlying heterogeneous databases of the heterogeneous database system according to the data query request sent by the query client.
  • if the account information fails the verification, the query client does not have the authority to access the heterogeneous database system, and the heterogeneous database system does not open the API corresponding to the scheduling node, so the query client cannot send data query requests to the scheduling node.
  • the access authority control of the heterogeneous database system can be realized, and the security of the access of the heterogeneous database system can be increased.
  • the method for implementing interactive data query between heterogeneous databases further includes the following steps:
  • the account information of the query client is queried in the preset configuration file of the heterogeneous database system to obtain the query authority of the query client for the heterogeneous database system, so as to execute the data query of the heterogeneous database system according to the query authority.
  • for example, in a heterogeneous database system, heterogeneous databases A, B, and C are set to store ordinary business data, and heterogeneous database D is set to store important business data.
  • Ordinary business personnel are limited to querying heterogeneous databases A, B, and C, while business managers can also query the important business data in heterogeneous database D.
  • as another example, heterogeneous databases A and B are set to store the business data of the first business department, and heterogeneous databases C and D are set to store the business data of the second business department.
  • the staff of each department can only access their own department’s business data.
  • in this way, role control over the query authority for business data can be performed.
  • the preset configuration file refers to the query authority list preset by the heterogeneous database system for different business personnel.
  • the configuration file specifies the target heterogeneous database that can perform query operations for each account information, and the types of query operations that can be performed on the target database.
  • after the scheduling node obtains the data query request, it matches the current user's account information against the configuration file to obtain the current user's query authority for each heterogeneous database, and thereby the query authority of the current query client for the heterogeneous database system.
  • the logical execution plan obtained by the scheduling node by parsing the query statement contains only the data query execution logic on the heterogeneous database authorized to query.
  • the sub-logic execution plan executed by the working node is also oriented to the heterogeneous database allowed by the query authority, and the role control of the query client to query the heterogeneous database system is realized.
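  • A minimal Python sketch of the permission lookup, assuming the preset configuration file has been loaded into a dictionary that maps each account to the databases and operation types it may use. The PERMISSIONS table, account names, and function name are hypothetical.

```python
# Hypothetical permission list: account -> database -> allowed operation types.
PERMISSIONS = {
    "staff01": {"A": {"select"}, "B": {"select"}, "C": {"select"}},
    "manager01": {"A": {"select"}, "B": {"select"}, "C": {"select"}, "D": {"select"}},
}


def authorized_operations(account: str, logical_plan: list) -> list:
    """Keep only the plan operations the account may run on their target database."""
    allowed = PERMISSIONS.get(account, {})
    return [
        operation for operation in logical_plan
        if operation["type"] in allowed.get(operation["database"], set())
    ]


plan = [
    {"database": "A", "type": "select"},
    {"database": "D", "type": "select"},
]
```

  • With this table, an ordinary staff account keeps only the query on database A, while a manager account keeps both queries.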
  • Fig. 7 shows a device for realizing interactive data query between heterogeneous databases according to an exemplary embodiment.
  • the device includes a query request receiver 410, a logical execution plan converter 420, a logical execution plan executor 430, and a query result obtainer 440.
  • the query request receiver 410 is configured to control the heterogeneous database system to receive the data query request initiated by the query client through the set scheduling node, and the heterogeneous database system is a collection of several heterogeneous databases.
  • the logical execution plan converter 420 is configured to parse the query statement in the data query request into a logical execution plan through the scheduling node, and perform distributed processing on the logical execution plan to obtain several sub-logic execution plans for heterogeneous databases.
  • the logical execution plan is the general execution logic for executing data queries on heterogeneous databases.
  • the logical execution plan executor 430 is configured to detect the status of the working nodes set by the heterogeneous database system and assign the sub-logic execution plans to the respective working nodes in the idle state, so that the working nodes execute data queries on the target heterogeneous databases according to the assigned sub-logic execution plans.
  • the query result obtainer 440 is configured to, after the working nodes return the result data of the executed data queries to the scheduling node for aggregation, return the aggregated result data to the query client through the scheduling node.
  • the logical execution plan converter 420 includes a semantic analyzer, an initial plan obtainer, and an initial plan optimizer.
  • the semantic analyzer is configured to perform semantic analysis on standard ANSI SQL query statements.
  • the initial plan obtainer is configured to perform logical execution plan analysis on the standard ANSI SQL query statements output by the semantic analyzer to obtain the initial logical execution plan.
  • the initial plan optimizer is configured to obtain a logical execution plan by optimizing the initial logical execution plan.
  • the initial plan optimizer includes a local optimizer, an associative optimizer, and a semantic optimizer.
  • the local optimizer is configured to obtain a locally optimized logical clause by rewriting the equivalent predicate and simplifying the specified conditions for each logical clause in the initial logical execution plan.
  • the associative optimizer is configured to eliminate outer joins and nested joins between locally optimized logical clauses, and obtain an association-optimized initial logical execution plan.
  • the semantic optimizer is configured to obtain the logical execution plan by performing semantic optimization on the association-optimized initial logical execution plan.
  • the logical execution plan executor 430 includes a multi-thread detector and a plan allocator.
  • the multi-thread detector is configured to add all the worker nodes in the heterogeneous database system to the thread pool so that the multi-threads enabled in the thread pool detect the status of each worker node separately.
  • the plan allocator is configured to allocate the sub-logic execution plan to each work node in the idle state.
  • the device further includes an account information verifier and an access authority enforcer.
  • the account information verifier is configured to receive the account information sent by the query client through the scheduling node, and verify the account information.
  • the access authority enforcer is configured so that, after the account information is verified, the scheduling node opens the query client's authority to access the heterogeneous database system.
  • the device further includes an access role controller.
  • the access role controller is configured to look up the account information of the query client in the preset configuration file of the heterogeneous database system and obtain the query authority of the query client for the heterogeneous databases, so that data queries on the heterogeneous database system are executed according to the query authority.
  • the present application further provides an electronic device, which includes:
  • a processor; and a memory on which computer-readable instructions are stored, where, when the computer-readable instructions are executed by the processor, the method for realizing interactive data query between heterogeneous databases as described above is realized.
  • Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment.
  • the electronic device may be specifically implemented as the query server 200 in the implementation environment shown in FIG. 1.
  • the electronic device is only an example adapted to this application, and cannot be considered as providing any restriction on the scope of use of this application.
  • the electronic device also cannot be interpreted as needing to depend on, or necessarily having, one or more components of the exemplary electronic device shown in FIG. 8.
  • the electronic device includes: a power supply 610, an interface 630, at least one memory 650, and at least one central processing unit (CPU, Central Processing Units) 670.
  • the power supply 610 is used to provide working voltage for each hardware device on the electronic device.
  • the interface 630 includes at least one wired or wireless network interface 631, at least one serial-to-parallel conversion interface 633, at least one input/output interface 635, at least one USB interface 637, etc., for communicating with external devices.
  • the memory 650 can be read-only memory, random access memory, magnetic disks or optical discs, etc.
  • the resources stored on it include operating system 651, application programs 653 or data 655, etc.
  • the storage method can be short-term storage or permanent storage.
  • the operating system 651 is used to manage and control the hardware devices and application programs 653 on the electronic device so that the central processing unit 670 can compute and process the massive data 655. It can be Windows Server™, Mac OS X™, Unix™, Linux™, etc.
  • the application program 653 is a computer program that completes at least one specific task on top of the operating system 651. It may include at least one module (not shown in FIG. 8), and each module may contain a series of computer-readable instructions for the electronic device.
  • the data 655 may be interface metadata stored in a disk or the like.
  • the central processing unit 670 may include one or more processors, and is configured to communicate with the memory 650 via a bus, and is used for computing and processing the massive data 655 in the memory 650.
  • as described in detail above, the electronic device applicable to this application completes the method for realizing interactive data query between heterogeneous databases as described above by having the central processing unit 670 read a series of computer-readable instructions stored in the memory 650.
  • the application can also be implemented by hardware circuits or hardware circuits in combination with software instructions. Therefore, implementation of the application is not limited to any specific hardware circuits, software, and combinations of both.
  • the present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the method for realizing interactive data query between heterogeneous databases as described above is realized.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

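This application discloses a method and a device for realizing interactive data query between heterogeneous databases, and relates to the field of big data technology. The method includes: a heterogeneous database system receives, through a configured scheduling node, a data query request initiated by a query client; through the scheduling node, the query statement in the data query request is parsed into a logical execution plan, and distributed processing is performed on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases; by detecting the status of the working nodes configured in the heterogeneous database system, the sub-logic execution plans are correspondingly assigned to the working nodes in the idle state, so that the working nodes execute data queries on the heterogeneous databases according to the assigned sub-logic execution plans; after the working nodes return the result data of the data queries to the scheduling node for aggregation, the aggregated result data is returned to the query client through the scheduling node. This application enables efficient querying of data across heterogeneous databases.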

Description

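Method, device, electronic device, and storage medium for realizing interactive data query between heterogeneous databases
Technical Field
This application claims priority to Chinese patent application No. 201910759791.9, filed on August 16, 2019 and entitled "Method and related device for realizing interactive data query between heterogeneous databases", the entire contents of which are incorporated herein by reference.
This application relates to the field of big data technology, and in particular to a method and device, an electronic device, and a computer-readable storage medium for realizing interactive data query between heterogeneous databases.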
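Background
A heterogeneous database system is a collection of multiple related databases. Although the architectures of these databases differ from one another, data sharing and transparent access among the databases can be achieved.
The inventors realized that, because the data in the heterogeneous databases is interrelated, executing a data query on a heterogeneous database system requires first importing the data of each heterogeneous database into a single database by means of data exchange, and then performing the interactive data query across the heterogeneous databases within that database, which makes the query process very cumbersome. In actual business scenarios, if a business department wants to analyze data from the multiple business systems deployed by the company, it needs the technical department to provide data exchange support; together with other procedures such as information security approval, this prevents the business department from analyzing business data efficiently.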
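Summary of the Invention
Technical Problem
Therefore, how to query data across heterogeneous databases efficiently is a technical problem to be solved urgently.
Solution to the Problem
Technical Solution
To solve the above technical problem, one objective of this application is to provide a method and device, an electronic device, and a computer-readable storage medium for realizing interactive data query between heterogeneous databases.
The technical solutions adopted by this application are as follows: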
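In one aspect, a method for realizing interactive data query between heterogeneous databases includes: a heterogeneous database system receives, through a configured scheduling node, a data query request initiated by a query client, the heterogeneous database system being a collection of several heterogeneous databases; through the scheduling node, the query statement in the data query request is parsed into a logical execution plan, and distributed processing is performed on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases, the logical execution plan being general execution logic for executing data queries on the heterogeneous databases; by detecting the status of the working nodes configured in the heterogeneous database system, the sub-logic execution plans are correspondingly assigned to the working nodes in the idle state, so that the working nodes execute data queries on the heterogeneous databases according to the assigned sub-logic execution plans; and after the working nodes return the result data of the data queries to the scheduling node for aggregation, the aggregated result data is returned to the query client through the scheduling node.
In another aspect, a device for realizing interactive data query between heterogeneous databases includes: a query request receiver, configured to control a heterogeneous database system to receive, through a configured scheduling node, a data query request initiated by a query client, the heterogeneous database system being a collection of several heterogeneous databases; a logical execution plan converter, configured to parse, through the scheduling node, the query statement in the data query request into a logical execution plan, and to perform distributed processing on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases, the logical execution plan being general execution logic for executing data queries on the heterogeneous databases; a logical execution plan executor, configured to detect the status of the working nodes configured in the heterogeneous database system and correspondingly assign the sub-logic execution plans to the working nodes in the idle state, so that the working nodes execute data queries on the target heterogeneous databases according to the assigned sub-logic execution plans; and a query result obtainer, configured to, after the working nodes return the result data of the data queries to the scheduling node for aggregation, return the aggregated result data to the query client through the scheduling node.
In another aspect, an electronic device includes a processor and a memory, the memory storing computer-readable instructions which, when executed by the processor, implement the method for realizing interactive data query between heterogeneous databases as described above.
In another aspect, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the method for realizing interactive data query between heterogeneous databases as described above.
In the above technical solutions, the heterogeneous database system uses the scheduling node to parse the query statement in the data query request into a logical execution plan that is common to the different heterogeneous databases, and converts the logical execution plan into several sub-logic execution plans oriented to the individual heterogeneous databases, so that the working nodes can execute data queries on the corresponding heterogeneous databases according to the sub-logic execution plans. The heterogeneous database system also detects the status of each working node and assigns each sub-logic execution plan to a working node in the idle state, which achieves a reasonable allocation of the resources of the heterogeneous database system. Therefore, the heterogeneous database system disclosed in this application converts and assigns logical execution plans through the scheduling node and executes data queries on the heterogeneous databases through the working nodes, without any data exchange between the heterogeneous databases, thereby achieving efficient querying of data across heterogeneous databases.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit this application.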
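Advantageous Effects of the Invention
Brief Description of the Drawings
Description of the Drawings
The accompanying drawings are incorporated into and constitute a part of this specification, illustrate embodiments consistent with this application, and together with the specification serve to explain the principles of this application.
Fig. 1 is a schematic diagram of the implementation environment involved in this application according to an exemplary embodiment;
Fig. 2 is a schematic diagram of a heterogeneous database system according to an exemplary embodiment;
Fig. 3 is a flowchart of a method for realizing interactive data query between heterogeneous databases according to an exemplary embodiment;
Fig. 4 is a flowchart describing step 220 according to the embodiment corresponding to Fig. 3;
Fig. 5 is a flowchart describing step 230 according to the embodiment corresponding to Fig. 3;
Fig. 6 is a flowchart of a method for realizing interactive data query between heterogeneous databases according to an exemplary embodiment;
Fig. 7 is a block diagram of a device for realizing interactive data query between heterogeneous databases according to an exemplary embodiment;
Fig. 8 is a hardware block diagram of an electronic device according to an exemplary embodiment.
The above drawings show specific embodiments of this application, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the concept of this application in any way, but to explain the concept of this application to those skilled in the art by reference to specific embodiments.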
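Embodiments of the Invention
Modes for Carrying Out the Invention
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this application. Rather, they are merely examples of devices and methods consistent with some aspects of this application as detailed in the appended claims.
Fig. 1 is a schematic diagram of an implementation environment involved in this application according to an exemplary embodiment. As shown in Fig. 1, the implementation environment includes a query client 100 and a query server 200.
A wired or wireless network connection is established in advance between the query client 100 and the query server 200 to enable interaction between them.
The query client 100 provides a user interface through which a user queries data from the query server 200, and displays the query results. For example, the user interface provided by the query client 100 has an entry for inputting query instructions; after the user enters query information such as query keywords, the result data obtained by the query is displayed in the user interface.
For example, the query client 100 may be an electronic device such as a smartphone, a tablet computer, a laptop computer, or a desktop computer, and the number of query clients is not limited (Fig. 1 shows only two). The user interface provided by the query client 100 may be a browser page or an APP (Application) page, which is not limited here.
The query server 200 is deployed with a heterogeneous database system, which is a collection of multiple related databases with different architectures. For example, the heterogeneous database system may include common databases such as Oracle, MySQL, and PostgreSQL. Although the architectures of the databases differ, the data stored in them is related, so interactive queries across the databases are usually required in order to return the retrieved data set to the query client 100 as the query result. In this way, data sharing and transparent access can be achieved among the heterogeneous databases.
The query server 200 may be a single server on which several related databases are set up, or it may be a server cluster composed of several servers, with the databases on different servers having different architectures; this is not limited here.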
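Fig. 2 is a schematic diagram of a heterogeneous database system according to an exemplary embodiment. As described above, the heterogeneous database system is set up on the query server 200 to achieve data sharing and transparent access among the underlying databases.
As shown in Fig. 2, the heterogeneous database system exposes the API (Application Programming Interface) of the scheduling node, so that the query client sends data query requests to the query server by calling the API of the scheduling node.
Multiple related heterogeneous databases form a distributed file system (HDFS), which serves as the data layer of the heterogeneous database system and provides the data sources for data queries.
The heterogeneous database system is provided with a scheduling node and several working nodes. The scheduling node analyzes and processes the received data query request to obtain a logical execution plan common to the heterogeneous databases, converts the logical execution plan into several sub-logic execution plans, and assigns them to the working nodes, which then execute the data queries on the heterogeneous databases. It should be understood that the scheduling node and the working nodes are application programs deployed in the heterogeneous database system that can independently execute specified tasks.
In an exemplary embodiment, a backup scheduling node and backup working nodes are provided for the scheduling node and the working nodes in the heterogeneous database system, and a data transfer module is provided to keep the data of each node consistent with that of its backup node.
For example, if the current scheduling node or a working node malfunctions, the backup scheduling node or backup working node is automatically enabled to continue executing the data queries on the heterogeneous databases, so the overall performance of the heterogeneous database system is not affected.
Thus, the heterogeneous database system converts and assigns logical execution plans through the scheduling node and executes data queries on the heterogeneous databases through the working nodes, without any data exchange between the heterogeneous databases, thereby achieving efficient querying of data across heterogeneous databases.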
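Fig. 3 is a flowchart of a method for realizing interactive data query between heterogeneous databases according to an exemplary embodiment. The method is applicable to the heterogeneous database system shown in Fig. 2. As shown in Fig. 3, the method includes at least the following steps:
Step 210: the heterogeneous database system receives, through the configured scheduling node, a data query request initiated by the query client.
The heterogeneous database system exchanges data with external devices through the scheduling node. Because the API corresponding to the scheduling node is exposed, an external application (for example, the query client) can call this API to query the related data stored in the different databases of the heterogeneous database system.
After obtaining query information such as the query keywords entered by the user, the query client processes the query information into a data query request and sends the request to the heterogeneous database system by calling the API exposed by the system; the scheduling node in the heterogeneous database system receives the data query request.
For example, the data query request consists of the name and parameters of the API corresponding to the scheduling node and the specific query statement. The query statement is an SQL (Structured Query Language) statement.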
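Step 220: through the scheduling node, the query statement in the data query request is parsed into a logical execution plan, and distributed processing is performed on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases.
The logical execution plan is general execution logic for executing data queries on the underlying databases. Because different heterogeneous databases are constrained by their own query languages when executing data queries, existing implementations cannot perform an interactive query across heterogeneous databases directly from a data query request and must exchange data first. This embodiment solves the problem by parsing the query statement in the data query request into a logical execution plan that is common to the heterogeneous databases.
In an exemplary embodiment, the query statement is a standard ANSI SQL query statement. This is a relatively basic and standard structured query language, and it converts well into a logical execution plan common to the underlying heterogeneous databases.
After the scheduling node receives the data query request, parsing the query statement in the request includes at least semantic analysis of the query statement, logical plan analysis, and logical plan optimization.
Semantic analysis includes a syntax check and a semantic check of the query statement. If the scheduling node finds a syntax error or a semantic error in the query statement, it returns an error message to the query client. Once the query statement passes semantic analysis, logical plan analysis is performed on it to obtain an initial logical execution plan. Because the initial logical execution plan contains some redundant information, a plan optimizer can be used to optimize it, finally yielding a logical execution plan common to all the heterogeneous databases.
It should be noted that semantic analysis of the query statement may be implemented by a syntax analyzer, conversion of the query statement into the initial logical execution plan may be implemented by a logic analyzer, and optimization of the initial logical execution plan may be implemented by a plan optimizer. The syntax analyzer, logic analyzer, and plan optimizer should likewise be understood as application programs that can independently execute specified processing tasks on the query statement.
Distributed processing of the logical execution plan is the process of converting the logical execution plan into several sub-logic execution plans. For example, according to the number of heterogeneous databases queried by the logical execution plan, the logical execution plan can be converted into a corresponding number of sub-logic execution plans, each of which executes the data query on one heterogeneous database. Suppose the heterogeneous database system contains four heterogeneous databases A, B, C, and D. If the logical execution plan contains only data queries on heterogeneous databases A and C, it can be converted into one sub-logic execution plan that executes the data query in heterogeneous database A and another that executes the data query in heterogeneous database C. Note that each sub-logic execution plan may execute queries for at least one kind of data on the same heterogeneous database.
Alternatively, because the data stored in the heterogeneous databases is related, a query for certain data may depend on the query results of other heterogeneous databases. The logical execution plan can therefore be split into several sub-logic execution plans according to the dependencies between the data queries on the heterogeneous databases. For example, one sub-logic execution plan may execute the related data queries on heterogeneous databases A, B, and C.
It should be noted that, because the sub-logic execution plans are obtained by distributed processing of the logical execution plan, they should also be common to the underlying heterogeneous databases of the heterogeneous database system.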
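The per-database split described for step 220 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PlanStep` structure, its fields, and the sample operations are hypothetical and are not defined in the application, which does not specify how a logical execution plan is represented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanStep:
    """One step of a logical execution plan (hypothetical structure)."""
    database: str   # target heterogeneous database, e.g. "A"
    operation: str  # engine-neutral description of the query step


def split_by_database(plan):
    """Group the steps of a logical plan into one sub-plan per target database."""
    sub_plans = defaultdict(list)
    for step in plan:
        sub_plans[step.database].append(step)
    return dict(sub_plans)


# The system holds databases A, B, C and D, but this plan only touches A and C.
plan = [
    PlanStep("A", "scan orders where amount > 100"),
    PlanStep("C", "scan customers"),
    PlanStep("A", "project order_id, customer_id"),
]
sub_plans = split_by_database(plan)
print(sorted(sub_plans))  # ['A', 'C']
```

A plan that touches only A and C yields exactly two sub-plans, and a sub-plan may carry more than one query step for the same database, as the text notes.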
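Step 230: by detecting the status of the working nodes configured in the heterogeneous database system, the sub-logic execution plans are correspondingly assigned to the working nodes in the idle state, so that the working nodes execute data queries on the heterogeneous databases according to the assigned sub-logic execution plans.
The working nodes execute the data queries on the underlying heterogeneous databases of the heterogeneous database system. If a working node is currently executing a data query on a heterogeneous database, it is in the working state; otherwise it is in the idle state.
Status detection reveals which working nodes of the heterogeneous database system are currently idle, and these idle working nodes can be used to execute the sub-logic execution plans awaiting assignment. Once the sub-logic execution plans obtained by distributed processing are assigned to idle working nodes, those nodes execute the data queries on the corresponding heterogeneous databases according to their assigned plans.
For example, for the sub-logic execution plan described in step 220 that executes related data queries on heterogeneous databases A, B, and C, the assigned working node executes the related data queries on heterogeneous databases A, B, and C respectively.
Step 240: after the working nodes return the result data of the data queries to the scheduling node for aggregation, the aggregated result data is returned to the query client through the scheduling node.
After a working node executes its assigned sub-logic execution plan and obtains result data, it returns the result data to the scheduling node. The scheduling node aggregates the result data in the order in which it is received, and the aggregated result data is the query result set corresponding to the data query request sent by the query client.
The scheduling node returns the aggregated result data to the query client along the original path of the data query request, so that the query client can display the aggregated result data and the user can readily obtain the query result.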
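As a rough illustration of step 240, the sketch below lets each working node report its rows through a shared queue and has the scheduling node merge them in arrival order. The function names and the use of in-process threads are assumptions made for the example; the application does not say how working nodes communicate with the scheduling node.

```python
import queue
import threading


def worker(name, rows, results):
    """Hypothetical working node: 'executes' its sub-plan and reports the rows."""
    results.put((name, rows))


def aggregate(sub_results):
    """Run one worker thread per sub-plan and merge rows in arrival order."""
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(name, rows, results))
        for name, rows in sub_results.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    merged = []
    while not results.empty():
        _name, rows = results.get()
        merged.extend(rows)
    return merged


merged = aggregate({"w1": [("A", 1)], "w2": [("C", 2), ("C", 3)]})
# Arrival order can differ between runs, so sort only for display.
print(sorted(merged))
```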
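In this embodiment, the heterogeneous database system uses the scheduling node to parse the query statement in the data query request into a logical execution plan common to the different heterogeneous databases, and converts the logical execution plan into several sub-logic execution plans oriented to the individual heterogeneous databases, so that the working nodes can execute data queries on the heterogeneous databases according to the sub-logic execution plans. Because no data exchange between heterogeneous databases is needed within the heterogeneous database system, data across heterogeneous databases can be queried efficiently.
In addition, the heterogeneous database system detects the status of each working node and assigns each sub-logic execution plan to a working node in the idle state, which achieves a reasonable allocation of the resources of the heterogeneous database system.
Furthermore, the interactive query of data across heterogeneous databases in this embodiment is executed entirely in memory, which avoids redundant disk reads and writes and the associated latency and improves data query performance.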
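Fig. 4 is a flowchart of step 220 of the embodiment corresponding to Fig. 3 in an exemplary embodiment. As shown in Fig. 4, the process of optimizing the initial logical execution plan includes at least the following steps:
Step 221: for each logical clause in the initial logical execution plan, a locally optimized logical clause is obtained by rewriting equivalent predicates and simplifying specified conditions.
The predicates contained in the logical clauses of the initial logical execution plan may include common predicates such as like, between, and, in, and or.
Rewriting an equivalent predicate in a logical clause means converting a predicate contained in the clause into another predicate expression, provided that a preset predicate conversion rule is satisfied; the new predicate obtained by the conversion is called an equivalent predicate.
For example, suppose the logical clause is "sno between 10 and 20". Provided the between-and rule is satisfied, it can be rewritten as "sno>=10 and sno<=20", and and is called the equivalent predicate of between. Similarly, suppose the logical clause is "name like 'Abc%'". Provided the like rule is satisfied, it can be rewritten as "name>='Abc' and name<'Abd'", and here and is called the equivalent predicate of like.
Simplifying specified conditions in a logical clause means that, when the clause contains no aggregate function, the having condition in the clause is merged with the where condition, which also removes information such as redundant parentheses from the clause.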
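The between-and rewrite in step 221 can be illustrated with a small string-level sketch. Standard SQL defines BETWEEN as inclusive at both ends, so the rewritten form uses >= and <=. The regular expression and the function name `rewrite_between` are assumptions for illustration only; a real optimizer would rewrite the parsed plan tree rather than the SQL text.

```python
import re

# Matches "<column> between <low> and <high>" with simple, space-free operands.
BETWEEN = re.compile(r"(\w+)\s+between\s+(\S+)\s+and\s+(\S+)", re.IGNORECASE)


def rewrite_between(clause: str) -> str:
    """Rewrite an inclusive BETWEEN predicate as a pair of comparisons."""
    return BETWEEN.sub(r"\1 >= \2 and \1 <= \3", clause)


print(rewrite_between("sno between 10 and 20"))  # sno >= 10 and sno <= 20
```

Clauses that contain no between predicate pass through unchanged, so the rewrite can be applied to every logical clause in turn.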
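Step 222: outer joins and nested joins between the locally optimized logical clauses are eliminated to obtain an association-optimized initial logical execution plan.
The outer joins between logical clauses include at least one of left outer joins, right outer joins, and full outer joins. Eliminating an outer join means converting the outer join between logical clauses into an inner join. It should be noted that converting the join relationship between logical clauses into an inner join can effectively speed up the query operations corresponding to those clauses.
A nested join between logical clauses means that the join operations between the clauses are not performed one by one from left to right. It should be noted that eliminating a nested join between logical clauses means eliminating the parentheses in the nested join. For example, in the statement "select * from a join (b join c on b.b1=c.c1) on a.a1=b.b1 where a.a1>1;", removing the parentheses does not affect the semantics, so they can be eliminated.
Thus, optimizing the join relationships between the logical clauses yields the association-optimized initial logical execution plan.
Step 223: the logical execution plan is obtained by performing semantic optimization on the association-optimized initial logical execution plan.
The semantic optimization of the association-optimized initial logical execution plan obtained in step 222 may include moving grouping operations up or down.
Moving a grouping operation up means deferring the grouping operation in the initial logical execution plan. If the join operation can filter out most of the tuples, performing the join first and the grouping afterwards improves the efficiency of the grouping operation.
Moving a grouping operation down means performing the grouping operation in the initial logical execution plan earlier. A grouping operation can greatly reduce the number of tuples in a relation, so performing the grouping before the join improves the efficiency of the join.
In addition, the semantic optimization of the association-optimized initial logical execution plan may include eliminating unnecessary sort operations from the initial logical execution plan, avoiding sort operations or operations caused by sorting.
Thus, the method provided in this embodiment optimizes the initial logical execution plan, for example by eliminating redundant information in it and optimizing the join relationships between logical clauses, to obtain a better logical execution plan.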
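To make the idea of moving a grouping operation down more concrete, the sketch below uses Python's built-in `sqlite3` module to compare a join-then-group query with an equivalent group-then-join query. The tables, columns, and data are invented for the example; SQLite merely stands in for whichever engine evaluates the plan, and the application does not prescribe these queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customers (id integer primary key, name text);
    create table orders (customer_id integer, amount integer);
    insert into customers values (1, 'Li'), (2, 'Wang');
    insert into orders values (1, 10), (1, 20), (2, 5);
""")

# Grouping after the join: every order row takes part in the join.
join_then_group = conn.execute("""
    select c.name, sum(o.amount)
    from customers c join orders o on o.customer_id = c.id
    group by c.id, c.name
    order by c.id
""").fetchall()

# Grouping moved down: orders are collapsed per customer before the join.
group_then_join = conn.execute("""
    select c.name, t.total
    from customers c
    join (select customer_id, sum(amount) as total
          from orders group by customer_id) t
      on t.customer_id = c.id
    order by c.id
""").fetchall()

print(join_then_group == group_then_join)  # True
```

Both forms return the same rows; the second joins one pre-aggregated row per customer instead of one row per order, which is where the efficiency gain described above would come from on larger data.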
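Fig. 5 is a flowchart of step 230 of the embodiment corresponding to Fig. 3 in an exemplary embodiment. As shown in Fig. 5, step 230 includes at least the following steps:
Step 231: all the working nodes in the heterogeneous database system are added to a thread pool, so that the multiple threads enabled in the thread pool each detect the status of a working node.
A thread pool is a form of multi-threaded processing in which, once the pool is enabled, multiple threads each execute the processing tasks added to a queue. By adding all the working nodes in the heterogeneous database system to the thread pool and adding the tasks to be executed by the threads to the task queue, the threads in the pool each perform the status detection of a working node, so that the status of every working node is obtained in real time.
In an exemplary embodiment, because too many threads incur scheduling overhead that affects the overall performance of the heterogeneous database system, the number of threads can be set according to the number of working nodes configured in the heterogeneous database system, so as to meet the working nodes' demand for threads.
Step 232: the sub-logic execution plans are assigned to the respective working nodes in the idle state.
In an exemplary embodiment, the sub-logic execution plans may be assigned to the idle working nodes once the number of idle working nodes detected matches the number of sub-logic execution plans.
However, for a reasonable allocation of resources in the heterogeneous database system, a sub-logic execution plan may instead be assigned as soon as an idle working node is detected, until the last sub-logic execution plan has been assigned to a working node.
If the sub-logic execution plans must be executed in a particular order, they may also be assigned to the detected idle working nodes one by one in that order.
Thus, in this embodiment, all the working nodes of the heterogeneous database system are added to the thread pool so that the threads enabled in the pool detect the status of the working nodes in real time, which makes it very convenient to assign sub-logic execution plans to working nodes.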
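Steps 231 and 232 can be sketched with Python's standard `concurrent.futures.ThreadPoolExecutor`. The `Worker` class, its `busy` flag, and the pairing policy are hypothetical stand-ins: a real working node would be probed over the network, and the application leaves the probing mechanism open. The pool size is set to the number of working nodes, as the text suggests.

```python
from concurrent.futures import ThreadPoolExecutor


class Worker:
    """Hypothetical stand-in for a working node."""

    def __init__(self, name, busy):
        self.name = name
        self.busy = busy

    def is_idle(self):
        # A real node would be probed remotely; here the state is a flag.
        return not self.busy


def find_idle(workers):
    """Probe every worker in parallel and return the idle ones in input order."""
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        idle_flags = list(pool.map(Worker.is_idle, workers))
    return [w for w, idle in zip(workers, idle_flags) if idle]


def assign(sub_plans, workers):
    """Pair sub-plans with idle workers; unassigned plans wait for the next probe."""
    idle = find_idle(workers)
    return {w.name: plan for w, plan in zip(idle, sub_plans)}


workers = [Worker("w1", busy=True), Worker("w2", busy=False), Worker("w3", busy=False)]
assignment = assign(["plan-A", "plan-C"], workers)
print(assignment)  # {'w2': 'plan-A', 'w3': 'plan-C'}
```

Because `pool.map` returns results in input order, sub-plans that must run in a given sequence are paired with idle workers in that same sequence.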
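Fig. 6 is a flowchart of a method for realizing interactive data query between heterogeneous databases according to another exemplary embodiment. As shown in Fig. 6, before step 210 the method further includes the following steps:
Step 310: the account information sent by the query client is received through the scheduling node, and the account information is verified.
The account information sent by the query client may include the user name and password used to log in to the query client interface. Verifying the account information sent by the query client implements access control for the heterogeneous database system.
For example, the account information sent by the client is verified through LDAP (Lightweight Directory Access Protocol). When a user logs in through the query client interface, the query client sends the login user name and password to the scheduling node; after receiving them, the scheduling node verifies the user name and password through the configured LDAP service.
In the configured LDAP service, the account information permitted to access the heterogeneous databases is stored in advance in a directory tree, and each node in the directory tree is one piece of account information. Because the LDAP service is dynamic, the account information permitted to access the heterogeneous database system can be updated dynamically.
When the scheduling node receives the account information sent by the query client, it traverses the directory tree to check whether that account information is stored in the tree. If it is, the account information passes verification; otherwise it is regarded as having failed verification.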
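The directory-tree lookup in step 310 can be mimicked with an in-memory tree, as in the sketch below. It does not talk to a real LDAP server: the `Entry` structure, the unsalted SHA-256 digest, and the sample accounts are assumptions chosen only to keep the example self-contained, and a production deployment would instead authenticate against the configured LDAP service.

```python
import hashlib
from dataclasses import dataclass, field


def digest(password):
    """Illustrative only: unsalted SHA-256 is not suitable for real credentials."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


@dataclass
class Entry:
    """Hypothetical directory-tree node holding one account."""
    username: str
    password_digest: str
    children: list = field(default_factory=list)


def verify(root, username, password):
    """Depth-first traversal looking for an entry that matches the credentials."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.username == username and node.password_digest == digest(password):
            return True
        stack.extend(node.children)
    return False


root = Entry("", "", [Entry("alice", digest("s3cret")), Entry("bob", digest("hunter2"))])
print(verify(root, "alice", "s3cret"))  # True
print(verify(root, "alice", "wrong"))   # False
```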
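Step 320: after the account information passes verification, the scheduling node opens the query client's authority to access the heterogeneous database system.
When the account information passes verification, opening the query client's authority to access the heterogeneous database system means that the scheduling node responds to the query client's access and executes data queries on the underlying heterogeneous databases of the heterogeneous database system according to the data query request sent by the query client.
If the account information fails verification, the query client has no authority to access the heterogeneous database system; the heterogeneous database system does not open the API corresponding to the scheduling node, so the query client cannot send a data query request to the scheduling node.
Therefore, by configuring the LDAP service, this embodiment implements access control for the heterogeneous database system and makes access to the system more secure.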
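In another exemplary embodiment, after step 210, the method for realizing interactive data query between heterogeneous databases further includes the following step:
The account information of the query client is looked up in the preset configuration file of the heterogeneous database system to obtain the query authority of the query client for the heterogeneous database system, so that data queries on the heterogeneous database system are executed according to that query authority.
In actual business scenarios, different users often have different needs for business data, and for the sake of access security it is necessary to apply role control to the query authority for business data. For example, suppose heterogeneous databases A, B, and C in the heterogeneous database system store ordinary business data and heterogeneous database D stores important business data. Ordinary business personnel may only query the ordinary business data in heterogeneous databases A, B, and C, whereas business managers may also query the important business data in heterogeneous database D. Alternatively, heterogeneous databases A and B may store the business data of a first business department and heterogeneous databases C and D the business data of a second business department, with the staff of each department able to access only the business data of their own department.
The preset configuration file is a list of query authorities that the heterogeneous database system sets in advance for different business personnel. For example, the configuration file specifies, for each piece of account information, the target heterogeneous databases on which query operations may be performed and the types of query operations that may be performed on those databases.
When the scheduling node obtains the data query request, it matches the current user's account information against the configuration file to obtain the current user's query authority for each heterogeneous database, and thereby obtains the query authority of the current query client for the heterogeneous database system.
When data queries on the heterogeneous database system are executed according to the obtained query authority, the logical execution plan that the scheduling node obtains by parsing the query statement contains only the data query execution logic for the heterogeneous databases the client is authorized to query. Heterogeneous databases that the client is not authorized to query are filtered out when the scheduling node generates the logical execution plan. As a result, the sub-logic execution plans executed by the working nodes are also limited to the heterogeneous databases permitted by the query authority, which implements role control over the query client's data queries on the heterogeneous database system.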
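The role control described above amounts to filtering plan steps against a per-account permission table. The sketch below assumes a dictionary-shaped configuration and account names ("staff", "manager") that are purely illustrative; the application does not define the format of the configuration file.

```python
# Hypothetical configuration: account -> database -> permitted operation types.
PERMISSIONS = {
    "staff": {"A": {"select"}, "B": {"select"}, "C": {"select"}},
    "manager": {"A": {"select"}, "B": {"select"}, "C": {"select"}, "D": {"select"}},
}


def authorized_steps(account, steps):
    """Keep only the (database, operation) steps the account is allowed to run."""
    allowed = PERMISSIONS.get(account, {})
    return [(db, op) for db, op in steps if op in allowed.get(db, set())]


steps = [("A", "select"), ("D", "select")]
print(authorized_steps("staff", steps))    # [('A', 'select')]
print(authorized_steps("manager", steps))  # [('A', 'select'), ('D', 'select')]
```

An account missing from the table gets an empty plan, which matches the behavior of filtering unauthorized databases out before any sub-logic execution plan reaches a working node.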
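Fig. 7 shows a device for realizing interactive data query between heterogeneous databases according to an exemplary embodiment. As shown in Fig. 7, the device includes a query request receiver 410, a logical execution plan converter 420, a logical execution plan executor 430, and a query result obtainer 440.
The query request receiver 410 is configured to control the heterogeneous database system to receive, through the configured scheduling node, a data query request initiated by the query client; the heterogeneous database system is a collection of several heterogeneous databases.
The logical execution plan converter 420 is configured to parse, through the scheduling node, the query statement in the data query request into a logical execution plan, and to perform distributed processing on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases; the logical execution plan is general execution logic for executing data queries on the heterogeneous databases.
The logical execution plan executor 430 is configured to detect the status of the working nodes configured in the heterogeneous database system and correspondingly assign the sub-logic execution plans to the working nodes in the idle state, so that the working nodes execute data queries on the target heterogeneous databases according to the assigned sub-logic execution plans.
The query result obtainer 440 is configured to, after the working nodes return the result data of the data queries to the scheduling node for aggregation, return the aggregated result data to the query client through the scheduling node.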
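In another exemplary embodiment, the logical execution plan converter 420 includes a semantic analyzer, an initial plan obtainer, and an initial plan optimizer.
The semantic analyzer is configured to perform semantic analysis on the standard ANSI SQL query statement.
The initial plan obtainer is configured to perform logical execution plan analysis on the standard ANSI SQL query statement output by the semantic analyzer to obtain an initial logical execution plan.
The initial plan optimizer is configured to obtain the logical execution plan by optimizing the initial logical execution plan.
In another exemplary embodiment, the initial plan optimizer includes a local optimizer, an associative optimizer, and a semantic optimizer.
The local optimizer is configured to obtain, for each logical clause in the initial logical execution plan, a locally optimized logical clause by rewriting equivalent predicates and simplifying specified conditions.
The associative optimizer is configured to eliminate outer joins and nested joins between the locally optimized logical clauses to obtain an association-optimized initial logical execution plan.
The semantic optimizer is configured to obtain the logical execution plan by performing semantic optimization on the association-optimized initial logical execution plan.
In another exemplary embodiment, the logical execution plan executor 430 includes a multi-thread detector and a plan allocator.
The multi-thread detector is configured to add all the working nodes in the heterogeneous database system to a thread pool, so that the multiple threads enabled in the thread pool each detect the status of a working node.
The plan allocator is configured to assign the sub-logic execution plans to the respective working nodes in the idle state.
In another exemplary embodiment, the device further includes an account information verifier and an access authority enforcer.
The account information verifier is configured to receive, through the scheduling node, the account information sent by the query client, and to verify the account information.
The access authority enforcer is configured so that, after the account information passes verification, the scheduling node opens the query client's authority to access the heterogeneous database system.
In another exemplary embodiment, the device further includes an access role controller. The access role controller is configured to look up the account information of the query client in the preset configuration file of the heterogeneous database system and obtain the query authority of the query client for the heterogeneous databases, so that data queries on the heterogeneous database system are executed according to the query authority.
It should be noted that the device provided in the above embodiments and the method provided in the above embodiments belong to the same concept. The specific manner in which each module performs its operations has been described in detail in the method embodiments and is not repeated here.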
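In an exemplary embodiment, this application further provides an electronic device, which includes:
a processor; and
a memory storing computer-readable instructions which, when executed by the processor, implement the method for realizing interactive data query between heterogeneous databases as described above.
Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment. The electronic device may be specifically implemented as the query server 200 in the implementation environment shown in Fig. 1.
It should be noted that this electronic device is only an example adapted to this application and must not be regarded as limiting the scope of use of this application in any way. Nor can the electronic device be interpreted as needing to depend on, or necessarily having, one or more components of the exemplary electronic device shown in Fig. 8.
The hardware structure of the electronic device may vary greatly with configuration or performance. As shown in Fig. 8, the electronic device includes a power supply 610, an interface 630, at least one memory 650, and at least one central processing unit (CPU, Central Processing Units) 670.
The power supply 610 provides the working voltage for each hardware device on the electronic device.
The interface 630 includes at least one wired or wireless network interface 631, at least one serial-to-parallel conversion interface 633, at least one input/output interface 635, at least one USB interface 637, and so on, for communicating with external devices.
The memory 650, as the carrier for resource storage, may be a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like. The resources stored on it include an operating system 651, application programs 653, data 655, and so on, and the storage may be transient or permanent. The operating system 651 manages and controls the hardware devices and the application programs 653 on the electronic device so that the central processing unit 670 can compute and process the massive data 655; it may be Windows Server™, Mac OS X™, Unix™, Linux™, or the like. An application program 653 is a computer program that completes at least one specific task on top of the operating system 651; it may include at least one module (not shown in Fig. 8), and each module may contain a series of computer-readable instructions for the electronic device. The data 655 may be interface metadata stored on a disk, and so on.
The central processing unit 670 may include one or more processors and is configured to communicate with the memory 650 through a bus, for computing and processing the massive data 655 in the memory 650.
As described in detail above, an electronic device to which this application applies completes the method for realizing interactive data query between heterogeneous databases as described above by having the central processing unit 670 read a series of computer-readable instructions stored in the memory 650.
In addition, this application can equally be implemented by hardware circuits or by hardware circuits combined with software instructions. Therefore, implementation of this application is not limited to any specific hardware circuit, software, or combination of the two.
In an exemplary embodiment, this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method for realizing interactive data query between heterogeneous databases as described above.
It should be understood that this application is not limited to the precise structure described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of this application is limited only by the appended claims.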

Claims (28)

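  1. A method for realizing interactive data query between heterogeneous databases, comprising:
    receiving, by a heterogeneous database system through a configured scheduling node, a data query request initiated by a query client, the heterogeneous database system being a collection of several heterogeneous databases;
    parsing, through the scheduling node, the query statement in the data query request into a logical execution plan, and performing distributed processing on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases, the logical execution plan being general execution logic for executing data queries on the heterogeneous databases;
    detecting the status of the working nodes configured in the heterogeneous database system, and correspondingly assigning the sub-logic execution plans to the working nodes in the idle state, so that the working nodes execute data queries on the heterogeneous databases according to the assigned sub-logic execution plans; and
    after the working nodes return the result data of the data queries to the scheduling node for aggregation, returning the aggregated result data to the query client through the scheduling node.
  2. The method according to claim 1, wherein the query statement is a standard ANSI SQL query statement, and the parsing, through the scheduling node, of the query statement in the data query request into a logical execution plan comprises:
    performing semantic analysis on the standard ANSI SQL query statement;
    after the semantic analysis, obtaining an initial logical execution plan by performing logical execution plan analysis on the standard ANSI SQL query statement; and
    obtaining the logical execution plan by optimizing the initial logical execution plan.
  3. The method according to claim 2, wherein the obtaining of the logical execution plan by optimizing the initial logical execution plan comprises:
    for each logical clause in the initial logical execution plan, obtaining a locally optimized logical clause by rewriting equivalent predicates and simplifying specified conditions;
    eliminating outer joins and nested joins between the locally optimized logical clauses to obtain an association-optimized initial logical execution plan; and
    obtaining the logical execution plan by performing semantic optimization on the association-optimized initial logical execution plan.
  4. The method according to any one of claims 1 to 3, wherein the detecting of the status of the working nodes configured in the heterogeneous database system and the corresponding assigning of the sub-logic execution plans to the working nodes in the idle state comprise:
    adding all the working nodes in the heterogeneous database system to a thread pool, so that the multiple threads enabled in the thread pool each detect the status of a working node; and
    assigning the sub-logic execution plans to the respective working nodes in the idle state.
  5. The method according to any one of claims 1 to 3, wherein the returning of the aggregated result data to the query client through the scheduling node, after the working nodes return the result data of the data queries to the scheduling node for aggregation, comprises:
    aggregating, by the scheduling node, the result data in the order in which the result data is returned by the working nodes, and returning the aggregated result data to the query client along the original path of the data query request.
  6. The method according to any one of claims 1 to 3, wherein, before the heterogeneous database system receives, through the configured scheduling node, the data query request initiated by the query client, the method further comprises:
    receiving, through the scheduling node, account information sent by the query client, and verifying the account information; and
    after the account information passes the verification, opening, by the scheduling node, the query client's authority to access the heterogeneous database system.
  7. The method according to claim 6, wherein, after the heterogeneous database system receives, through the configured scheduling node, the data query request initiated by the query client, the method further comprises:
    looking up the account information of the query client in a preset configuration file of the heterogeneous database system to obtain the query authority of the query client for the heterogeneous databases, so as to execute data queries on the heterogeneous database system according to the query authority.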
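  8. A device for realizing interactive data query between heterogeneous databases, comprising:
    a query request receiver, configured to control a heterogeneous database system to receive, through a configured scheduling node, a data query request initiated by a query client, the heterogeneous database system being a collection of several heterogeneous databases;
    a logical execution plan converter, configured to parse, through the scheduling node, the query statement in the data query request into a logical execution plan, and to perform distributed processing on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases, the logical execution plan being general execution logic for executing data queries on the heterogeneous databases;
    a logical execution plan executor, configured to detect the status of the working nodes configured in the heterogeneous database system and correspondingly assign the sub-logic execution plans to the working nodes in the idle state, so that the working nodes execute data queries on the target heterogeneous databases according to the assigned sub-logic execution plans; and
    a query result obtainer, configured to, after the working nodes return the result data of the data queries to the scheduling node for aggregation, return the aggregated result data to the query client through the scheduling node.
  9. The device according to claim 8, wherein the logical execution plan converter comprises a semantic analyzer, an initial plan obtainer, and an initial plan optimizer;
    the semantic analyzer is configured to perform semantic analysis on a standard ANSI SQL query statement;
    the initial plan obtainer is configured to perform logical execution plan analysis on the standard ANSI SQL query statement output by the semantic analyzer to obtain an initial logical execution plan; and
    the initial plan optimizer is configured to obtain the logical execution plan by optimizing the initial logical execution plan.
  10. The device according to claim 9, wherein the initial plan optimizer comprises a local sub-optimizer, an associative sub-optimizer, and a semantic sub-optimizer;
    the local sub-optimizer is configured to obtain, for each logical clause in the initial logical execution plan, a locally optimized logical clause by rewriting equivalent predicates and simplifying specified conditions;
    the associative sub-optimizer is configured to eliminate outer joins and nested joins between the locally optimized logical clauses to obtain an association-optimized initial logical execution plan; and
    the semantic sub-optimizer is configured to obtain the logical execution plan by performing semantic optimization on the association-optimized initial logical execution plan.
  11. The device according to any one of claims 8 to 10, wherein the logical execution plan executor comprises a multi-thread detector and a plan allocator;
    the multi-thread detector is configured to add all the working nodes in the heterogeneous database system to a thread pool, so that the multiple threads enabled in the thread pool each detect the status of a working node; and
    the plan allocator is configured to assign the sub-logic execution plans to the respective working nodes in the idle state.
  12. The device according to any one of claims 8 to 10, wherein the query result obtainer is configured such that:
    the scheduling node aggregates the result data in the order in which the result data is returned by the working nodes, and returns the aggregated result data to the query client along the original path of the data query request.
  13. The device according to any one of claims 8 to 10, further comprising:
    an account information verifier, configured to receive, through the scheduling node, account information sent by the query client, and to verify the account information;
    an access authority enforcer, configured so that, after the account information passes verification, the scheduling node opens the query client's authority to access the heterogeneous database system.
    a parameter updater, configured to update a target parameter matrix by minimizing an input sentence classification deviation.
  14. The device according to claim 13, further comprising:
    a role controller, configured to look up the account information of the query client in a preset configuration file of the heterogeneous database system to obtain the query authority of the query client for the heterogeneous databases, so as to execute data queries on the heterogeneous database system according to the query authority.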
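  15. An electronic device, comprising:
    a processor;
    and a memory storing computer-readable instructions, wherein, when the computer-readable instructions are executed by the processor, the processor is configured to implement the following steps:
    receiving, by a heterogeneous database system through a configured scheduling node, a data query request initiated by a query client, the heterogeneous database system being a collection of several heterogeneous databases;
    parsing, through the scheduling node, the query statement in the data query request into a logical execution plan, and performing distributed processing on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases, the logical execution plan being general execution logic for executing data queries on the heterogeneous databases;
    detecting the status of the working nodes configured in the heterogeneous database system, and correspondingly assigning the sub-logic execution plans to the working nodes in the idle state, so that the working nodes execute data queries on the heterogeneous databases according to the assigned sub-logic execution plans; and
    after the working nodes return the result data of the data queries to the scheduling node for aggregation, returning the aggregated result data to the query client through the scheduling node.
  16. The electronic device according to claim 15, wherein the query statement is a standard ANSI SQL query statement, and, for the parsing, through the scheduling node, of the query statement in the data query request into a logical execution plan, the processor is configured to implement the following steps:
    performing semantic analysis on the standard ANSI SQL query statement;
    after the semantic analysis, obtaining an initial logical execution plan by performing logical execution plan analysis on the standard ANSI SQL query statement; and
    obtaining the logical execution plan by optimizing the initial logical execution plan.
  17. The electronic device according to claim 16, wherein, for the obtaining of the logical execution plan by optimizing the initial logical execution plan, the processor is configured to implement the following steps:
    for each logical clause in the initial logical execution plan, obtaining a locally optimized logical clause by rewriting equivalent predicates and simplifying specified conditions;
    eliminating outer joins and nested joins between the locally optimized logical clauses to obtain an association-optimized initial logical execution plan; and
    obtaining the logical execution plan by performing semantic optimization on the association-optimized initial logical execution plan.
  18. The electronic device according to any one of claims 15 to 17, wherein, for the detecting of the status of the working nodes configured in the heterogeneous database system and the corresponding assigning of the sub-logic execution plans to the working nodes in the idle state, the processor is configured to implement the following steps:
    adding all the working nodes in the heterogeneous database system to a thread pool, so that the multiple threads enabled in the thread pool each detect the status of a working node; and
    assigning the sub-logic execution plans to the respective working nodes in the idle state.
  19. The electronic device according to any one of claims 15 to 17, wherein, for the returning of the aggregated result data to the query client through the scheduling node after the working nodes return the result data of the data queries to the scheduling node for aggregation, the processor is configured to implement the following steps:
    aggregating, by the scheduling node, the result data in the order in which the result data is returned by the working nodes, and returning the aggregated result data to the query client along the original path of the data query request.
  20. The electronic device according to any one of claims 15 to 17, wherein, before the heterogeneous database system receives, through the configured scheduling node, the data query request initiated by the query client, the processor is further configured to implement the following steps:
    receiving, through the scheduling node, account information sent by the query client, and verifying the account information; and
    after the account information passes the verification, opening, by the scheduling node, the query client's authority to access the heterogeneous database system.
  21. The electronic device according to claim 20, wherein, after the heterogeneous database system receives, through the configured scheduling node, the data query request initiated by the query client, the processor is further configured to implement the following steps:
    looking up the account information of the query client in a preset configuration file of the heterogeneous database system to obtain the query authority of the query client for the heterogeneous databases, so as to execute data queries on the heterogeneous database system according to the query authority.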
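  22. A computer-readable storage medium storing a computer program, wherein, when the computer program is executed by a processor, the processor is configured to implement the following steps:
    receiving, by a heterogeneous database system through a configured scheduling node, a data query request initiated by a query client, the heterogeneous database system being a collection of several heterogeneous databases;
    parsing, through the scheduling node, the query statement in the data query request into a logical execution plan, and performing distributed processing on the logical execution plan to obtain several sub-logic execution plans oriented to the heterogeneous databases, the logical execution plan being general execution logic for executing data queries on the heterogeneous databases;
    detecting the status of the working nodes configured in the heterogeneous database system, and correspondingly assigning the sub-logic execution plans to the working nodes in the idle state, so that the working nodes execute data queries on the heterogeneous databases according to the assigned sub-logic execution plans; and
    after the working nodes return the result data of the data queries to the scheduling node for aggregation, returning the aggregated result data to the query client through the scheduling node.
  23. The computer-readable storage medium according to claim 22, wherein the query statement is a standard ANSI SQL query statement, and, for the parsing, through the scheduling node, of the query statement in the data query request into a logical execution plan, the processor is configured to implement the following steps:
    performing semantic analysis on the standard ANSI SQL query statement;
    after the semantic analysis, obtaining an initial logical execution plan by performing logical execution plan analysis on the standard ANSI SQL query statement; and
    obtaining the logical execution plan by optimizing the initial logical execution plan.
  24. The computer-readable storage medium according to claim 23, wherein, for the obtaining of the logical execution plan by optimizing the initial logical execution plan, the processor is configured to implement the following steps:
    for each logical clause in the initial logical execution plan, obtaining a locally optimized logical clause by rewriting equivalent predicates and simplifying specified conditions;
    eliminating outer joins and nested joins between the locally optimized logical clauses to obtain an association-optimized initial logical execution plan; and
    obtaining the logical execution plan by performing semantic optimization on the association-optimized initial logical execution plan.
  25. The computer-readable storage medium according to any one of claims 22 to 24, wherein, for the detecting of the status of the working nodes configured in the heterogeneous database system and the corresponding assigning of the sub-logic execution plans to the working nodes in the idle state, the processor is configured to implement the following steps:
    adding all the working nodes in the heterogeneous database system to a thread pool, so that the multiple threads enabled in the thread pool each detect the status of a working node; and
    assigning the sub-logic execution plans to the respective working nodes in the idle state.
  26. The computer-readable storage medium according to any one of claims 22 to 24, wherein, for the returning of the aggregated result data to the query client through the scheduling node after the working nodes return the result data of the data queries to the scheduling node for aggregation, the processor is configured to implement the following steps:
    aggregating, by the scheduling node, the result data in the order in which the result data is returned by the working nodes, and returning the aggregated result data to the query client along the original path of the data query request.
  27. The computer-readable storage medium according to any one of claims 22 to 24, wherein, before the heterogeneous database system receives, through the configured scheduling node, the data query request initiated by the query client, the processor is further configured to implement the following steps:
    receiving, through the scheduling node, account information sent by the query client, and verifying the account information; and
    after the account information passes the verification, opening, by the scheduling node, the query client's authority to access the heterogeneous database system.
  28. The computer-readable storage medium according to claim 27, wherein, after the heterogeneous database system receives, through the configured scheduling node, the data query request initiated by the query client, the processor is further configured to implement the following steps:
    looking up the account information of the query client in a preset configuration file of the heterogeneous database system to obtain the query authority of the query client for the heterogeneous databases, so as to execute data queries on the heterogeneous database system according to the query authority.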
PCT/CN2019/118024 2019-08-16 2019-11-13 Method, device, electronic device, and storage medium for realizing interactive data query between heterogeneous databases WO2021031407A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910759791.9A CN110659327A (zh) 2019-08-16 2019-08-16 Method and related device for realizing interactive data query between heterogeneous databases
CN201910759791.9 2019-08-16

Publications (1)

Publication Number Publication Date
WO2021031407A1 true WO2021031407A1 (zh) 2021-02-25

Family

ID=69037680

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118024 WO2021031407A1 (zh) 2019-08-16 2019-11-13 Method, device, electronic device, and storage medium for realizing interactive data query between heterogeneous databases

Country Status (2)

Country Link
CN (1) CN110659327A (zh)
WO (1) WO2021031407A1 (zh)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111625558A (zh) * 2020-05-07 2020-09-04 苏州浪潮智能科技有限公司 一种服务器架构及其数据库查询方法和存储介质
WO2021254288A1 (en) * 2020-06-14 2021-12-23 Wenfei Fan Querying shared data with security heterogeneity
CN111737284A (zh) * 2020-08-18 2020-10-02 北京升鑫网络科技有限公司 一种基于管道的数据库查询分析方法、装置及计算设备
CN112685142A (zh) * 2020-12-30 2021-04-20 北京明朝万达科技股份有限公司 分布式数据处理系统
CN113093681A (zh) * 2021-04-08 2021-07-09 四川远星橡胶有限责任公司 一种基于超融合和服务器虚拟化的控制系统及方法
CN113918996B (zh) * 2021-11-24 2024-03-26 企查查科技股份有限公司 分布式数据处理方法、装置、计算机设备和存储介质
CN116263776A (zh) * 2021-12-15 2023-06-16 华为技术有限公司 一种针对数据库的数据访问方法、装置及设备
CN114756577A (zh) * 2022-03-25 2022-07-15 北京友友天宇系统技术有限公司 多源异构数据的处理方法、计算机设备及存储介质
CN115033595B (zh) * 2022-08-10 2022-11-22 杭州悦数科技有限公司 基于超级节点的查询语句处理方法、系统、装置和介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105912624A (zh) * 2016-04-07 2016-08-31 北京中安智达科技有限公司 分布式部署的异构数据库的查询方法
CN107329814A (zh) * 2017-06-16 2017-11-07 电子科技大学 一种基于rdma的分布式内存数据库查询引擎系统
CN109656968A (zh) * 2018-11-15 2019-04-19 中国建设银行股份有限公司 分布式环境下的数据查询方法、装置及存储介质

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7979422B2 (en) * 2008-07-30 2011-07-12 Oracle International Corp. Hybrid optimization strategies in automatic SQL tuning
CN101694665B (zh) * 2009-10-27 2012-10-03 中兴通讯股份有限公司 一种异构数据源数据查询方法及装置
US9317414B2 (en) * 2014-03-04 2016-04-19 International Business Machines Corporation Regression testing of SQL execution plans for SQL statements
CN106445991B (zh) * 2016-06-30 2019-03-08 中国石化销售有限公司 加气站scada系统海量数据处理方法
CN106844545A (zh) * 2016-12-30 2017-06-13 江苏瑞中数据股份有限公司 一种基于标准sql的双引擎数据库系统的实现方法
CN107315790B (zh) * 2017-06-14 2021-07-06 腾讯科技(深圳)有限公司 一种非相关子查询的优化方法和装置
CN108052635A (zh) * 2017-12-20 2018-05-18 江苏瑞中数据股份有限公司 一种异构数据源统一联合查询方法
CN109284282A (zh) * 2018-10-22 2019-01-29 北京极数云舟科技有限公司 一种基于MySQL数据库运维方法和系统
CN110059103B (zh) * 2019-04-28 2023-06-06 南京大学 一种跨平台统一的大数据sql查询方法

Also Published As

Publication number Publication date
CN110659327A (zh) 2020-01-07

Similar Documents

Publication Publication Date Title
WO2021031407A1 (zh) Method, device, electronic device, and storage medium for realizing interactive data query between heterogeneous databases
US11100103B2 (en) Data sharing in multi-tenant database systems
US8572575B2 (en) Debugging a map reduce application on a cluster
US11003664B2 (en) Efficient hybrid parallelization for in-memory scans
US8903841B2 (en) System and method of massively parallel data processing
Lee et al. Ysmart: Yet another sql-to-mapreduce translator
US9576000B2 (en) Adaptive fragment assignment for processing file data in a database
US7092954B1 (en) Optimizing an equi-join operation using a bitmap index structure
US9529881B2 (en) Difference determination in a database environment
US11914591B2 (en) Sharing materialized views in multiple tenant database systems
US7917501B2 (en) Optimization of abstract rule processing
US6470331B1 (en) Very large table reduction in parallel processing database systems
Chen et al. Grasper: A high performance distributed system for OLAP on property graphs
Adaikkalavan et al. Multilevel secure data stream processing
US10474653B2 (en) Flexible in-memory column store placement
Yuan et al. VDB-MR: MapReduce-based distributed data integration using virtual database
US20220318314A1 (en) System and method of performing a query processing in a database system using distributed in-memory technique
US20140379691A1 (en) Database query processing with reduce function configuration
US9852172B2 (en) Facilitating handling of crashes in concurrent execution environments of server systems while processing user queries for data retrieval
US11500874B2 (en) Systems and methods for linking metric data to resources
Gowraj et al. S2mart: smart sql to map-reduce translators
Lin et al. Anser: Adaptive Information Sharing Framework of AnalyticDB
Rong et al. Scaling a Declarative Cluster Manager Architecture with Query Optimization Techniques
US20230281055A1 (en) Allocation of worker threads in a parallelization framework with result streaming
Tang et al. Online application of science and technology program oriented distributed heterogeneous data integration

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19942609

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19942609

Country of ref document: EP

Kind code of ref document: A1