CN112202859B - Data transmission method and database system - Google Patents

Data transmission method and database system

Info

Publication number
CN112202859B
Authority
CN
China
Prior art keywords
computing
instance
data
instruction set
computing instances
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011001547.5A
Other languages
Chinese (zh)
Other versions
CN112202859A (en)
Inventor
王鸿翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingbase Information Technologies Co Ltd
Original Assignee
Beijing Kingbase Information Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingbase Information Technologies Co Ltd
Priority to CN202011001547.5A
Publication of CN112202859A
Application granted
Publication of CN112202859B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Abstract

The present disclosure provides a data transmission method and a database system. The method comprises the following steps: a main instance in a database system sends an instruction set to each of a plurality of computing instances; each computing instance executes the instruction set to obtain an execution result of the instruction set corresponding to that computing instance; and the plurality of computing instances that execute the instruction set send their respective execution results to the same receiving port of the main instance. According to the method of the present disclosure, when the database performs a query task, the main instance receives, through the same receiving port, the execution results respectively sent by the plurality of computing instances that execute the instruction set. This ensures the correctness of the query, reduces the number of receiving ports occupied by data transmission, improves the performance of the database system, meets the requirements of mass data storage, analysis, and computation, and enables the deployment of larger-scale database systems.

Description

Data transmission method and database system
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data transmission method and a database system.
Background
A shared-nothing distributed database system comprises a main node and a plurality of distributed computing nodes (hosts). A main instance is deployed on the main node, and a plurality of computing instances may be deployed on each computing node. Data is transmitted between the main instance and each computing instance over network connections based on the Transmission Control Protocol (TCP). Before data is transmitted, a TCP network connection must be established between the main node and each computing instance in the database system, and a TCP port must be allocated to the sender and to the receiver of each network connection.
A large-scale distributed database requires a large number of computing nodes, and multiple computing instances may be deployed on each computing node. When query tasks run concurrently, or even with high concurrency, data transmission between the computing instances and the main instance occupies a large number of TCP ports, while the number of TCP ports is limited.
Thus, large-scale database systems cannot be deployed based on the TCP protocol.
Disclosure of Invention
To solve the above technical problems or at least partially solve the above technical problems, the present disclosure provides a data transmission method and a database system.
In a first aspect, the present disclosure provides a data transmission method applied to a database system, where the database system includes a main instance and a plurality of computing instances, the method comprising:
the main instance respectively sends instruction sets to the plurality of computing instances;
each computing instance executes the instruction set to obtain an execution result of the instruction set corresponding to the computing instance;
and the plurality of computing instances executing the instruction set send the execution results of the instruction sets respectively corresponding to the computing instances to the same receiving port of the main instance.
Optionally, the method further comprises:
each computing instance receives data sent by other computing instances through the same receiving port, and sends data to the other computing instances through the same sending port.
Optionally, the instruction set includes a plurality of sub-instruction sets;
each computing instance executing the instruction set to obtain an execution result of the instruction set corresponding to the computing instance includes:
each computing instance starts a plurality of executors to respectively execute the plurality of sub-instruction sets in the instruction set, and obtains the execution result of the instruction set corresponding to the computing instance; each executor receives data sent by other executors through the same receiving port, and sends data to the other executors through the same sending port.
Optionally, each computing instance receiving data sent by other computing instances through the same receiving port, and sending data to the other computing instances through the same sending port, includes:
each computing instance receives data sent by other computing instances based on RUDP, and sends data to the other computing instances based on RUDP.
Optionally, each computing instance receiving data sent by other computing instances through the same receiving port, and sending data to the other computing instances through the same sending port, includes:
each computing instance receives data sent by other computing instances based on UDP, and sends data to the other computing instances based on UDP.
Optionally, each executor receiving data sent by other executors through the same receiving port, and sending data to the other executors through the same sending port, includes:
each executor receives data sent by the other executors based on RUDP, and sends data to the other executors based on RUDP.
Optionally, each executor receiving data sent by other executors through the same receiving port, and sending data to the other executors through the same sending port, includes:
each executor receives data sent by the other executors based on UDP, and sends data to the other executors based on UDP.
Optionally, the plurality of computing instances that execute the instruction set sending their respective execution results to the same receiving port of the main instance includes:
the plurality of computing instances that execute the instruction set send their respective execution results to the main instance based on the Reliable User Datagram Protocol (RUDP).
Optionally, the plurality of computing instances that execute the instruction set sending their respective execution results to the same receiving port of the main instance includes:
the plurality of computing instances that execute the instruction set send their respective execution results to the main instance based on the User Datagram Protocol (UDP).
In a second aspect, the present disclosure provides a database system comprising: a main instance and a plurality of computing instances;
the database system is configured to perform the data transmission method according to the first aspect.
Compared with the prior art, the technical solution provided by the embodiments of the present disclosure has the following advantages: a main node in a database system sends instruction sets to a plurality of computing nodes; after receiving the instruction set sent by the main node, each computing node starts a computing instance to execute the instruction set and obtain the execution result of the instruction set corresponding to that computing instance; and the plurality of computing instances that execute the instruction set send their respective execution results to the same receiving port of the main node. When the database performs a query task, the main node receives, through the same receiving port, the execution results respectively sent by the plurality of computing instances. This ensures the correctness of the query, reduces the number of receiving ports occupied by data transmission, improves the performance of the database system, meets the requirements of mass data storage, analysis, and computation, and enables the deployment of larger-scale database systems.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the solutions in the prior art, the drawings that are required for the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1A is a schematic diagram of a database system interacting with a client;
FIG. 1B is a schematic diagram of a database system;
FIG. 2 is an interaction schematic diagram of a data transmission method according to an embodiment of the disclosure;
FIG. 3 is an interaction schematic diagram of another data transmission method according to an embodiment of the disclosure;
FIG. 4 is an interaction schematic diagram of still another data transmission method according to an embodiment of the disclosure;
FIG. 5 is a flow diagram of a database system performing a query task;
FIG. 6 is a flow chart of another database system performing a query task.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, a further description of aspects of the present disclosure will be provided below. It should be noted that, without conflict, the embodiments of the present disclosure and features in the embodiments may be combined with each other.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced otherwise than as described herein; it will be apparent that the embodiments in the specification are only some, but not all, embodiments of the disclosure.
First, the terms involved in the present invention will be explained:
Distributed database: a logically unified database formed by connecting physically dispersed database units through a computer network. Each connected database unit is referred to as a computing node, and a distributed database includes at least two computing nodes. The computing nodes may be physical computing nodes located in different places, or logical computing nodes located in the same physical database.
Join: one of the most important query operations in a database, which combines the query results of two or more tables.
Grouping (group by): grouping a result set by one or more columns so that the data can be classified more precisely.
FIG. 1A is a schematic diagram of the interaction between a database system and clients. As shown in FIG. 1A, at least one client is connected to a shared-nothing distributed database system. The shared-nothing distributed database may include a main node and a plurality of computing nodes, and the client is connected to the main node. FIG. 1A shows three computing nodes, namely computing node 1, computing node 2, and computing node 3; it is understood that the number of computing nodes in FIG. 1A is only an example and does not limit the present disclosure. The main node is a server or a terminal device, and a main instance is deployed on the main node. Each computing node is a server or a terminal device, and one or more computing instances may be deployed on each computing node.
The main instance stores a system table, which includes, but is not limited to: node information of the computing nodes, distribution information of the computing instances on the computing nodes, metadata of the data tables, and the distribution of the data of each data table across the computing instances. In the database system, the main instance does not itself store the data of the data tables; instead, the data is stored in the data tables of the respective computing instances according to a distribution rule.
If the database needs to perform tasks such as query, the client sends a query request to the main instance of the main node, and optionally, the query request may be a structured query language (Structured Query Language, abbreviated as SQL) statement.
The main instance determines an instruction set according to the query request and the distribution of data across the computing instances. The instruction set is the set of working steps, determined from the query request, that each computing instance needs to complete. After determining the instruction set, the main instance sends it to each computing instance, and the computing instances execute it in parallel. Optionally, if the query request is an SQL statement, operations such as lexical, syntactic, and semantic analysis, query rewriting, and query optimization may be performed on the SQL statement to determine the instruction set that each computing instance needs to execute.
The main instance respectively sends instruction sets to a plurality of computing instances, each computing instance performs corresponding database operation according to the received instruction sets, an execution result of the instruction sets is obtained, and the execution result of the instruction sets is sent to the main instance.
The main instance performs operations such as aggregation on the execution results received from the computing instances to obtain the final query result, and sends the final query result to the client, which completes the query task.
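The query flow described above (the main instance sends an instruction set, each computing instance executes it on its local data, and the main instance aggregates the partial results) can be sketched in memory as follows. This is only an illustration under stated assumptions, not the implementation of the disclosure: the names LOCAL_TABLES, execute_instruction_set, and aggregate are hypothetical, and a SUM with GROUP BY stands in for an arbitrary instruction set.

```python
# Hypothetical local data: each computing instance holds a different slice
# of the same logical table, as (group key, value) rows.
LOCAL_TABLES = {
    "instance_1": [("a", 10), ("b", 5)],
    "instance_2": [("a", 7)],
    "instance_3": [("b", 1), ("c", 4)],
}


def execute_instruction_set(rows):
    """Runs on a computing instance: partial SUM(value) GROUP BY key."""
    partial = {}
    for key, value in rows:
        partial[key] = partial.get(key, 0) + value
    return partial


def aggregate(partials):
    """Runs on the main instance: merge the partial sums into one result."""
    final = {}
    for partial in partials:
        for key, total in partial.items():
            final[key] = final.get(key, 0) + total
    return final


# Every instance executes the same instruction set on its own rows.
partials = [execute_instruction_set(rows) for rows in LOCAL_TABLES.values()]
final_result = aggregate(partials)
print(final_result)
```

In a real deployment the partial results would travel over the network to the main instance; here a plain list stands in for that transfer.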
FIG. 1B is a schematic structural diagram of a database system that further illustrates the structure of the distributed database system of FIG. 1A. As shown in FIG. 1B, the main instance stores the distribution rule of each computing instance, which the main instance uses to determine the instruction set when performing a query task. The computing instances store the data tables of the database: computing instance 1 stores table 1 and table 2, computing instance 2 stores table 1 and table 2, and computing instance 3 stores table 1 and table 2. The copies of table 1 stored in the respective computing instances may have the same structure but hold different table data, and likewise for table 2. Each computing instance can execute the instruction set sent by the main instance.
An application scenario of the present disclosure is described below in conjunction with the database system architecture described above.
With the rapid development of the Internet and the Internet of Things, data volume is growing explosively, and more and more applications use shared-nothing distributed databases for storage and computation. At present, shared-nothing distributed databases use massively parallel processing (MPP): when the database is queried, the computing instances perform the query in parallel, and data needs to be transmitted among the computing instances and between the computing instances and the main instance. The shared-nothing distributed database uses the Transmission Control Protocol (TCP) for this data transmission. Before data is transmitted over TCP, a TCP network connection must be established between the main instance and each computing instance and between every two computing instances in the database, and a TCP port number must be allocated to each network connection.
A large-scale distributed database requires a large number of computing nodes, and multiple computing instances may be deployed on each computing node. When query tasks run concurrently, or even with high concurrency, data transmission between the computing instances and the main instance occupies a large number of TCP ports, while the number of TCP ports is limited. Thus, large-scale distributed databases cannot be deployed based on TCP.
With the data transmission method of the present disclosure, when the database system executes a query task, the main instance receives, through the same receiving port, the execution results respectively sent by the plurality of computing instances that execute the instruction set. This ensures the correctness of the query, reduces the number of receiving ports occupied by data transmission, improves the performance of the database system, meets the requirements of mass data storage, analysis, and computation, and enables the deployment of larger-scale database systems.
The following describes the technical scheme of the present disclosure and how the technical scheme of the present disclosure solves the above technical problems in detail with specific embodiments.
FIG. 2 is an interaction schematic diagram of a data transmission method according to an embodiment of the present disclosure. As shown in FIG. 2, the method of this embodiment is performed by a database system that includes a main instance and a plurality of computing instances. FIG. 2 illustrates two computing instances, namely computing instance 1 and computing instance 2; the number of computing instances is not limited in this disclosure. The method of this embodiment is as follows:
S201, the main instance sends an instruction set to each of the plurality of computing instances.
When there is a query task, the main instance generates the instruction set corresponding to each computing instance according to the distribution of data across the computing instances. The main instance may send the instruction sets corresponding to all computing instances to every computing instance, or may send to each computing instance only its own instruction set. The instruction set includes, but is not limited to, the database operations to be executed by the corresponding computing instance, and the database operations include, but are not limited to, redistribution, join, and group by. Redistribution means redistributing the data stored in the computing instances of the database among the computing instances according to a certain rule. For example, a hash value is computed for a certain column of the data (the distribution column), and data with the same or similar hash values is stored in the same computing instance, so that the distributed storage of the database is achieved through hash distribution. The instruction set may also be called an execution plan.
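The hash distribution mentioned above can be illustrated with a short sketch. It assumes one simple rule that the text allows but does not prescribe: the hash of the distribution column, taken modulo the number of computing instances, selects the instance. The function names and the use of SHA-256 as the hash are assumptions made for the example.

```python
import hashlib


def target_instance(distribution_value, num_instances):
    """Map a distribution-column value to a computing instance index.

    A stable hash is used (not Python's built-in hash(), which is salted
    per process for strings) so every node computes the same placement.
    """
    digest = hashlib.sha256(str(distribution_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_instances


def distribute(rows, column, num_instances):
    """Place each row on the instance chosen by its distribution column."""
    placement = {i: [] for i in range(num_instances)}
    for row in rows:
        placement[target_instance(row[column], num_instances)].append(row)
    return placement


cities = ["Beijing", "Shanghai", "Beijing", "Xi'an"]
rows = [{"id": i, "city": city} for i, city in enumerate(cities)]
placement = distribute(rows, "city", 3)
for instance, stored in placement.items():
    print(instance, stored)
```

Rows that share a distribution-column value always land on the same instance, which is what allows joins and groupings on that column to run locally.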
Optionally, before S201, the method may further include: the main instance determines an instruction set according to a query request sent by the client.
S202, each computing instance executes the instruction set to obtain an execution result of the instruction set corresponding to the computing instance.
When, for example, two data tables need to be joined on non-distribution columns, or a table needs to be grouped by a non-distribution column, each computing instance allocates storage resources and computing resources for executing the instruction set, and executes the instructions in the instruction set to obtain the execution result of the instruction set corresponding to the computing instance. Optionally, the distribution rule may be that data with the same hash value is distributed to the same computing instance, or that the hash values are divided into segment intervals and data whose hash values fall in the same segment interval is distributed to the same computing instance. The method of storing data according to hash values is not limited in the present disclosure.
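The second distribution rule mentioned above, assigning hash values to computing instances by segment interval, might look like the following sketch. The interval boundaries in SEGMENT_BOUNDS, the choice of CRC32 as the hash, and the function names are all assumptions for illustration; the disclosure does not fix any of them.

```python
import bisect
import zlib

# Hypothetical exclusive upper bounds of the hash segments owned by
# computing instances 0, 1, 2, and 3 (CRC32 values lie in [0, 2**32)).
SEGMENT_BOUNDS = [2**30, 2**31, 3 * 2**30, 2**32]


def instance_for_hash(hash_value):
    """Return the index of the segment interval that contains hash_value."""
    return bisect.bisect_right(SEGMENT_BOUNDS, hash_value)


def instance_for_value(value):
    """Hash a column value and look up the owning computing instance."""
    return instance_for_hash(zlib.crc32(str(value).encode("utf-8")))


for value in ("order-1", "order-2", "order-3"):
    print(value, "->", instance_for_value(value))
```

Compared with a modulo rule, interval boundaries can be adjusted to rebalance data without changing the hash function.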
S203, the plurality of computing instances that execute the instruction set send their respective execution results to the same receiving port of the main instance.
The main instance uses the same receiving port to receive the execution results of the instruction sets corresponding to the plurality of computing instances. The receiving port of the main instance has a unique port identification in the database system, and the main instance receives, through the port identified by this port identification, the execution results respectively sent by the plurality of computing instances that execute the instruction set.
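When many computing instances send to one receiving port, the receiver needs some way to tell the incoming results apart. The disclosure does not describe a message format, so the sketch below is purely an assumption: each datagram carries a small hypothetical header holding a query identifier and the identifier of the sending instance, and the main instance groups payloads by that header.

```python
import struct
from collections import defaultdict

# Hypothetical header layout: query id and sending instance id,
# both unsigned 32-bit integers in network byte order.
HEADER = struct.Struct("!II")


def pack_result(query_id, instance_id, payload):
    """Prefix an execution result with the identifying header."""
    return HEADER.pack(query_id, instance_id) + payload


def unpack_result(datagram):
    """Split a datagram into (query id, instance id, payload)."""
    query_id, instance_id = HEADER.unpack_from(datagram)
    return query_id, instance_id, datagram[HEADER.size:]


def demultiplex(datagrams):
    """Group payloads that arrived on one port by query and by sender."""
    by_query = defaultdict(dict)
    for datagram in datagrams:
        query_id, instance_id, payload = unpack_result(datagram)
        by_query[query_id][instance_id] = payload
    return by_query


datagrams = [
    pack_result(7, 1, b"rows-from-1"),
    pack_result(7, 2, b"rows-from-2"),
    pack_result(8, 1, b"other-query"),
]
results = demultiplex(datagrams)
print(dict(results))
```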
In one possible implementation, the plurality of computing instances that execute the instruction set send their respective execution results to the main instance based on the User Datagram Protocol (UDP).
Optionally, mechanisms such as acknowledgement of sent data, retransmission upon transmission failure, and congestion control may be added on top of UDP to ensure the reliability of data transmission.
Since the data transmission based on the UDP does not need to establish a long connection, one receiving port can receive the data sent by a plurality of ports, and therefore, based on the UDP, the main instance can use the same receiving port to receive the execution results sent by a plurality of computing instances.
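The property relied on here, that a single UDP receiving port can accept datagrams from many sending ports, can be demonstrated on the loopback interface with the standard socket module. The function name gather_over_one_port and the payloads are hypothetical; the sketch has no acknowledgement or retransmission, so it shows only the port-sharing behaviour, not reliable delivery.

```python
import socket


def gather_over_one_port(payloads, timeout=2.0):
    """Send each payload from its own UDP socket to one receiving socket.

    Returns a list of (source port, data) pairs as seen by the receiver.
    """
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    receiver.settimeout(timeout)
    address = receiver.getsockname()
    senders = []
    received = []
    try:
        # Each sender stands in for one computing instance.
        for payload in payloads:
            sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            senders.append(sender)
            sender.sendto(payload, address)
        # The single receiving socket reads datagrams from all senders.
        for _ in payloads:
            data, source = receiver.recvfrom(65535)
            received.append((source[1], data))
    finally:
        receiver.close()
        for sender in senders:
            sender.close()
    return received


results = gather_over_one_port([b"result-1", b"result-2", b"result-3"])
for source_port, data in results:
    print(source_port, data)
```

With TCP, the receiver would instead hold one accepted connection per sender; with UDP, recvfrom reports the source address of each datagram on the one bound socket.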
Further, a UDP-based network connection may be established in advance.
In another possible implementation, the plurality of computing instances that execute the instruction set send their respective execution results to the main instance based on the Reliable User Datagram Protocol (RUDP).
RUDP builds on UDP and adds protocol mechanisms such as data retransmission, thereby ensuring the correctness of the transmitted data and achieving reliable data transmission. For example, in RUDP-based data transmission, the sending end controls the amount of data in flight through a sliding window, which supports retransmission on failure, message acknowledgement, and congestion control. Each data packet sent by the sending end carries an incrementing sequence number. When the receiving end receives a data packet, it sends an acknowledgement to the sending end, and it orders and reassembles the data according to the sequence numbers of the received packets to obtain the correct data. If the sending end does not receive an acknowledgement from the receiving end within a preset time period after sending a data packet, it retransmits the data. This ensures the correctness of the transmitted data, reduces the number of receiving ports occupied by data transmission, and achieves reliable data transmission.
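The sequence-number, acknowledgement, and retransmission behaviour described above can be simulated without a network. The sketch below is a simplified model under stated assumptions, not an RUDP implementation: the classes and parameters are hypothetical, packet loss is injected deterministically through drop_first_attempt, each loop iteration stands for one timeout period, the window size is fixed, and congestion control is omitted.

```python
class Receiver:
    """Buffers out-of-order packets and delivers payloads in sequence order."""

    def __init__(self):
        self.expected = 0    # next sequence number to deliver
        self.buffer = {}     # packets that arrived ahead of order
        self.delivered = []  # payloads in their original order

    def on_packet(self, seq, payload):
        if seq >= self.expected:
            self.buffer[seq] = payload
        while self.expected in self.buffer:
            self.delivered.append(self.buffer.pop(self.expected))
            self.expected += 1
        return seq  # acknowledgement for this sequence number


def send_reliably(chunks, receiver, drop_first_attempt=(), window=3,
                  max_rounds=100):
    """Send chunks through a simulated lossy channel with a sliding window.

    Returns the total number of transmissions, including retransmissions.
    """
    acked = set()
    attempts = {}
    base = 0  # lowest sequence number not yet acknowledged
    transmissions = 0
    for _ in range(max_rounds):
        if base >= len(chunks):
            return transmissions
        for seq in range(base, min(base + window, len(chunks))):
            if seq in acked:
                continue
            attempts[seq] = attempts.get(seq, 0) + 1
            transmissions += 1
            lost = seq in drop_first_attempt and attempts[seq] == 1
            if not lost:
                acked.add(receiver.on_packet(seq, chunks[seq]))
        # Packets still unacknowledged here are treated as timed out and
        # are sent again in the next round.
        while base in acked:
            base += 1
    raise RuntimeError("gave up after max_rounds timeout periods")


chunks = [b"a", b"b", b"c", b"d", b"e"]
receiver = Receiver()
transmissions = send_reliably(chunks, receiver, drop_first_attempt={1, 3})
print(receiver.delivered, transmissions)
```

In the example run, packets 1 and 3 are lost once each, so two extra transmissions are needed, and the receiver still delivers the payloads in their original order.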
Further, a network connection based on the RUDP may be established in advance.
Optionally, before S203, the main instance allocates a receiving port, and each computing instance allocates a sending port.
In this embodiment, the main instance in the database system sends an instruction set to each of a plurality of computing instances, each computing instance executes the instruction set to obtain the execution result of the instruction set corresponding to that computing instance, and the plurality of computing instances that execute the instruction set send their respective execution results to the same receiving port of the main instance. When the database performs a query task, the main instance receives, through the same receiving port, the execution results respectively sent by the plurality of computing instances. This ensures the correctness of the query, reduces the number of receiving ports occupied by data transmission, improves the performance of the database system, meets the requirements of mass data storage, analysis, and computation, and enables the deployment of larger-scale database systems.
Further, on the basis of the foregoing embodiment, the main instance parses the received query task to generate an instruction set that can be executed by each computing instance. The instruction set may or may not include a redistribution operation. If it does, the redistribution process involves data transmission among the computing instances and may proceed as follows: while executing the redistribution operation, a computing instance reads the table to be redistributed that is stored in the current computing instance, hashes the data in the non-distribution column, and redistributes the hashed data to the computing instances according to the distribution rule.
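The redistribution step described above (read the local table, hash the non-distribution column, and send each row to the instance chosen by the distribution rule) can be sketched in memory as follows. The function names, the CRC32 hash, the modulo rule, and the sample rows are assumptions for illustration; lists stand in for the network transfers between computing instances.

```python
import zlib


def target_of(value, num_instances):
    """Distribution rule assumed here: CRC32 of the value modulo instance count."""
    return zlib.crc32(str(value).encode("utf-8")) % num_instances


def redistribute(local_tables, column):
    """Move every row to the instance selected by hashing the given column.

    local_tables[i] is the list of rows currently stored on instance i.
    Returns the new placement and the number of rows that changed instance.
    """
    num_instances = len(local_tables)
    received = [[] for _ in range(num_instances)]
    sent_over_network = 0
    for source, rows in enumerate(local_tables):
        for row in rows:
            target = target_of(row[column], num_instances)
            received[target].append(row)
            if target != source:
                sent_over_network += 1  # row leaves its current instance
    return received, sent_over_network


# Rows were originally distributed by "order_id"; a join or grouping on
# "customer" requires redistributing them by that column.
local_tables = [
    [{"order_id": 1, "customer": "alice"}, {"order_id": 4, "customer": "bob"}],
    [{"order_id": 2, "customer": "bob"}, {"order_id": 5, "customer": "carol"}],
    [{"order_id": 3, "customer": "alice"}],
]
received, moved = redistribute(local_tables, "customer")
for instance, rows in enumerate(received):
    print(instance, rows)
print("rows sent between instances:", moved)
```

Every row that moves is one transfer between two computing instances, which is why the number of ports used per pair of instances matters at scale.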
However, a large-scale distributed database requires a large number of computing nodes, and multiple computing instances may be deployed on each computing node. If the instruction set being executed includes redistribution, the computing instances also perform data migration, that is, they send data to and receive data from one another. If TCP is used for this, data transmission between every two computing instances requires one receiving port and one sending port. The more computing instances the database system has, the more ports are occupied, and even more are occupied in concurrent or highly concurrent scenarios; that is, the database system needs to allocate a large number of port numbers among the computing instances. However, TCP ports are limited. For example, in a Linux operating system only port numbers 1025-65535 can be used, so the system may fail to complete a normal query task. The following further describes, with specific examples, how embodiments of the present disclosure solve the above problem.
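To make the port pressure described above concrete, the following back-of-the-envelope calculation compares the two schemes. It rests on assumptions that go beyond the text: every computing instance exchanges data with every other instance, each concurrent query uses its own set of pairwise connections, and the figures of 1000 instances and 40 concurrent queries are purely illustrative.

```python
# Port range quoted in the text for a Linux system.
USABLE_PORTS = 65535 - 1025 + 1


def ports_per_instance_pairwise(num_instances, concurrent_queries=1):
    """Ports one instance needs when each peer pair uses its own ports.

    One receiving port and one sending port per peer, per concurrent query
    (the per-query multiplication is an assumption of this sketch).
    """
    return 2 * (num_instances - 1) * concurrent_queries


def ports_per_instance_shared():
    """Ports one instance needs with one shared receiving and sending port."""
    return 2


pairwise = ports_per_instance_pairwise(num_instances=1000, concurrent_queries=40)
print("usable ports:", USABLE_PORTS)
print("pairwise scheme:", pairwise, "exceeds limit:", pairwise > USABLE_PORTS)
print("shared-port scheme:", ports_per_instance_shared())
```

Under these assumptions the pairwise scheme needs more ports on a single instance than the quoted range provides, while the shared-port scheme stays constant as the number of instances grows.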
FIG. 3 is an interaction schematic diagram of another data transmission method of a database system according to an embodiment of the present disclosure. FIG. 3 builds on the embodiment shown in FIG. 2. As shown in FIG. 3, S202 includes S202a:
S202a, each computing instance executes the instruction set to obtain an execution result of the instruction set corresponding to the computing instance. While the computing instances execute the instruction set, each computing instance receives data sent by other computing instances through the same receiving port and sends data to the other computing instances through the same sending port.
Each computing instance uses the same sending port to send table data of the database; this port can be used to send database data to other computing instances or to send the execution result of the instruction set to the main instance. Each computing instance uses the same receiving port to receive table data of the database, for example data sent by other computing instances during redistribution, where the other computing instances are the computing instances in the database system other than the computing instance itself.
In one possible implementation, each computing instance receives data sent by other computing instances based on UDP, and sends data to other computing instances based on UDP.
In another possible implementation, each computing instance receives data sent by other computing instances based on the RUDP and sends data to other computing instances based on the RUDP.
In this embodiment, each computing instance receives data sent by other computing instances through the same receiving port and sends data to the other computing instances through the same sending port. This ensures the correctness of the query, reduces the number of receiving and sending ports occupied by data transmission, improves the performance of the database system, meets the requirements of mass data storage, analysis, and computation, and enables the deployment of larger-scale database systems.
On the basis of the above embodiment, in the case that the instruction set includes a redistribution operation, further, if the instruction set involves database operations of a plurality of tables, the instruction set corresponding to each computing instance may be divided into sub-instruction sets by the database operations and the tables of the operations in the instruction set, so that threads corresponding to the number of sub-instruction sets are started in the computing instance, and in each computing instance, the corresponding sub-instruction sets may be executed in parallel by a plurality of threads, so that the execution efficiency of the database system may be improved.
However, in the existing method, data transmission is performed between threads in multiple threads started in a computing example based on TCP, a sending port and a receiving port are required to be allocated for establishing TCP connection between every two threads needing to transmit data, and because the number of TCP ports of the system is limited, the system may not complete a normal query task, so that a large-scale database system cannot be deployed. The following describes further how embodiments of the present disclosure solve the above-described problems with specific examples.
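The difference in port consumption can be made concrete with a small calculation. The Python sketch below is a rough model under assumptions that are not part of the disclosure: every executor thread exchanges data with every other one (the worst case), each pairwise TCP connection is counted as one sending port plus one receiving port as the paragraph above describes, and the function names are hypothetical.

```python
def pairwise_tcp_ports(num_executors: int) -> int:
    """Ports used when every pair of executors has its own TCP connection,
    counting one sending port and one receiving port per connection."""
    connections = num_executors * (num_executors - 1) // 2
    return 2 * connections


def shared_ports(num_executors: int) -> int:
    """Ports used when each executor has one receiving port and one sending port."""
    return 2 * num_executors


if __name__ == "__main__":
    # (computing instances, executors per computing instance)
    for instances, executors_each in [(2, 2), (8, 16), (64, 16)]:
        total = instances * executors_each
        print(instances, executors_each, pairwise_tcp_ports(total), shared_ports(total))
```

Under this model the pairwise count grows quadratically with the number of executors, while the shared-port count grows linearly.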
FIG. 4 is an interaction diagram of a data transmission method of a database system according to another embodiment of the present disclosure. FIG. 4 builds on the embodiment shown in fig. 2 or fig. 3; further, the instruction set includes a plurality of sub-instruction sets, and each computing instance includes a plurality of sub-computing instances. As shown in fig. 4, S202 includes S202b:
S202b: each computing instance starts a plurality of executors to correspondingly execute the plurality of sub-instruction sets in the instruction set, wherein each executor receives data sent by other executors through the same receiving port and sends data to the other executors through the same sending port.
A sub-instruction set may also be called an execution plan slice (Slice), and is obtained by the main instance dividing the instruction set according to the redistribution operations the instruction set contains. The main instance can divide the instruction set according to the tables that are operated on through non-distribution columns to obtain a plurality of sub-instruction sets, and the main instance sends the sub-instruction sets of each computing instance to the corresponding computing instance. For example, when a table is operated on through a non-distribution column, the data of that table must be hashed on the non-distribution column and sent to the computing instances; this hash redistribution operation for the table is divided into one sub-instruction set. Each computing instance allocates storage resources and computing resources to executors according to the received sub-instruction sets, wherein each sub-instruction set corresponds to one executor, and the executor executes the instructions of its sub-instruction set. While the executors execute their sub-instruction sets, data may be sent and received among them: each executor receives data sent by other executors through the same receiving port and sends data to the other executors through the same sending port, where the other executors are the executors started by the plurality of computing instances other than the executor itself.
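The hash redistribution mentioned above can be sketched as follows. This is a minimal Python illustration, not the hashing scheme of the disclosure: the CRC32-based hash, the function names `target_instance` and `redistribute`, and the representation of rows as dictionaries are all assumptions made for the example. The only property the sketch relies on is that equal column values always map to the same computing instance.

```python
import zlib
from collections import defaultdict


def target_instance(key, num_instances: int) -> int:
    """Map a column value to a computing instance using a stable hash
    (CRC32 is an arbitrary choice for this sketch)."""
    return zlib.crc32(repr(key).encode("utf-8")) % num_instances


def redistribute(rows, column: str, num_instances: int) -> dict:
    """Group rows by destination instance according to the hash of row[column].

    The returned mapping {instance_index: [rows...]} stands in for the batches
    an executor would hand to its sending port.
    """
    outbound = defaultdict(list)
    for row in rows:
        outbound[target_instance(row[column], num_instances)].append(row)
    return dict(outbound)
```

For example, `redistribute(rows, "s_key", 3)` partitions the rows of a table on its hypothetical non-distribution column `s_key` across three computing instances, so rows with equal `s_key` values end up in the same batch.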
In one possible implementation, each executor receives data sent by the other executors based on UDP and sends data to the other executors based on UDP.
In another possible implementation, each executor receives data sent by the other executors based on RUDP and sends data to the other executors based on RUDP.
Optionally, the main instance may divide the instruction set into sub-instruction sets according to data movement operation nodes (Motion). A data movement operation node is an operation node determined from the plurality of tables in the instruction set and the database operations performed on those tables. Taking the data movement operation nodes as boundaries, the complete instruction set is divided from bottom to top into a plurality of sub-instruction sets, each of which is one part of the execution plan. Each data movement operation node is split into a data sender and a data receiver, yielding an upper sub-instruction set and a lower sub-instruction set: the lowest operation node of the upper execution plan slice is a data receiving operation node, which receives the redistributed data sent by the lower executors, and the uppermost operation node of the lower execution plan slice is a data sending operation node, which sends data to the receiving ends of the upper executors.
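The split at data movement operation nodes can be sketched on a small plan tree. The Python code below is an illustration under assumptions: the `Node` class, the operator labels, and the numbering (the topmost slice is numbered 1) are hypothetical, and the tree is walked recursively from the root, which produces the same slices as a bottom-up split because the slice boundaries are fixed by the Motion nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical plan operator: a label plus child operators."""
    op: str
    children: list = field(default_factory=list)


def split_at_motion(root: Node) -> dict:
    """Return {slice_number: slice_root}. Each Motion node becomes a Receive
    node in the upper slice and a Send node at the top of a new lower slice."""
    slices = {}

    def rewrite(node: Node, current: int) -> Node:
        if node.op == "Motion":
            lower = start_slice(node.children[0], send_to=current)
            return Node(f"Receive(from slice {lower})")
        return Node(node.op, [rewrite(child, current) for child in node.children])

    def start_slice(subtree: Node, send_to) -> int:
        number = len(slices) + 1
        slices[number] = None  # reserve the number before recursing
        body = rewrite(subtree, number)
        if send_to is None:
            slices[number] = body
        else:
            slices[number] = Node(f"Send(to slice {send_to})", [body])
        return number

    start_slice(root, None)
    return slices
```

For a plan such as `Node("Join", [Node("Scan S"), Node("Motion", [Node("Scan T")])])`, the sketch yields an upper slice whose lowest node on the T side is a receive node and a lower slice whose uppermost node is a send node, mirroring the sender and receiver split described above.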
Optionally, each sub-instruction set includes an execution order identifier, which the main instance marks on the sub-instruction set, according to the execution order of the instructions in the instruction set, when it divides the instruction set into sub-instruction sets. For example, the execution order identifier may be a number: the sub-instruction set executed last is numbered 1, the sub-instruction set executed second to last is numbered 2, and so on. The sub-execution result obtained by each sub-computing instance executing its sub-instruction set is sent to the corresponding sub-computing instance according to the number.
The method of this embodiment is described below with reference to fig. 5. Fig. 5 is a schematic flowchart of a database system executing a query task. As shown in fig. 5, the database system includes a main instance and N computing instances. After receiving a client SQL request, the main instance establishes a TCP connection and performs syntactic, lexical, and semantic analysis, query rewriting, and query optimization on the SQL statement to generate a distributed instruction set, which may also be called a distributed execution plan.
If the instruction set includes a redistribution operation, data movement operation nodes must be added to the distributed execution plan to redistribute data; for a complex query task that joins multiple tables, the distributed execution plan may contain multiple data movement operation nodes. If the execution plan contains data movement operation nodes, the main instance can split it from bottom to top into M execution plan slices, using the data movement operation nodes as boundaries. Each execution plan slice is a part of the instructions of the execution plan. At the position of each data movement operation node, the plan is split into a data sender and a data receiver, yielding an upper and a lower execution plan slice: the lowest operation node of the upper execution plan slice is a data receiving operation node, which receives the re-hashed data sent by the lower execution plan slice of each computing instance, and the uppermost operation node of the lower execution plan slice is a data sending operation node, which redistributes local data to the receiving end of the upper execution plan slice of each computing instance.
The scheduler of the main instance establishes TCP connections with each computing instance according to the execution plan slices, each execution plan slice corresponding to one TCP-connection-based query executor process of the computing instance. It then distributes the distributed execution plan and a slice number to each query executor of each computing instance over the TCP connections; for example, query executor 1 is sent the distributed execution plan and slice number 1, and query executor M is sent the distributed execution plan and slice number M.
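The dispatch step can be sketched as data: the scheduler prepares one message per query executor, carrying the whole distributed execution plan plus that executor's slice number, and the executor uses the number to pick out its own slice. The Python sketch below is only an illustration under assumptions; the message layout and the names `dispatch` and `select_slice` are hypothetical, and the TCP transport itself is not modeled.

```python
def dispatch(plan_slices: dict, instance_ids: list) -> list:
    """Build one message per query executor: with N computing instances and
    M slices this yields N * M messages, one per TCP connection."""
    messages = []
    for instance_id in instance_ids:
        for slice_number in sorted(plan_slices):
            messages.append({
                "instance": instance_id,
                "executor": slice_number,      # executor k handles slice k
                "plan": plan_slices,           # the whole distributed plan
                "slice_number": slice_number,  # which part this executor runs
            })
    return messages


def select_slice(message: dict):
    """Executor side: look up the assigned slice by its number."""
    return message["plan"][message["slice_number"]]
```

With two computing instances and two slices, `dispatch` produces four messages, and `select_slice` applied to the message for executor 2 returns slice 2 of the plan.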
Each computing instance establishes an RUDP network connection for transmitting data from the executors of the computing instance to the main instance. For plan slices that contain a data movement operation node, RUDP network connections for transmitting data between the executors of the computing instances must also be established, including a lower-layer data sending port and an upper-layer data receiving port.
Each query executor in each computing instance obtains its plan slice from the distributed execution plan according to the slice number, and then performs the operations of that execution plan slice. Each computing instance executes its execution plan slices from bottom to top. If the operation node of a slice is a data sending node and the data is to be sent to the upper-layer query executors of other computing instances, the re-hashed data is sent to the upper-layer executor of each computing instance through the RUDP-based internal network; if the data is to be sent to the main instance, it is sent directly without re-hashing. If the operation node of a slice is a data receiving node, the data of the lower-layer executors of each computing instance is received through the RUDP-based internal network. This ensures that rows with the same hash value reside in the same computing instance, so all computing instances can perform join or grouping operations locally and in parallel. When execution completes, if the computing instance has a plurality of executors, the result data continues to be sent to the upper-layer executor of the computing instance through the RUDP-based internal network, and the top-layer executor of the computing instance sends the data to the executor of the main instance.
In this embodiment, after each computing instance receives the instruction set sent by the main instance, it starts a plurality of executors to execute the corresponding sub-instruction sets in the instruction set. Dividing the task of executing the instruction set into sub-instruction sets improves execution efficiency. While the executors execute their sub-instruction sets, each executor receives data sent by other executors through the same receiving port and sends data to the other executors through the same sending port. This preserves query correctness when the large amount of data generated by executing the sub-instruction sets in parallel is moved, reduces the number of receiving and sending ports occupied by data transmission, improves the performance of the database system, meets the demands of storing, analyzing, and computing over massive data, and allows larger-scale database systems to be deployed.
On the basis of the above embodiment, S203 further includes:
The main instance obtains the result of the query instruction according to the execution results of the instruction set sent by the plurality of computing instances. As shown in fig. 5, the main instance gathers and combines the execution results received from the computing instances to obtain the result of the query instruction.
The main instance sends the result of the query instruction to the client over TCP.
Optionally, after the main instance sends the result of the query instruction, that is, after query execution finishes, the RUDP internal network dedicated to data movement may be torn down, and then the TCP connections from the main instance to each executor of each computing instance are closed.
The data transmission method of the database system of the present disclosure is described below, taking a database system that includes 2 computing instances as an example. It should be understood that the following example is only one possible implementation and does not limit the present disclosure.
Fig. 6 is a schematic flowchart of another database system executing a query task. As shown in fig. 6, assume that a query request performs a join query on table S and table T stored in the database system, where the join condition is that the distribution column of table S equals a non-distribution column of table T. The instruction set generated for the query therefore redistributes table T on the non-distribution column and then joins it with table S. A data movement operation node is added for the non-distribution column of table T, and the instruction set is split at each data movement operation node into an upper and a lower sub-instruction set, here sub-instruction set 1 and sub-instruction set 2. Sub-instruction set 2 scans the data of table T sequentially, re-hashes the scanned rows of table T, and distributes them to other executors. Sub-instruction set 1 receives the re-hashed data of table T, joins it with the scanned data of table S, and then sends the join result to the main instance through the sending port. The main instance establishes TCP connections with the computing instances according to the sub-instruction sets, each sub-instruction set corresponding to one TCP-connection-based process of the computing instance, namely an executor, and then distributes the sub-instruction sets and numbers to the executors of the computing instances. Sub-instruction set 1 corresponds to executor 1 and sub-instruction set 2 corresponds to executor 2. The method of this embodiment includes the following steps:
1. Each executor in each computing instance obtains its sub-instruction set in the instruction set according to the number.
2. The executors initialize the internal network interface based on the RUDP protocol and establish, in each executor, a network for transmitting data between executors and a network for transmitting data from the top-level executor to the main instance, each including a data sending end and a data receiving end.
3. Each computing instance performs the instruction operations in the instruction set from bottom to top.
Sub-instruction set 2 of query executor 2 of each computing instance is executed first: it scans the data of table T sequentially, reads each row of table T, and hands the row to the data sending end. The sending end hashes each row of table T to determine the designated executor and sends the row to executor 1 of the corresponding computing instance through the RUDP-based internal network.
Executor 1 of each computing instance uses the table T data sent by executor 2 as the outer table and joins it with the data scanned from table S; the result of the join is the execution result of the instruction set in this example.
4. Each computing instance sends the execution result of the instruction set to the data receiving port of the main instance through the RUDP-based internal network.
5. The main instance gathers and combines the execution results sent back by executor 1 of each computing instance to obtain the result of the query instruction.
6. The main instance sends the result of the query instruction to the client over TCP.
Optionally, after the main instance sends the result of the query instruction, that is, after query execution finishes, the RUDP internal network dedicated to data movement may be torn down, and then the TCP connections from the main instance to each executor of each computing instance are closed.
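The steps above can be traced with an in-memory simulation of the two-instance example. The Python sketch below is an illustration under assumptions and not the implementation of the disclosure: the column names `k`, `name`, `id`, and `s_k`, the CRC32-based `owner` function, and the use of lists in place of RUDP receiving ports are all hypothetical. Table S is distributed on its join column, table T is distributed on a different column, and the result is checked against a plain single-node join.

```python
import zlib
from collections import defaultdict

NUM_INSTANCES = 2


def owner(key) -> int:
    """Stable hash that decides which computing instance holds a key."""
    return zlib.crc32(repr(key).encode("utf-8")) % NUM_INSTANCES


def run_query(table_s, table_t):
    # Initial placement: S is distributed on "k"; T is distributed on "id",
    # so its join column "s_k" is a non-distribution column.
    s_parts, t_parts = defaultdict(list), defaultdict(list)
    for row in table_s:
        s_parts[owner(row["k"])].append(row)
    for row in table_t:
        t_parts[owner(row["id"])].append(row)

    # Sub-instruction set 2 (executor 2): scan local T rows and re-hash them
    # on "s_k"; inbox[i] stands in for the receiving port of executor 1 on instance i.
    inbox = defaultdict(list)
    for inst in range(NUM_INSTANCES):
        for row in t_parts[inst]:
            inbox[owner(row["s_k"])].append(row)

    # Sub-instruction set 1 (executor 1): join received T rows (outer side)
    # with local S rows; "gathered" stands in for the main instance's receiving port.
    gathered = []
    for inst in range(NUM_INSTANCES):
        s_by_key = defaultdict(list)
        for s_row in s_parts[inst]:
            s_by_key[s_row["k"]].append(s_row)
        for t_row in inbox[inst]:
            for s_row in s_by_key[t_row["s_k"]]:
                gathered.append((s_row["k"], s_row["name"], t_row["id"]))

    # Main instance: combine the per-instance results into the query result.
    return sorted(gathered)
```

Because S rows and re-hashed T rows with the same key are placed by the same `owner` function, every matching pair meets on one computing instance, which is why each instance can join locally.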
The present disclosure provides a database system, which includes a main instance and a plurality of computing instances. The database system is configured to perform the data transmission method of the database system described above with reference to any of fig. 2 to 5.
It should be noted that in this document, relational terms such as "first" and "second" and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises an element.
The above is merely a specific embodiment of the disclosure to enable one skilled in the art to understand or practice the disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A data transmission method, characterized by being applied to a database system, the database system comprising: a main instance and a plurality of computing instances, the method comprising:
the main instance respectively sends instruction sets to the plurality of computing instances;
each computing instance executes the instruction set to obtain an execution result of the instruction set corresponding to the computing instance;
and the plurality of computing instances executing the instruction set send the execution results of the instruction sets respectively corresponding to the computing instances to the same receiving port of the main instance.
2. The method as recited in claim 1, further comprising:
each computing instance receives data sent by other computing instances through the same receiving port, and sends the data to the other computing instances through the same sending port.
3. The method of claim 1, wherein the instruction set comprises a plurality of sub-instruction sets;
each computing instance executes the instruction set to obtain an execution result of the instruction set corresponding to the computing instance, including:
each computing instance starts a plurality of executors to correspondingly execute a plurality of sub-instruction sets in the instruction set, and an execution result of the instruction set corresponding to the computing instance is obtained; each executor receives data sent by other executors through the same receiving port, and sends the data to the other executors through the same sending port.
4. The method of claim 2, wherein each computing instance receives data sent by other computing instances through the same receiving port and sends data to other computing instances through the same sending port, comprising:
each computing instance receives data sent by other computing instances based on the reliable user datagram protocol, RUDP, and sends data to other computing instances based on the RUDP.
5. The method of claim 2, wherein each computing instance receives data sent by other computing instances through the same receiving port and sends data to other computing instances through the same sending port, comprising:
each computing instance receives data sent by other computing instances based on a user datagram protocol UDP and sends data to other computing instances based on UDP.
6. The method of claim 3, wherein each executor receives data sent by other executors through the same receiving port and sends data to the other executors through the same sending port, comprising:
each executor receives data sent by the other executors based on RUDP, and sends data to the other executors based on RUDP.
7. The method of claim 3, wherein each executor receives data sent by other executors through the same receiving port and sends data to the other executors through the same sending port, comprising:
each executor receives data sent by the other executors based on UDP, and sends data to the other executors based on UDP.
8. The method of any of claims 1-7, wherein the plurality of computing instances executing the instruction set send the execution results of their respective corresponding instruction sets to the same receiving port of the main instance, comprising:
the plurality of computing instances executing the instruction set send the execution results of the instruction sets respectively corresponding to the computing instances to the main instance based on RUDP.
9. The method of any of claims 1-7, wherein the plurality of computing instances executing the instruction set send the execution results of their respective corresponding instruction sets to the same receiving port of the main instance, comprising:
the plurality of computing instances executing the instruction set send the execution results of the instruction sets respectively corresponding to the computing instances to the main instance based on UDP.
10. A database system, the database system comprising a master node and a plurality of computing nodes; the master node is used for deploying a main instance, and the computing node is used for deploying a computing instance;
the database system being adapted to perform the data transmission method according to any of the preceding claims 1-9.
CN202011001547.5A 2020-09-22 2020-09-22 Data transmission method and database system Active CN112202859B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011001547.5A CN112202859B (en) 2020-09-22 2020-09-22 Data transmission method and database system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011001547.5A CN112202859B (en) 2020-09-22 2020-09-22 Data transmission method and database system

Publications (2)

Publication Number Publication Date
CN112202859A CN112202859A (en) 2021-01-08
CN112202859B true CN112202859B (en) 2024-02-23

Family

ID=74015839

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011001547.5A Active CN112202859B (en) 2020-09-22 2020-09-22 Data transmission method and database system

Country Status (1)

Country Link
CN (1) CN112202859B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9405634B1 (en) * 2014-06-27 2016-08-02 Emc Corporation Federated back up of availability groups
CN106250566A (en) * 2016-08-31 2016-12-21 天津南大通用数据技术股份有限公司 A kind of distributed data base and the management method of data operation thereof
CN106599043A (en) * 2016-11-09 2017-04-26 中国科学院计算技术研究所 Middleware used for multilevel database and multilevel database system
CN107070753A (en) * 2017-06-15 2017-08-18 郑州云海信息技术有限公司 A kind of data monitoring method of distributed cluster system, apparatus and system
CN109726250A (en) * 2018-12-27 2019-05-07 星环信息科技(上海)有限公司 Data-storage system, metadatabase synchronization and data cross-domain calculation method
CN109933631A (en) * 2019-03-20 2019-06-25 江苏瑞中数据股份有限公司 Distributed parallel database system and data processing method based on Infiniband network
CN110389900A (en) * 2019-07-10 2019-10-29 深圳市腾讯计算机系统有限公司 A kind of distributed experiment & measurement system test method, device and storage medium
CN111506602A (en) * 2020-04-20 2020-08-07 上海达梦数据库有限公司 Data query method, device, equipment and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110219035A1 (en) * 2000-09-25 2011-09-08 Yevgeny Korsunsky Database security via data flow processing
US8676979B2 (en) * 2010-05-18 2014-03-18 Salesforce.Com, Inc. Methods and systems for efficient API integrated login in a multi-tenant database environment
US8682876B2 (en) * 2012-04-03 2014-03-25 Sas Institute, Inc. Techniques to perform in-database computational programming
US10275184B2 (en) * 2014-07-22 2019-04-30 Oracle International Corporation Framework for volatile memory query execution in a multi node cluster

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种基于物联网的电火花线切割机床远程监控管理授权系统;邹逸君等;电加工与模具;全文 *

Also Published As

Publication number Publication date
CN112202859A (en) 2021-01-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant