WO2022143685A1 - System, method, and apparatus for data query using a network device - Google Patents
System, method, and apparatus for data query using a network device
- Publication number
- WO2022143685A1 (PCT/CN2021/142146)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- task
- node
- offloadable
- data
- network device
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/903—Querying
- G06F16/90335—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5038—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24532—Query optimisation of parallel queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/5017—Task decomposition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2209/00—Indexing scheme relating to G06F9/00
- G06F2209/50—Indexing scheme relating to G06F9/50
- G06F2209/509—Offload
Definitions
- the present application relates to the field of computer technology, and in particular, to a system, method, and apparatus for performing data query using network equipment.
- The main current approach is to add hardware resources for processing data, such as increasing the number of processors (for example, central processing units, CPUs) of each node in the query system, or increasing their processing power and memory capacity. However, increasing the processing power and memory capacity of the processor raises the cost of the product.
- the room for growth of the processing power of the processor is limited, so sometimes the query efficiency cannot be improved by enhancing the processing power of the processor.
- the present application provides a data query method and device for accelerating data processing without enhancing CPU performance and/or memory capacity.
- the present application provides a data query system.
- the system includes a central node, a working node and a network device, and the central node is connected to the working node through the network device.
- the central node generates multiple tasks from the query request input by the user.
- The central node configures the execution devices of some tasks as network devices and the execution devices of other tasks as work nodes, and then sends setting instructions to configure the corresponding tasks on the network devices and work nodes.
- the pre-configured tasks can be executed on the data.
- the central node can configure some tasks to be executed by the network device.
- The network device executes the pre-configured task on the data and then forwards the data to other execution devices.
- the calculation amount of the worker nodes is reduced, and the burden on the processors on the worker nodes is reduced, so that data processing can be accelerated without increasing the processing capacity of the processors of the worker nodes.
- After generating the multiple tasks, the central node is configured to search for an offloadable task among the multiple tasks and to set the execution device of the offloadable task as the network device. An offloadable task is a preset task that is offloaded to the network device for execution.
- the offloadable task suitable for offloading to the network device can be preconfigured, and the offloadable task can be conveniently and quickly found from the multiple tasks.
- the central node is configured to send a setting instruction of the offloadable task to the network device; the network device is configured to set the offloadable task according to the setting instruction.
- the network device is a network card of a working node or a forwarding device, and the forwarding device may be a switch or a router.
- The forwarding device includes a data port and a control port. The central node is configured to send the setting instruction of an offloadable task whose execution device is the forwarding device to the forwarding device through the control port of the forwarding device, and to send the setting instruction of a task whose execution device is a network card or a working node to the forwarding device through the data port.
- Correspondingly, when the forwarding device receives a setting instruction through the control port, it sets the offloadable task indicated in the setting instruction; when it receives a setting instruction through the data port, it forwards that setting instruction.
- The forwarding device can quickly identify, according to the data port, the data packets that need to be forwarded and forward them to the corresponding device without parsing them, which reduces the sending delay. It can also identify, according to the control port, the setting instructions that the central node sends to it, which avoids mis-forwarding and missed configuration.
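The port-based dispatch described above can be sketched as follows. This is a minimal illustration, not the implementation in the application: the class names `SettingInstruction` and `ForwardingDevice`, the field names, and the port labels `"control"` and `"data"` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SettingInstruction:
    task_id: str        # task identifier carried by the instruction
    operator_ids: list  # operator information (hypothetical representation)
    target: str         # hypothetical name of the device that should run the task


@dataclass
class ForwardingDevice:
    name: str
    configured_tasks: dict = field(default_factory=dict)
    forwarded: list = field(default_factory=list)

    def receive(self, port: str, instruction: SettingInstruction) -> str:
        if port == "control":
            # Arrived on the control port: meant for this device, so set the task.
            self.configured_tasks[instruction.task_id] = instruction.operator_ids
            return "configured"
        if port == "data":
            # Arrived on the data port: pass it on without parsing the payload.
            self.forwarded.append(instruction)
            return "forwarded"
        raise ValueError(f"unknown port: {port}")


switch = ForwardingDevice("switch-1")
r1 = switch.receive("control", SettingInstruction("task-3", ["aggregation"], "switch-1"))
r2 = switch.receive("data", SettingInstruction("task-1", ["filter"], "worker-20a"))
print(r1, r2)
```

The decision depends only on the ingress port, which is why the device never needs to inspect an instruction that is addressed to another device.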
- When the network device that executes the offloadable task is a network card in the working node, the central node is used to send the setting instruction of the offloadable task to the working node, and the working node sets the offloadable task on its network card according to the setting instruction.
- the work node sets the offloadable tasks on the network card according to the setting instructions.
- When the work node integrates the offload policy, it can also determine whether to offload a task to the network card according to the actual load status of the network card. Having the work node control how the network card executes offloadable tasks in this way is more flexible.
- The setting instruction of the offloadable task includes an offloadable flag. After receiving the setting instruction, the worker node sets the offloadable task on its network card when it determines that the setting instruction includes the offloadable flag.
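A minimal sketch of how a worker node might act on the offloadable flag and on the network card's load is given below. The names `NetworkCard`, `WorkerNode`, and `apply_setting`, the `load` field, and the 0.8 load threshold are assumptions made for illustration; the application does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkCard:
    load: float = 0.0  # assumed: fraction of NIC compute in use, 0.0 to 1.0
    tasks: set = field(default_factory=set)


@dataclass
class WorkerNode:
    nic: NetworkCard
    cpu_tasks: set = field(default_factory=set)
    max_nic_load: float = 0.8  # assumed threshold, not from the application

    def apply_setting(self, task_id: str, offloadable: bool) -> str:
        # Offload only when the instruction carries the flag and the NIC has headroom.
        if offloadable and self.nic.load < self.max_nic_load:
            self.nic.tasks.add(task_id)
            return "nic"
        # Otherwise the task stays on the worker node's processor.
        self.cpu_tasks.add(task_id)
        return "cpu"


idle = WorkerNode(NetworkCard(load=0.2))
busy = WorkerNode(NetworkCard(load=0.95))
placements = [
    idle.apply_setting("t1", offloadable=True),
    idle.apply_setting("t2", offloadable=False),
    busy.apply_setting("t3", offloadable=True),
]
print(placements)
```

Keeping the final decision on the worker node is what allows the load check to use the network card's current state rather than a value known only to the central node.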
- the offloadable task is executed on the data packet.
- The network device can monitor the data packets of the offloadable task according to the identifier of the offloadable task and execute the offloadable task on those data packets. This requires no separate execution instruction and lets the network device quickly and accurately identify the offloadable tasks it executes, saving overhead while speeding up data processing.
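The identifier-based matching can be sketched as below, assuming a hypothetical packet layout in which each packet is a dictionary carrying a `task_id` and a list of `rows`. The class `OffloadingNic` and its methods are illustrative names only.

```python
from typing import Callable, Dict, List

Row = dict


class OffloadingNic:
    def __init__(self) -> None:
        # task identifier -> operator previously set by a setting instruction
        self.tasks: Dict[str, Callable[[List[Row]], List[Row]]] = {}

    def set_task(self, task_id: str, operator: Callable[[List[Row]], List[Row]]) -> None:
        self.tasks[task_id] = operator

    def on_packet(self, packet: dict) -> dict:
        operator = self.tasks.get(packet["task_id"])
        if operator is None:
            # Not an offloaded task: pass the packet through unchanged.
            return packet
        # Matching identifier: run the offloaded operator on the packet's rows.
        return {**packet, "rows": operator(packet["rows"])}


nic = OffloadingNic()
nic.set_task("task-1", lambda rows: [r for r in rows if r["age"] > 18])

offloaded = nic.on_packet({"task_id": "task-1", "rows": [{"age": 17}, {"age": 30}]})
untouched = nic.on_packet({"task_id": "task-9", "rows": [{"age": 17}]})
print(offloaded["rows"], untouched["rows"])
```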
- The central node is further configured to, after determining the offloadable task, send a setting instruction of the offloadable task when it determines that the offloadable task conforms to the offload policy corresponding to the offloadable task.
- A task indicates the operation to be executed and its operands, where the operands are the data on which the operation is executed.
- The setting instruction of a task may include a task identifier and operator information, where the task identifier uniquely identifies a task within a query request.
- The operator information includes the operator ID.
- The operator ID uniquely identifies an operator.
- An operation can be completed by running one or more operators. Running an operator performs the operation indicated by the task on the operands.
- An offloadable task is a task for which all the operators required to complete it are offloadable operators. The offloadable operators may be preset; for example, they include the filter operator, the aggregation operator, the distinct (non-null unique) operator, the TopN (top N values) operator, the Join (table join) operator, and so on.
- Alternatively, an offloadable operator is a preset operator that satisfies its corresponding offload policy. For example, for the filter operator, the offload policy is that the selection rate of executing the filter operator on the filter column is not lower than a preset threshold (such as the first preset value); for the aggregation operator, the offload policy is that when the aggregation operator is executed on the aggregation column, the cardinality of the aggregated data on the aggregation column does not exceed the second preset value.
- The first preset value, the second preset value, and the third preset value may be exactly the same, not exactly the same, or completely different.
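The offloadability check can be sketched as follows. The text presents the preset operator list and the per-operator offload policy as alternatives; this sketch applies both together for brevity. The threshold values, the `Operator` fields, and the function names are assumptions, and only the filter and aggregation policies stated above are modeled.

```python
from dataclasses import dataclass

# Preset offloadable operators named in the text (lowercased here by assumption).
OFFLOADABLE_OPERATORS = {"filter", "aggregation", "distinct", "topn", "join"}

FIRST_PRESET_VALUE = 0.5    # assumed minimum filter selection rate
SECOND_PRESET_VALUE = 1000  # assumed maximum aggregation cardinality


@dataclass
class Operator:
    name: str
    selection_rate: float = 0.0  # only meaningful for the filter operator
    cardinality: int = 0         # only meaningful for the aggregation operator


def meets_policy(op: Operator) -> bool:
    if op.name == "filter":
        return op.selection_rate >= FIRST_PRESET_VALUE
    if op.name == "aggregation":
        return op.cardinality <= SECOND_PRESET_VALUE
    return True  # no policy is sketched here for the other operators


def is_offloadable(operators: list) -> bool:
    # A task is offloadable only if every operator it needs is offloadable.
    return bool(operators) and all(
        op.name in OFFLOADABLE_OPERATORS and meets_policy(op) for op in operators
    )


print(is_offloadable([Operator("filter", selection_rate=0.9)]))
print(is_offloadable([Operator("filter", selection_rate=0.9), Operator("sort")]))
```

A task with an empty operator list is treated as not offloadable, which is a choice made for this sketch rather than something the text states.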
- the present application provides a data query method, which can be applied to a central node, where the central node is connected to a working node through a network device.
- The method includes: the central node generates a plurality of tasks from a query request input by a user. When allocating execution devices to the multiple tasks, the central node configures the execution devices of some tasks as network devices and the execution devices of other tasks as work nodes, and then sends setting instructions to configure the corresponding tasks on the network devices and work nodes.
- After the working node or the network device has set the task indicated by the setting instruction, it sends a feedback response to the central node, indicating that the configuration of the task issued by the central node has been completed.
- An execution instruction of the query request may be sent, where the execution instruction is used to trigger the execution device to execute the set task.
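The set-then-execute sequence can be sketched as below: every task is configured first, the feedback responses are collected, and only then is execution triggered. The `Device` class, the boolean feedback value, and `run_query` are hypothetical names, and the sketch runs in a single process instead of over a network.

```python
from typing import Dict


class Device:
    """Stands in for a working node or a network device."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tasks = []
        self.executed = []

    def set_task(self, task_id: str) -> bool:
        self.tasks.append(task_id)
        return True  # feedback response: configuration of this task is complete

    def execute(self) -> None:
        self.executed = list(self.tasks)


def run_query(assignments: Dict[str, Device]) -> bool:
    # Phase 1: send a setting instruction per task and collect the feedback.
    acks = [device.set_task(task_id) for task_id, device in assignments.items()]
    if not all(acks):
        return False
    # Phase 2: send the execution instruction once to each distinct device.
    devices = []
    for device in assignments.values():
        if device not in devices:
            devices.append(device)
    for device in devices:
        device.execute()
    return True


nic = Device("network card")
worker = Device("working node")
ok = run_query({"t1": nic, "t2": worker, "t3": worker})
print(ok, nic.executed, worker.executed)
```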
- Determining the execution device of each task in the plurality of tasks includes: after generating the plurality of tasks, the central node searches the plurality of tasks for an offloadable task and sets the execution device of the offloadable task as a network device, where the offloadable task is a preset task that is offloaded to the network device for execution.
- When it determines that the execution device of the offloadable task is a network device, the central node sends a setting instruction of the offloadable task to the network device.
- the network device may be a network card of a working node or a forwarding device, and the forwarding device may be a switch or a router.
- When it determines that the network device executing the offloadable task is the network card of the working node, the central node sends the setting instruction of the offloadable task to the working node, and the working node controls the setting of the offloadable task on the network card.
- When it determines that the network device executing the offloadable task is a forwarding device, the central node sends a setting instruction of the offloadable task to the forwarding device.
- An offloadable flag is carried in the setting instruction.
- The offloadable task may be preset. After the central node determines the offloadable task among the multiple tasks and determines that the offloadable task conforms to the offload policy corresponding to the offloadable task, the central node sends the setting instruction of the offloadable task to the network device.
- When determining the execution device of the offloadable task, the central node may also determine the execution device according to the preset priorities of the devices corresponding to the offloadable task.
- When determining the execution device of the offloadable task, the central node may also determine the execution device according to the preset priority of each device corresponding to the offloadable task and the load status of each device.
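One way to combine preset priority with load status is sketched below: candidates are tried in priority order and the first one with spare capacity is chosen. The function name `choose_device`, the load scale of 0.0 to 1.0, the 0.8 limit, and the rule that a device with unknown load is skipped are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


def choose_device(
    candidates: List[str], loads: Dict[str, float], max_load: float = 0.8
) -> Optional[str]:
    """Pick the highest-priority candidate whose load is below max_load.

    candidates: device names in descending preset priority.
    loads: device name -> current load (assumed scale 0.0 to 1.0).
    """
    for name in candidates:
        # A device with unknown load is treated as fully loaded (assumption).
        if loads.get(name, 1.0) < max_load:
            return name
    return None  # the caller may fall back to the working node's processor


first = choose_device(["network card", "switch"], {"network card": 0.3, "switch": 0.1})
second = choose_device(["network card", "switch"], {"network card": 0.9, "switch": 0.1})
none_free = choose_device(["network card", "switch"], {"network card": 0.9, "switch": 0.95})
print(first, second, none_free)
```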
- the present application provides a data query method, which can be applied to a network device.
- the network device is used to connect a central node and a working node.
- The method includes: the network device receives a setting instruction sent by the central node, sets the corresponding task according to the setting instruction, and executes the task on the data packets flowing through the network device.
- the network device may be a network card of a working node or a forwarding device, for example, the forwarding device is a switch or a router.
- the forwarding device includes a data port and a control port.
- The forwarding device can receive setting instructions through the control port. Data received through the control port is what the central node configures for the forwarding device itself, and the forwarding device sets the offloadable task according to the setting instruction received through the control port. The forwarding device can also receive setting instructions through the data port. Data received through the data port is what the central node configures for devices other than the forwarding device, so the forwarding device forwards the data received from the data port.
- the offloadable task is executed based on the data packet.
- the embodiments of the present application further provide a data query interface, including: a query command input area, a task display area, and an execution device display area;
- the query command input area is used to receive the query request input by the user
- a task display area configured to display a plurality of tasks generated according to the query request to execute the query request
- the execution device display area is used to display the execution devices of each task, and the execution devices include work nodes and network devices.
- the query command input area, the task display area, and the execution device display area are displayed on the same interface.
- the query command input area, the task display area, and the execution device display area are displayed on different interfaces.
- The embodiments of the present application also provide a data query interaction method. The method can be applied to a central node, where the central node is the server of a client. The method includes: a user inputs a query request on the client, and the client forwards the query request to the central node. Correspondingly, the central node receives the query request and generates multiple tasks based on it. Further, the central node generates an execution plan of the query request, and the execution plan includes information about the execution device of each task. The central node can assign tasks to working nodes or to network devices for execution; that is, the execution devices can be working nodes or network devices.
- the central node can display the execution plans of the multiple tasks locally, or the central node can send the execution plan to the client.
- The client can display the execution plan, including the multiple tasks and the execution device of each task.
- the multiple tasks on the client are displayed in a tree form according to the execution plan.
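A minimal sketch of rendering an execution plan as a tree, with each task shown next to its execution device, is given below. The `PlanNode` structure, the indentation-based layout, and the example task names are illustrative assumptions and are not taken from the figures of the application.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    task: str
    device: str  # execution device: a working node or a network device
    children: List["PlanNode"] = field(default_factory=list)


def render(node: PlanNode, depth: int = 0) -> List[str]:
    # One line per task, indented by its depth in the plan tree.
    lines = ["  " * depth + f"{node.task} [{node.device}]"]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


plan = PlanNode("task3: aggregate", "switch", [
    PlanNode("task2: filter", "network card", [
        PlanNode("task1: scan", "working node 20a"),
    ]),
])
rendered = render(plan)
print("\n".join(rendered))
```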
- the execution progress of the multiple tasks is displayed.
- the user can intuitively understand the execution plan of the query request and the query progress, so as to improve the user's participation and use experience.
- an embodiment of the present application further provides a central device, where the device includes a plurality of functional units, and the functional units can perform the functions performed by each step in the method of the second aspect.
- These functional units can be implemented by hardware or by software.
- the device includes a detection unit and a processing unit.
- an embodiment of the present application further provides a network device, the device includes a plurality of functional units, and the functional units can perform the functions performed by each step in the method of the third aspect. These functional units can be implemented by hardware or by software.
- the device includes a detection unit and a processing unit.
- An embodiment of the present application further provides a central device. The device includes a processor, a memory, and a transceiver, where program instructions are stored in the memory, and the processor executes the program instructions in the memory and communicates with other devices through the transceiver to implement the method provided by the second aspect.
- An embodiment of the present application further provides a network device. The device includes at least one processor and an interface circuit, where the processor is configured to communicate with other devices through the interface circuit, so as to implement the method of the third aspect.
- The processor may be a field programmable gate array (FPGA), a data processing unit (DPU), a graphics processing unit (GPU), an application specific integrated circuit (ASIC), or a system on chip (SOC).
- FPGA field programmable gate array
- DPU data processing unit
- GPU graphics processing unit
- ASIC application specific integrated circuit
- SOC system on chip
- The present application further provides a computer-readable storage medium in which instructions are stored; when the instructions are run on a computer, the computer executes the method provided in the second aspect or the method provided in the third aspect.
- FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the present application.
- FIG. 2 is a schematic diagram of a query system architecture provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of the internal structure of a working node according to an embodiment of the present application.
- FIG. 4 is a schematic diagram of a network architecture provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of another network architecture provided by an embodiment of the present application.
- FIG. 6 is a schematic diagram corresponding to a data query method provided by an embodiment of the present application.
- FIG. 7 is a schematic interface diagram of an execution plan provided by an embodiment of the present application.
- FIG. 8 is a schematic interface diagram of another execution plan provided by an embodiment of the present application.
- FIG. 9 is a schematic diagram of network card resource allocation according to an embodiment of the present application.
- FIG. 10 is a schematic flowchart corresponding to another data query method provided by an embodiment of the present application.
- FIG. 11 is a schematic structural diagram of a device provided by an embodiment of the application.
- FIG. 12 is a schematic diagram of an apparatus structure of a network device according to an embodiment of the present application.
- The network architecture and service scenarios described in the embodiments of the present invention are intended to illustrate the technical solutions of the embodiments more clearly and do not limit the technical solutions provided by the embodiments.
- With the evolution of the architecture and the emergence of new business scenarios, the technical solutions provided by the embodiments of the present invention are also applicable to similar technical problems.
- FIG. 1 is a schematic diagram of a system architecture to which the embodiments of the present application may be applied.
- The system includes a client 10, a query system 20, and a data source 30.
- The client 10 is a computing device on the user side, such as a desktop computer or a notebook computer.
- The client 10 is provided with a processor and a memory (not shown in FIG. 1).
- A client program runs on the client 10.
- The client program is configured to receive the query request triggered by the user and to interact with the query system 20, for example, by sending the query request to the query system 20.
- The query system 20 runs a server program for interacting with the client program, for example, receiving a query request sent by the client 10. The query system 20 is also used to obtain from the data source 30 the original data that the query request asks for, and to compute or process the original data to obtain a query result (or target data). Subsequently, the query system 20 returns the obtained query result to the client 10.
- The data source 30 may refer to a database or a database server. In this embodiment, it refers to a data source that can be analyzed by the query system, such as a MySQL data source, an Oracle data source, or a HIVE data source. The data storage format can be an HDFS (Hadoop Distributed File System) file, an ORC (Optimized Row Columnar) file, a CSV (comma-separated values) file, or semi-structured data such as XML (eXtensible Markup Language) or JSON (JavaScript Object Notation).
- HDFS Hadoop Distributed File System
- ORC Optimized Row Columnar
- CSV comma-separated values
- JSON JavaScript Object Notation
- the data source can use distributed storage.
- the data source can include one or more storage nodes, where the storage nodes can be storage servers, desktop computers, or storage array controllers, hard disks, etc.
- The query system can adopt a massively parallel processing (MPP) architecture, for example, the Presto query engine. Presto is an open-source MPP SQL (structured query language) query engine, that is, a distributed SQL query engine, which is used to query large data sets distributed across one or more different data sources and is suitable for interactive analytical queries.
- MPP massively parallel processing
- MPP architecture refers to distributing tasks to multiple servers or nodes in parallel, and executing tasks in parallel on each server or node.
- For example, a student information table contains information such as each student's name, age, and student ID.
- If the user triggers a query request to find the student named "Xiao Ming" in the student information table, then in a query system with the MPP architecture, multiple nodes can each query a subset of the rows in the student information table, which shortens the total query time and improves query efficiency. It should be understood that the more nodes participate in the query, the shorter the query time required for the same query request.
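The student-table example can be sketched as follows: the rows are split into partitions, each partition is scanned independently, and the partial results are merged. Threads stand in for nodes here purely for illustration, so the sketch shows the partition-and-merge structure and not a real speedup; the sample rows and the function names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

# Made-up sample rows for the student information table.
students = [
    {"name": "Xiao Ming", "age": 12, "student_id": 1},
    {"name": "Xiao Hong", "age": 11, "student_id": 2},
    {"name": "Xiao Gang", "age": 12, "student_id": 3},
    {"name": "Xiao Ming", "age": 13, "student_id": 4},
]


def split(rows, n):
    # Round-robin partitioning: node i gets rows i, i + n, i + 2n, ...
    return [rows[i::n] for i in range(n)]


def scan(partition, name):
    # The work done by one node on its subset of rows.
    return [row for row in partition if row["name"] == name]


def mpp_query(rows, name, nodes=2):
    with ThreadPoolExecutor(max_workers=nodes) as pool:
        partials = pool.map(scan, split(rows, nodes), [name] * nodes)
    # Merge the per-node results into a single, deterministically ordered answer.
    merged = [row for part in partials for row in part]
    return sorted(merged, key=lambda row: row["student_id"])


result = mpp_query(students, "Xiao Ming", nodes=2)
print([row["student_id"] for row in result])
```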
- the query system in this embodiment mainly includes: a central node cluster and a working node cluster.
- The central node cluster includes one or more central nodes (only two central nodes, 100 and 101, are shown in FIG. 2, but the application does not limit the number of central nodes).
- a worker node cluster includes one or more worker nodes (three worker nodes 20a, 20b, and 20c are shown in FIG. 2, but the application is not limited to three worker nodes).
- The central node is used to receive the query request sent by the client, parse the received query request into one or more tasks, and then deliver the one or more tasks to multiple worker nodes in parallel, so that the worker nodes can process their assigned tasks in parallel. It should be understood that the central node can allocate tasks in parallel to some of the work nodes in the query system or to all of them. In addition, the tasks assigned to the work nodes may be exactly the same, not exactly the same, or completely different, which is not limited in the embodiments of the present application. It should be noted that the central node can be a node elected by the working nodes from among themselves to assume the function of the central node, or it can be a specific device.
- a query request sent by the client will be routed to any one of the multiple central nodes.
- multiple central nodes in the query system can simultaneously respond to multiple query requests, and the multiple query requests can be sent by multiple clients or sent by one client.
- Worker nodes are used to receive tasks sent by the central node and execute tasks.
- the executed tasks include acquiring the data to be queried from the data source and performing various computing processes on the acquired data. Since each task can be processed in parallel by each worker node, the results of the parallel processing are finally aggregated and fed back to the client.
- the central node and the working node include at least a processor, a memory and a network card.
- The connection relationship and working mode of the above hardware are described in detail below, taking the working node 20a as an example. Please refer to FIG. 3, which is a schematic diagram of the internal structure of the working node 20a. As shown in FIG. 3, the working node 20a mainly includes a processor 201, a memory 202, and a network card 203. The processor 201, the memory 202, and the network card 203 communicate with each other through a communication bus.
- the processor 201 may be a central processing unit (central processing unit, CPU), which may be used for computing or processing data.
- The memory 202 is a device for storing data and includes an internal memory and a hard disk. The internal memory can be read and written at any time at high speed, and can serve as temporary data storage for running programs.
- The internal memory includes at least two types, such as random access memory (RAM) and read-only memory (ROM). Compared with the internal memory, a hard disk reads and writes data more slowly and is usually used to store data persistently.
- Hard disk types include at least a solid state drive (SSD), a mechanical hard disk drive (HDD), or other types of hard disks.
- Data in the hard disk needs to be read into the internal memory first, and the processor 201 or the computing unit 221 then obtains the data from the internal memory.
- the memory resources of the processor 230 and the memory resources of the computing unit 221 may be shared, or may be independent of each other, which is not limited in this embodiment of the present application.
- the network card 203 is used to realize data interaction and data processing.
- the network card 203 at least includes a communication unit 220 and a computing unit 221 (FIG. 3 shows a computing unit as an example, but this application does not limit it).
- the communication unit 220 can provide efficient network transmission capability for receiving data input from an external device or sending data output from the device.
- The computing unit 221 includes but is not limited to: a field programmable gate array (FPGA), a data processing unit (DPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a system on chip (SoC).
- FPGAs have the generality and programmability of a CPU, but are more specialized, operating efficiently on network packets, storage requests, or analytics requests. FPGAs are differentiated from CPUs by a greater degree of parallelism (the need to process a large number of requests).
- A worker node or a central node may be deployed on at least one physical node. For example, a worker node and a central node may be deployed on the same server; in another example, a central node and a worker node may be deployed separately on two servers that are independent of each other. Other deployments are not listed here one by one.
- the storage node may also be an independent device, such as a storage server. It should be noted that the above node may be deployed on a physical machine or a virtual machine, which is not limited in this application.
- the system shown in FIG. 1 also includes a forwarding device, such as a switch or a router.
- a switch is used as an example below.
- the switch can be used for data forwarding.
- FIG. 4 is a schematic diagram of a physical architecture in an actual application scenario of a query system provided by an embodiment of the present application.
- the working node cluster includes working node 20d and working node 20e in addition to the working node 20a, working node 20b, and working node 20c in FIG. 3;
- the central node cluster only includes the central node 100 ;
- the data source includes a storage node 30a, a storage node 30b, and a storage node 30c;
- the forwarding device includes a switch 10, a switch 20, a switch 30, and a switch 40.
- the central node, the working node, and the storage node are all independent physical machines.
- The working node 20a, the working node 20b, the central node 100, and the switch 10 are installed on rack 1;
- the working node 20c, the working node 20d, the working node 20e, and the switch 20 are installed on rack 2;
- the storage node 30a, the storage node 30b, the storage node 30c, and the switch 30 are installed on rack 3.
- the nodes in the same rack can interact through the switches in the rack.
- the central node 100 can exchange data with any other node in the rack 1 through the switch 10, for example, with the working node 20a.
- the data exchanged between the central node and the working node includes at least a header part and a data part, wherein the header part includes a source IP address and a destination IP address, and the data part is the data to be transmitted.
- The process of the central node 100 sending data to the working node 20a may include the following: the central node 100 sends data in which the source IP address is the IP address of the central node 100 and the destination IP address is the IP address of the working node 20a.
- When the central node 100 sends the data, the data is first routed to the switch 10, and the switch 10 forwards the data to the working node 20a according to the destination IP address carried in the data.
- FIG. 4 also includes a switch 40, wherein the switch 40 is the core switch in the system.
- Relative to the core switch, the switch 10, the switch 20, or the switch 30 may also be referred to as the top-of-rack switch (ToR switch) of its respective rack.
- the core switch can be used to realize data interaction between nodes on different racks. For example, when the nodes on the rack 1 and the nodes on the rack 3 perform data interaction, it can be implemented through the switch 10 , the core switch 40 and the switch 30 .
- The data transmission path is as follows: the data is first routed to the switch 10; the switch 10 determines that the destination IP address and its own IP address are not in the same network segment, and forwards the data to the core switch 40; the core switch 40 forwards the data to the switch 30, which is on the same network segment as the destination IP address; and the switch 30 forwards the data to the storage node 30a corresponding to the destination IP address.
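- The forwarding decision described above can be sketched as follows. This is only an illustrative sketch: the /24 network segments assigned to each rack, the example IP addresses, and the function name `forwarding_path` are assumptions and do not come from this application.

```python
import ipaddress

# Hypothetical addressing: each rack is one network segment behind its ToR switch.
RACK_SEGMENTS = {
    "switch 10": ipaddress.ip_network("10.0.1.0/24"),
    "switch 20": ipaddress.ip_network("10.0.2.0/24"),
    "switch 30": ipaddress.ip_network("10.0.3.0/24"),
}
CORE_SWITCH = "switch 40"


def tor_switch_of(ip: str) -> str:
    """Return the ToR switch whose (assumed) segment contains the address."""
    address = ipaddress.ip_address(ip)
    for name, segment in RACK_SEGMENTS.items():
        if address in segment:
            return name
    raise ValueError(f"{ip} is not in any known rack segment")


def forwarding_path(src_ip: str, dst_ip: str) -> list:
    """List the switches a packet traverses from src_ip to dst_ip."""
    src_tor = tor_switch_of(src_ip)
    dst_tor = tor_switch_of(dst_ip)
    if src_tor == dst_tor:
        # Same rack: the ToR switch forwards directly by destination IP.
        return [src_tor]
    # Different racks: the packet goes through the core switch.
    return [src_tor, CORE_SWITCH, dst_tor]
```

- For example, under these assumed segments, a packet from a node on rack 1 to a node on rack 3 traverses the switch 10, the core switch 40, and the switch 30.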
- the installation position of the switch 40 is not limited, for example, it can be installed on any rack in FIG. 4 .
- the switch in the embodiment of the present application also has computing and data processing capabilities, such as a programmable switch.
- FIG. 4 is only an example.
- FIG. 5 provides another schematic diagram of the system architecture for this embodiment of the present application.
- The data in the data source may also be stored in the hard disk of the working node; the embodiment of the present application does not limit the system architecture or the deployment form of each node.
- the present application provides a data query method.
- the central node receives the query request sent by the client, and parses the query request into one or more tasks.
- Some of these tasks can be offloaded to network devices for processing.
- For example, the network device is a network card of a working node or a forwarding device.
- the calculation amount of working nodes is reduced and the number of working nodes is reduced. Therefore, the data processing speed can be improved without increasing the hardware resources of the worker nodes.
- FIG. 6 is a schematic flowchart of a data query method provided by an embodiment of the present application.
- the central node determines the offloadable task among the multiple tasks, and instructs the worker nodes to offload the offloadable task to the network card for processing.
- the method can be applied to the system architecture shown in FIG. 4 or FIG. 5, and the method mainly includes the following steps:
- Step 601 the client sends a query request to the central node, and correspondingly, the central node receives the query request sent by the client.
- the query request is triggered by the user on the client.
- the query system is an MPP SQL engine, and the query request can be an SQL statement.
- The following uses an SQL statement as an example to describe the query request; the format of the statement is not limited.
- Table 1 (named factTbl) is the commodity sales record of a merchant, which is used to record the flow of the merchant.
- Table 2 (named dimTbl) is a commodity name table, which is used to record the identifiers and commodity names of commodities sold by the merchant. It should be understood that Tables 1 and 2 show only part of the data.
- the query request is that the user wants to query the total sales of each commodity in Table 2 on the date of 2020/01/02 in Table 1.
- the SQL statement corresponding to the query request is as follows:
- Step 602 the central node parses the query request into one or more tasks.
- the query request may be divided into one or more tasks.
- the tasks include some or all of the following information: information of data to be operated, operator information, and operation rules.
- the information of the data to be operated is used to indicate the data to be operated, and the data to be operated is the object to be operated;
- the operator information includes the identifier of the operator, and the identifier is used to indicate the operator, and one type of operator represents one type of execution operation;
- Operation rules refer to the rules for performing operations, and can also be understood as the rules of operators.
- The tablescan operator, which represents a scan operation, is used to read all rows in all pages of a table in the order in which the rows are stored in the database.
- The Filter operator, which represents a filtering operation, is used to filter the filter columns in the table according to the operation rules (or filter conditions) to obtain the rows that meet the filter conditions.
- The Join operator, which represents the table join operation, is used to recombine two tables according to a condition on one or more columns. It is mostly used to filter the data in a large table (a table with a relatively large amount of data, such as Table 1) based on one or more items of data in a small table (such as Table 2), and it can also combine data in the small table with data in the large table.
- The process of using the Join operator includes: maintaining a Bloom filter (BF) on the column in the On condition of the small table to be joined (for example, the ID column of Table 2), and then scanning the large table. While scanning the large table, the value of the column in the On condition (for example, the ID column of Table 1) in each scanned row is matched against the BF; if the value does not exist in the BF, the row is discarded, and if it exists, the row is kept.
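- The Bloom-filter step of the Join operator can be sketched as follows. This is a minimal sketch under stated assumptions: the filter size, the number of hash functions, the use of SHA-256, and the names `BloomFilter` and `bloom_join_filter` are illustrative choices and are not specified by this application.

```python
import hashlib


class BloomFilter:
    """A small Bloom filter; sizes are arbitrary illustrative defaults."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, value):
        # Derive num_hashes positions by salting the value with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


def bloom_join_filter(small_table_ids, large_table_rows, key="id"):
    """Keep only large-table rows whose key may exist in the small table."""
    bf = BloomFilter()
    for value in small_table_ids:
        bf.add(value)
    return [row for row in large_table_rows if bf.might_contain(row[key])]
```

- Because a Bloom filter can return false positives but never false negatives, every matching row is kept, while a few non-matching rows may also survive and have to be removed by the exact comparison in the join itself.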
- The Join operator can also be used to combine some columns of a small table with some columns of a large table, for example, to combine rows of Table 1 with the name column of Table 2 based on the same ID value in Table 1 and Table 2.
- The Join operator includes the broadcast join and the hash join. Assuming that the data to be joined includes Table 2, the operation process of the broadcast join is as follows: one worker node reads the complete Table 2 and then broadcasts the complete Table 2 to each worker node that executes the Join operator.
- In the hash join, multiple worker nodes each read one or more shards of Table 2 (shards are described in detail below) and then send the shards they have read to the other worker nodes, so that each worker node can obtain a complete Table 2 based on the shards read by the other worker nodes and then execute the join based on Table 2.
- The group by operator, which represents the grouping operation, is used to group rows according to a certain condition, such as grouping by product name.
- Aggregation operators, which represent aggregation operations, mainly include the Sum, Min, Max, Count, and AVG aggregation operators. The Sum aggregation operator is used to sum the values that need to be aggregated; the Min aggregation operator is used to maintain the minimum of the values that need to be aggregated; the Max aggregation operator is used to maintain the maximum of the values that need to be aggregated; the Count aggregation operator is used to count the number of values that need to be aggregated; and the AVG aggregation operator is used to maintain the mean of the values that need to be aggregated, based on their cumulative sum.
- The execution process of the aggregation operator is as follows: first, group by the group by column, for example, group by dimTbl.name, that is, group by the name column of Table 2, and then perform an operation such as sum, min, max, count, or avg. In this embodiment, the sum operation is performed.
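- The group-then-aggregate process can be sketched as follows. The row layout (dictionaries with a `name` and a `sale` field) and the function name `group_by_aggregate` are assumptions made for illustration only.

```python
from collections import defaultdict


def group_by_aggregate(rows, group_col, value_col):
    """Group rows by group_col, then compute each aggregate per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])

    result = {}
    for key, values in groups.items():
        total = sum(values)
        result[key] = {
            "sum": total,
            "min": min(values),
            "max": max(values),
            "count": len(values),
            "avg": total / len(values),  # mean derived from the cumulative sum
        }
    return result
```

- In the query of this embodiment only the `sum` entry of each group would be needed; the other aggregates are shown to illustrate the operators listed above.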
- The Distinct operator, which represents the deduplication operation, is used to select non-empty unique values, that is, to remove duplicate data. Specifically, deduplication is performed over each row that has data (is non-empty) in the distinct column. For example, when determining how many commodities are included in Table 2, the name column in Table 2 is the distinct column: the name column is scanned row by row, and a name is recorded if it has not appeared before. If the name appears again, it is not recorded again; that is, the name of each commodity in the name column is recorded only once, so that the total number of commodities contained in Table 2 can be counted.
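- The row-by-row deduplication described above can be sketched as follows. The row layout and the function name `distinct` are illustrative assumptions; empty values are represented here as `None`.

```python
def distinct(rows, column):
    """Return each non-empty value of `column` once, in first-seen order."""
    seen = set()
    unique_values = []
    for row in rows:
        value = row.get(column)
        if value is None:
            continue  # rows without data in the distinct column are skipped
        if value not in seen:
            seen.add(value)
            unique_values.append(value)
    return unique_values
```

- The number of commodities is then simply the length of the returned list.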
- The TopN operator represents the operation of maintaining the N largest values. Specifically, it maintains the current N largest values; when a new value comes in, if the new value is greater than the minimum of the current N largest values, the new value replaces that minimum.
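- One common way to maintain the N largest values is a min-heap of size N, sketched below. The use of a heap and the function name `top_n` are illustrative choices; this application does not prescribe a particular data structure.

```python
import heapq


def top_n(values, n):
    """Maintain the n largest values seen so far while streaming `values`."""
    if n <= 0:
        return []
    heap = []  # min-heap: heap[0] is the smallest of the current top n
    for value in values:
        if len(heap) < n:
            heapq.heappush(heap, value)
        elif value > heap[0]:
            # The new value beats the current minimum, so it replaces it.
            heapq.heapreplace(heap, value)
    return sorted(heap, reverse=True)
```

- With this structure each incoming value is compared only against the smallest retained value, which matches the replacement rule described above.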
- a task may include executing one or more operations of an SQL statement, that is, a task may be performed using one or more operators.
- The execution plan of the above SQL statement includes: (1) scan Table 2 and read all rows in Table 2; (2) scan Table 1 and read all rows in Table 1; (3) filter out the rows in Table 1 whose date is 2020/01/02; (4) among those rows, filter out the rows whose value in the ID column is the same as an ID value in Table 2, and, according to the same ID value, combine the rows in Table 1 whose date is 2020/01/02 with the name column in Table 2; (5) based on the combined row data, group by name to obtain multiple groups of commodities, and calculate the total sales of each group of commodities separately.
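- The five steps of the execution plan can be traced on a toy data set as follows. This is a sketch only: the rows below are invented for illustration and are not the contents of Table 1 or Table 2, and the column names `id`, `date`, `sale`, and `name` as well as the function name `run_plan` are assumptions.

```python
from collections import defaultdict

# Hypothetical stand-ins for Table 2 (dimTbl) and Table 1 (factTbl).
dim_tbl = [{"id": 1, "name": "cup"}, {"id": 2, "name": "pen"}]
fact_tbl = [
    {"id": 1, "sale": 10, "date": "2020/01/02"},
    {"id": 2, "sale": 4, "date": "2020/01/02"},
    {"id": 1, "sale": 6, "date": "2020/01/02"},
    {"id": 2, "sale": 8, "date": "2020/01/01"},
    {"id": 3, "sale": 5, "date": "2020/01/02"},
]


def run_plan(fact_rows, dim_rows, date):
    # (1) and (2): scan both tables (here: iterate over the lists).
    # (3): keep only the fact rows with the requested date.
    filtered = [row for row in fact_rows if row["date"] == date]
    # (4): keep rows whose id exists in the dimension table and attach the name.
    names = {row["id"]: row["name"] for row in dim_rows}
    joined = [
        {"name": names[row["id"]], "sale": row["sale"]}
        for row in filtered
        if row["id"] in names
    ]
    # (5): group by name and sum the sales of each group.
    totals = defaultdict(int)
    for row in joined:
        totals[row["name"]] += row["sale"]
    return dict(totals)
```

- On this toy data, `run_plan(fact_tbl, dim_tbl, "2020/01/02")` yields a total of 16 for `cup` and 4 for `pen`; the row with ID 3 is dropped in step (4) because it has no match in the dimension table.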
- the execution plan can be divided into multiple stages, and a stage can include one or more tasks.
- stages can be divided according to whether nodes need to interact, and tasks contained in the same stage do not need to depend on the results of other nodes.
- FIG. 7 is a schematic diagram of an execution plan generated by parsing the above SQL statement.
- the execution plan shown in FIG. 7 can be generated. Specifically, the user can input the query statement in the query command input area (not shown in FIG. 7 ) of the data query interface of the client. As shown in FIG. 7, the execution plan can be displayed to the user through the display interface of the central node or the client.
- The execution plan can be displayed to the user directly after the user inputs the query instruction; alternatively, when the user needs to view the execution plan, the user can input an instruction to display the execution plan, and the execution plan is then displayed to the user through the interface.
- the execution plan will also display the execution device of each task, such as a network card, a working node, a router, or a switch.
- The execution device may be displayed simultaneously with the task, or the execution device may be displayed when the user clicks on the task.
- the interface for the user to input the query instruction and the interface for displaying the execution plan may be the same interface or different interfaces.
- the execution plan includes stage1, stage2, stage3, and stage4.
- stage3 and stage4 are in a parallel relationship and can be executed synchronously;
- stage2 is the next stage of stage3 (or stage4), correspondingly, stage3 (or stage4) is the previous stage of stage2;
- stage1 is the next stage of stage2; correspondingly, stage2 is the previous stage of stage1, and so on.
- the execution plan based on the above SQL statement can be split into the following tasks:
- Task 1: Scan Table 1 and filter out the rows in Table 1 whose date is 2020/01/02. Task 1 can be completed using the tablescan operator and the Filter operator: the tablescan operator performs the scan operation, the Filter operator performs the filtering operation, and Table 1 is the data to be operated.
- Task 2: Read Table 2. Task 2 can be completed using the tablescan operator.
- Task 3: Combine Table 1 and Table 2. Specifically, task 3 filters out, from the rows in Table 1 whose date is 2020/01/02, the rows whose value in the ID column is the same as an ID value in Table 2, and combines them with the name column in Table 2 according to the same ID value. Task 3 can be completed using the Join operator.
- Task 4: Grouping. Based on the results obtained in task 3, group by product name. Task 4 can be completed using the group by operator.
- Task 5: Partial aggregation. Based on the grouping results of task 4, sum the sales of each group of products to obtain the total sales of each group of products. Task 5 can be completed using the aggregation operator. It should be understood that each worker node is assigned to process one or more shards of Table 1 (shards are described in detail below), that is, each worker node that executes task 5 summarizes the sales of a group of products based on only part of the data in Table 1, so task 5 can also be understood as a partial aggregation.
- Task 6: Final aggregation, which determines the final query result based on all partial aggregation results. The data to be operated is the result of executing task 5 on each worker node to which task 5 is assigned, the operation performed is summation, and the operation rule is to sum, based on those results, the sales of the same product to obtain the final query result.
- The logical relationship between the tasks is as follows: task 3 is the next task of task 1 and task 2, and task 1 and task 2 are each a previous task of task 3; task 4 is the next task of task 3, and correspondingly, task 3 is the previous task of task 4. That is, the output data of task 1 and task 2 is the input data of task 3, the output data of task 3 is the input data of task 4, and so on.
- the node that executes the next task is the next-level node of this node, for example, the node that executes task 3 is the next-level node of the node that executes task 1, and so on.
- different tasks have different task identifiers (request IDs).
- the task ID is used to uniquely identify a task.
- the task ID of each task is different.
- the input data is the data to be calculated for the task, which can be identified according to the task identifier.
- the data including the task identifier of the task is the input data.
- the result of executing the task based on the input data is the output data of the task, and the task identifier of the output data is the task identifier of the next task.
- the data packet carrying the output data also carries the task identifier of the next task.
- the same task can be executed by one or more operators.
- the arrangement order of the operators in the task indicates the execution sequence of the operators.
- The execution device of the task executes the corresponding operations according to the order of the operators. The task identifier is not needed to pass execution results between operators: the execution device can directly process the execution result of the current operator with the next operator. For example, in task 1, the tablescan operator is followed by the filter operator.
- the execution device of task 1 first reads Table 1, and then uses the filter operator to filter Table 1 based on the filter conditions of the filter operator.
- the execution result of the last operator in the task is the output data of the task.
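- The relationship between task identifiers, operator order, and output data can be sketched as follows. The `Task` structure, the packet layout (a dictionary with `request_id` and `rows`), and the function name `execute_task` are assumptions made for illustration; they are not the data format of this application.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    request_id: int                        # task identifier of this task
    operators: List[Callable]              # executed in list order
    next_request_id: Optional[int] = None  # task identifier of the next task


def execute_task(task: Task, packet: dict) -> dict:
    """Run the task's operators in order and tag the output for the next task."""
    # Input data is recognized by the task identifier it carries.
    if packet["request_id"] != task.request_id:
        raise ValueError("packet is not input data of this task")
    data = packet["rows"]
    # Results pass directly from one operator to the next; no task
    # identifier is needed between operators of the same task.
    for operator in task.operators:
        data = operator(data)
    # The output of the last operator is the output data of the task and
    # carries the task identifier of the next task.
    return {"request_id": task.next_request_id, "rows": data}
```

- For example, a task 1 built from a scan function followed by a date filter would emit a packet tagged with the identifier of task 3, which the next-level node then recognizes as its input data.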
- Step 603 the central node determines an offloadable task in the one or more tasks, and determines an execution device of the offloadable task.
- To realize parallel processing, the central node subsequently generates a task scheduling plan, allocates the above tasks to multiple execution devices, and has the multiple execution devices execute one or more tasks in parallel. For example, task 1 is assigned to multiple worker nodes, and each worker node reads one or more shards of Table 1. Sharding refers to dividing the data to be queried into shards of equal size; for example, Table 1 includes 10,000 rows and every 2,000 consecutive rows form one shard, so Table 1 can be divided into 5 shards. Such parallel processing improves the efficiency of task execution.
- Tasks include offloadable tasks and non-offloadable tasks, and different types of tasks may have different execution devices.
- Non-offloadable tasks can be processed by a worker node, and offloadable tasks can be offloaded to a network device for processing.
- For example, the network device is a network card of the worker node, which reduces the workload of the worker node, that is, the amount of computation and the burden on the CPU of the worker node.
- the offloadable task in this embodiment may be a task including an offloadable operator, and the offloadable operator may be preset or predetermined by a protocol.
- The offloadable operators include but are not limited to: the tablescan operator, the filter operator, the join operator, the aggregation operator, the TopN operator, the distinct operator, and the like. It should be noted that the above-mentioned offloadable operators are only examples, and the embodiment of the present application does not limit the type or quantity of the offloadable operators.
- If a task contains multiple operators and some of the operators are not offloadable operators, the task can be defined as a non-offloadable task. In practical applications, the operations that need to be performed using non-offloadable operators can be defined as a separate task. That is to say, the operators involved in an offloadable task in this embodiment are all offloadable operators.
- For example, the operators used by task 1 are all offloadable operators. Therefore, task 1 is an offloadable task, and the execution device of task 1 may be a network card of a worker node.
- The operator used by task 4 is a non-offloadable operator, so task 4 is a non-offloadable task, and the execution device of task 4 can be the worker node itself.
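- The classification rule above (a task is offloadable only if all of its operators are offloadable) can be sketched as follows. The operator names as strings, the set `OFFLOADABLE_OPERATORS`, and the function names are illustrative assumptions; the actual set of offloadable operators is preset or defined by a protocol and is not limited to this list.

```python
# Example preset of offloadable operators, taken from the examples above.
OFFLOADABLE_OPERATORS = {
    "tablescan", "filter", "join", "aggregation", "topn", "distinct",
}


def is_offloadable(task_operators) -> bool:
    """A task is offloadable only if every operator it uses is offloadable."""
    operators = [op.lower() for op in task_operators]
    return bool(operators) and all(op in OFFLOADABLE_OPERATORS for op in operators)


def execution_device(task_operators) -> str:
    """Pick the execution device for a task from its operator list."""
    return "network card" if is_offloadable(task_operators) else "worker node"
```

- For example, task 1 (tablescan followed by Filter) maps to the network card, while task 4 (group by) stays on the worker node.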
- The central node obtains the storage information of Table 1 from the data source, for example, on which storage nodes Table 1 is stored and which part of Table 1 is stored in each storage node.
- The central node generates the fragmentation information of Table 1 based on the storage information, and the fragmentation information of each shard includes information such as the IP address of the storage node where the shard is located and the storage location.
- For example, Table 1 contains 10,000 rows, of which rows 1 to 4,000 are stored on storage node 1, rows 4,001 to 8,000 are stored on storage node 2, and rows 8,001 to 10,000 are stored on storage node 3. Table 1 can be divided into 5 shards, for example, shard 1 to shard 5.
- The shard information of shard 1 includes but is not limited to some or all of the following: the identifier of shard 1, the IP address of storage node 1, and the storage location (the address space of storage node 1 in which rows 1 to 2,000 of Table 1 are stored, which can be expressed, for example, as the first address of the address space and the length of rows 1 to 2,000). The shard information of shard 2 includes but is not limited to some or all of the following: the identifier of shard 2, the IP address of storage node 1, and the storage location (the address space of storage node 1 in which rows 2,001 to 4,000 of Table 1 are stored, which can be expressed, for example, as the first address of the address space and the length of rows 2,001 to 4,000). The same applies to the remaining shards, which are not described here one by one.
- Table 1 may also be stored on only one storage node, which is not limited in this embodiment of the present application.
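- Generating shard information from the storage information can be sketched as follows. This is a sketch under stated assumptions: the storage layout, the IP addresses (taken from a documentation address range), the use of a row number and a row count in place of a first address and a length, and the function name `build_shard_info` are all illustrative and are not defined by this application.

```python
def build_shard_info(storage_layout, rows_per_shard):
    """Split each storage node's row range into fixed-size shards.

    storage_layout: list of (node_ip, first_row, last_row), rows 1-indexed.
    """
    shards = []
    shard_id = 1
    for node_ip, first_row, last_row in storage_layout:
        start = first_row
        while start <= last_row:
            end = min(start + rows_per_shard - 1, last_row)
            shards.append({
                "shard_id": shard_id,
                "node_ip": node_ip,          # storage node holding the shard
                "first_row": start,          # stands in for the first address
                "num_rows": end - start + 1, # stands in for the length
            })
            shard_id += 1
            start = end + 1
    return shards


# Hypothetical layout: 10,000 rows spread over three storage nodes.
layout = [
    ("192.0.2.1", 1, 4000),
    ("192.0.2.2", 4001, 8000),
    ("192.0.2.3", 8001, 10000),
]
```

- With 2,000 rows per shard, `build_shard_info(layout, 2000)` produces five shards, the first two of which are located on the first storage node.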
- the central node generates a task scheduling plan based on the information of multiple tasks (or the above-mentioned execution plan) included in the SQL, the information of the worker nodes, the fragmentation information, and the like.
- The task information includes a task identifier, or the task information includes a task identifier, an offload flag, and the like.
- The offload flag is used to indicate whether the task corresponding to the task identifier carried in the first setting instruction is an offloadable task.
- For example, the offload flag may be 1 bit: a bit value of 1 indicates that the task is offloadable, and a bit value of 0 indicates that the task is not offloadable.
- In another example, the offload flag is a fixed value; an offloadable task carries the offload flag, and a non-offloadable task does not.
- the information of the working node includes information such as the number of working nodes, addresses (such as IP addresses, ports), and the identification of the working node; wherein, the identification of the working node can be globally unique.
- Global uniqueness means that the working node indicated by the identifier is unique in the query system, and that each worker node as well as the central node knows the meaning of the identifier.
- the identifier may be the IP address, device identifier or device name of the worker node, or a unique identifier generated by the central node for each worker node in the query system, or the like.
- the information of the switch includes the address of the switch (for example, IP address, port, etc.), whether it has the ability to process offloadable tasks, the identifier of the switch, and other information.
- The task scheduling plan includes some or all of the following: a task identifier, an offload flag, an identifier of the worker node to which the task is assigned, fragmentation information corresponding to the task, and the like.
- Table 3 is a specific example of a task scheduling plan provided by the embodiment of the present application for the above-mentioned SQL statement.
- Here, the case in which all tasks are allocated to the worker nodes for processing is taken as an example. Assume that worker node 20a is assigned to read the complete Table 2; subsequently, worker node 20a broadcasts Table 2 to every other worker node.
- The central node assigns tasks 1 to 4 to each of the worker nodes 20a to 20e, and assigns worker node 20a to read shard 1, worker node 20b to read shard 2, worker node 20c to read shard 3, worker node 20d to read shard 4, and worker node 20e to read shard 5.
- In this way, when each worker node executes task 1, the worker nodes can read some rows of Table 1 in parallel without interfering with each other and execute subsequent tasks based on the data read, until task 5 yields the partial aggregation results; finally, the node that executes task 6 summarizes the partial aggregation results of the worker nodes 20a to 20e to obtain the final query result.
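- The split between partial aggregation (task 5) on each worker node and final aggregation (task 6) can be sketched as follows. The use of a thread pool to stand in for parallel worker nodes, the row layout, and the function names are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(shard_rows):
    """Task 5 on one worker: sum sales per name over that worker's shard only."""
    totals = Counter()
    for row in shard_rows:
        totals[row["name"]] += row["sale"]
    return totals


def final_aggregate(partial_results):
    """Task 6: add up the partial sums that belong to the same name."""
    result = Counter()
    for partial in partial_results:
        result.update(partial)  # Counter.update adds counts per key
    return dict(result)


def run_query(shards):
    """Process each shard in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(partial_aggregate, shards))
    return final_aggregate(partials)
```

- Because summation is associative, summing the per-shard totals gives the same result as summing over the whole table, which is what allows task 5 to run independently on each worker node.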
- the user can also check the execution progress of the query request at any time.
- For example, the specific execution status of the current task can be displayed, including information such as which nodes the task is assigned to, whether it is an offloadable task, the execution device of the offloadable task, the information of the node or execution device (assuming that the IP addresses of the worker nodes 20a to 20e are 76.75.70.14 to 76.75.70.18, respectively), and the execution status (for example, not started, in execution, or execution completed).
- the interfaces shown in FIG. 7 and FIG. 8 are only a schematic representation, and may also be displayed in other manners, which are not limited in this embodiment of the present application.
- Step 604 the central node sends a first setting instruction of the offloadable task to the working node, and the network card of the working node is set as the execution device of the offloadable task.
- the central node sends the first setting instructions of each task to the working nodes set to process the task respectively.
- the central node may use the task as a granularity, generate a first setting instruction for each worker node based on the task scheduling plan, and send the first setting instruction of the offloadable task to the worker node.
- the working node receives the first setting instruction sent by the central node.
- The first setting instruction includes, but is not limited to, some or all of the following information: a task identifier, an offload flag, and operator information. Table 4 below shows an example format of a first setting instruction provided in this embodiment.
- The operator information includes but is not limited to some or all of the following: an operator identifier, the execution rule of the operator, the input information of the operator, and the output information of the operator.
- The input information of the operator is used to indicate the input data required to perform the task and the information of the node where the input data is located, such as address information, storage information, and a table name. For example, the fragmentation information of Table 1 above is the input information of task 1.
- The output information of the operator includes the task identifier of the next task of the task corresponding to the Request ID, the information of the next-level node, and the like.
- The first setting instruction of a non-offloadable task may be similar to that of an offloadable task. The difference is that the offload flag of an offloadable task indicates that the task is offloadable and the offload flag of a non-offloadable task indicates that the task is non-offloadable; alternatively, only the first setting instruction of an offloadable task carries the offload flag and the first setting instruction of a non-offloadable task does not, so as to distinguish the two.
- Table 5 lists specific examples of the first setting instructions of each task sent to the worker node 20a. For the execution rules of each operator, refer to the above introduction; they are not repeated in Table 5. It is assumed that a Flags value of 1 indicates that the task is offloadable and a Flags value of 0 indicates that the task is not offloadable.
- the format of the above-mentioned first setting instruction is only an example.
- the first setting instruction may contain more or less information than Table 5, which is not limited in this embodiment of the present application.
- The central node may also not determine whether a task is an offloadable task; instead, this is determined by each worker node according to the preset operators, and correspondingly, the first setting instruction may not include the offload flag.
- The first setting instruction may further include padding data (Magic bytes), and the padding data may be data of known bits, such as 0 or 1, so that the length of the first setting instruction is a preset length.
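- Packing a first setting instruction to a preset length can be sketched as follows. This is purely a hypothetical layout: the field widths (a 4-byte Request ID, a 1-byte Flags field, a 2-byte payload length), the preset length of 64 bytes, the zero padding, and the function names are assumptions and do not reproduce the format of Table 4.

```python
import struct

PRESET_LENGTH = 64               # hypothetical fixed instruction length in bytes
HEADER = struct.Struct("!IBH")   # assumed: Request ID, Flags, payload length


def pack_first_setting_instruction(request_id, offloadable, operator_info: bytes):
    """Serialize the instruction and pad it with known bits (zeros)."""
    flags = 1 if offloadable else 0
    body = HEADER.pack(request_id, flags, len(operator_info)) + operator_info
    if len(body) > PRESET_LENGTH:
        raise ValueError("operator info too long for the preset length")
    padding = b"\x00" * (PRESET_LENGTH - len(body))
    return body + padding


def unpack_first_setting_instruction(message: bytes):
    """Recover the task identifier, offload flag, and operator info."""
    request_id, flags, length = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    return request_id, bool(flags & 1), payload
```

- Carrying an explicit payload length in this sketch lets the receiver separate the operator information from the padding data; other designs are possible.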
- step 605 the worker node determines whether the received task is an offloadable task, and if so, executes step 606, otherwise, the task is processed by the worker node.
- the working node can judge whether the task is an unloadable task according to the unloading flag carried in the first setting instruction, and if so, then The information about the offloadable task is set in the network card, and the offloadable task is subsequently processed by the network card.
- the worker node may also distinguish between offloadable tasks and non-offloadable tasks according to whether they carry the offload flag or not.
- The above assumes that the central node identifies the offloadable tasks. If the central node does not identify offloadable tasks as described above, the first setting instruction does not carry the offload flag, whether the task is offloadable or non-offloadable.
- the worker node may identify the offloadable task according to the preset offloadable operator, which is not limited in this embodiment of the present application.
- Step 606 the worker node offloads the offloadable task to the network card of the node, that is, sets the information of the offloadable task in the network card.
- When the worker node sets the information of the offloadable task in the network card, it can send the second setting instruction of the task to the network card, and the network card obtains and records the information of the offloadable task according to the second setting instruction.
- the second setting instruction may include a header and a data portion, wherein the header may include a control instruction and a task identifier, and the data portion may include operator information of the task.
- Table 6 shows a format of the second setting instruction provided in this embodiment.
- Command is used to indicate the type of command, or to indicate what operation to perform.
- Command can be, but is not limited to, one of the following types: offload command (init command), read command (read command), end command (end command).
- The offload command indicates that the task corresponding to the Request ID is to be offloaded.
- The execution command instructs the execution device to start the tablescan task and read the input data of the task corresponding to the Request ID.
- Reading the data to be queried is the starting point of executing SQL, so this command can be called a read command or an execution command.
- The end command instructs the execution device to release the resources used to process the task corresponding to the Request ID; it can also be understood as indicating that the task has ended and the resources allocated to the task can be released.
- A second setting instruction whose command is the init command is called the offload command.
- A second setting instruction whose command is the read command is called the execution instruction.
- A second setting instruction whose command is the end command is called the end command.
- the payload includes the operator information of the task. The operator information has been introduced before, and the description will not be repeated here.
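- As an illustration only, the header-plus-payload structure described above could be serialized as in the following minimal sketch. The field widths, the numeric command codes, the JSON payload encoding and the names `Command`, `HEADER`, `encode` and `decode` are all assumptions made for this example and are not specified by this embodiment.

```python
import json
import struct
from enum import IntEnum


class Command(IntEnum):
    # Numeric codes are hypothetical; the embodiment only names the commands.
    INIT = 1   # offload command
    READ = 2   # execution instruction (read command)
    END = 3    # end command


# Hypothetical header layout in network byte order:
# 1-byte command, 4-byte Request ID, 2-byte payload length.
HEADER = struct.Struct("!BIH")


def encode(command, request_id, operator_info=None):
    """Build a second setting instruction: header followed by payload."""
    payload = json.dumps(operator_info or {}).encode("utf-8")
    return HEADER.pack(command, request_id, len(payload)) + payload


def decode(message):
    """Split a second setting instruction into command, Request ID, payload."""
    command, request_id, length = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    return Command(command), request_id, json.loads(payload)
```

- In this sketch, an offload command for task 1 would be `encode(Command.INIT, 1, {...operator information...})`, and the receiver inspects the decoded command first to decide how to treat the payload.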
- the information flow for setting the offloadable task on the network card of the worker node can include:
- After determining that task 1 is an offloadable task, the worker node 20a sends an offload command of task 1 to the network card of the worker node 20a, as shown in Table 7 below.
- After receiving the offload command, the network card first checks the packet header of the offload command. If the command is an init command, the network card determines that task 1 (Request ID 1) is an offloadable task, and allocates (or reserves) network card resources, which are configured to handle task 1.
- Task ID is 1
- Network card resources are described below. As shown in FIG. 9, in this embodiment, the network card resources (for example, computing units) and memory resources used to process offloadable tasks may be divided into multiple parts, each of which may be called a processing engine (PE). A PE can be configured to handle one offloadable task.
- Setting the information of the offloadable task in the network card may include the following process: after the network card receives the offload command of task 1, if there is an idle PE, the network card assigns task 1 to an idle PE, and that PE records the Request ID and operator information of task 1. Correspondingly, the network card records a first correspondence between the PE and the offloadable task handled by the PE, to record which task the PE is assigned to perform. Specifically, the first correspondence includes the PE identifier and the Request ID of the offloadable task.
- Subsequently, after receiving a data packet carrying a Request ID, the network card can determine the PE corresponding to the Request ID according to the first correspondence, and route the data packet to that PE for processing. For example, when task 1 is executed, the network card of the worker node 20a sends a read request for reading fragment 1 to the storage node corresponding to fragment 1. The read request carries the Request ID of task 1. The storage node sends feedback data packets of task 1 to the network card, and these feedback data packets also carry the Request ID of task 1. The network card can therefore determine, according to the first correspondence, the PE corresponding to a data packet returned by the storage node and route the data packet to that PE, and the PE processes the data packet with the corresponding operator according to the recorded operator information.
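- The PE allocation and the first correspondence described above can be pictured with the following minimal sketch. The class name `NicPeTable`, its methods and the in-memory dictionaries are hypothetical; a real network card would hold this state in hardware tables.

```python
class NicPeTable:
    """Tracks which PE handles which offloaded task (hypothetical model)."""

    def __init__(self, num_pes):
        self.idle = list(range(num_pes))   # identifiers of idle PEs
        self.pe_by_request = {}            # first correspondence: Request ID -> PE id
        self.operators = {}                # PE id -> recorded operator information

    def offload(self, request_id, operator_info):
        """Handle an offload command; return the PE id, or None if no PE is idle."""
        if not self.idle:
            return None
        pe = self.idle.pop(0)
        self.pe_by_request[request_id] = pe
        self.operators[pe] = operator_info
        return pe

    def route(self, request_id):
        """Return the PE that should process a packet, or None if not offloaded."""
        return self.pe_by_request.get(request_id)

    def release(self, request_id):
        """Handle an end command: free the PE so it can serve another task."""
        pe = self.pe_by_request.pop(request_id)
        del self.operators[pe]
        self.idle.append(pe)
```

- With two PEs, offloading tasks 1 and 2 in this sketch yields PE 0 and PE 1, a packet carrying Request ID 1 is routed to PE 0, and a packet whose Request ID is unknown gets no PE and would be passed to the worker node.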
- a circular queue may also be set in the network card.
- the number of offloadable tasks that can be placed in the circular queue may be equal to the number of PEs.
- When an offloadable task arrives and the circular queue is not full, the offloadable task is put into the circular queue and an idle PE is allocated to it. When the circular queue is full, the network card sends a response to the device that sent the offload command. The response indicates that the network card cannot process the offloadable task, and may also include the reason, for example, that the network card does not have the resources to process the offloadable task.
- For example, if the circular queue is full when the network card receives the offload command of task 1, the network card determines that there is no idle PE to process the task and sends a response to the processor of the worker node 20a, indicating that the network card cannot execute task 1. The worker node 20a can then execute task 1 itself, so as to reduce delay and improve the processing speed of the task.
- Alternatively, the network card places every offloadable task it receives in the circular queue, so that all offloadable tasks can be queued. If the number of offloadable tasks exceeds the number of PEs, then whenever a PE becomes idle, an offloadable task in the circular queue that has not yet been allocated a PE is selected and allocated the idle PE.
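- The first of the two queueing variants above (reject when full, fall back to the worker node's processor) can be sketched as follows. A bounded `deque` stands in for the circular queue here, and the names `OffloadQueue` and `place_task` as well as the response format are assumptions made for this example.

```python
from collections import deque


class OffloadQueue:
    """Bounded queue standing in for the network card's circular queue."""

    def __init__(self, num_pes):
        self.capacity = num_pes   # number of queue slots equals the number of PEs
        self.tasks = deque()

    def submit(self, request_id):
        """Try to accept an offload command; report a reason on rejection."""
        if len(self.tasks) >= self.capacity:
            return {"accepted": False, "reason": "no idle PE"}
        self.tasks.append(request_id)
        return {"accepted": True}

    def finish(self, request_id):
        """Remove a finished task so that its slot can be reused."""
        self.tasks.remove(request_id)


def place_task(queue, request_id):
    """Decide where the task runs: on the network card or on the worker CPU."""
    response = queue.submit(request_id)
    return "nic" if response["accepted"] else "cpu"
```

- The second variant, in which every task is queued and waits for an idle PE, would instead use an unbounded queue and allocate a PE each time one is released.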
- Step 607a the central node sends an execution instruction to the network card of the working node, and correspondingly, the network card of the working node receives the execution instruction.
- certain tasks can be executed only after receiving execution instructions, for example, tasks that need to use the tablescan operator, that is, task 1 and task 2 in the above example.
- the central node can send an execution instruction of the task to trigger the execution device to execute the task.
- the execution instruction here may be the above-mentioned read command, and the central node sends the read commands of task 1 and task 2 to the working nodes.
- a read command can carry the Request IDs of multiple tasks.
- For example, one read command carries both the Request ID of task 1 and the Request ID of task 2. That is, task 1 and task 2 can share the same read command.
- Alternatively, the read command of each task is independent.
- the read command of task 1 only carries the Request ID of task 1.
- the read command of task 2 only carries the Request ID of task 2. This embodiment of the present application does not limit this.
- Table 8 is only an example. If the offload command of task 1 carries the fragment information of task 1, the read command of task 1 need not carry the fragment information again, which reduces the amount of data to be transmitted, avoids repeated transmission, and saves network resources. Alternatively, regardless of whether the offload command contains fragment information, the read command can carry fragment information, and the execution device treats the fragment information in the read command as authoritative, so that the fragment information can be adjusted dynamically and flexibly and the data hit rate improved.
- After the worker node finishes setting the task information according to the first setting instruction, it can send a completion response to the central node, and the central node sends the execution instruction after receiving the completion response.
- Alternatively, the central node may send the execution instruction directly, and the worker node starts to execute the corresponding task as soon as it finishes setting the task information according to the first setting instruction, so that subsequent tasks are executed automatically.
- Step 607b the network card of the working node receives the data of the working node or other nodes.
- other nodes here may be other working nodes, central nodes, storage nodes or forwarding devices other than this node.
- the network card receives the results of task 5 performed by other worker nodes.
- the step 607b is an optional step, not a mandatory step, and the step 607b and the step 607a are not strictly limited in timing.
- Step 608: the network card determines whether the received data belongs to a task offloaded to the network card. If so, step 609 is executed; otherwise, step 610 is executed.
- The network card monitors whether the received data is input data of an offloadable task that it has been set to process. If so, the network card performs the computation on the data; otherwise, it forwards the data to the worker node.
- The data here includes various setting instructions and data to be computed.
- For an execution instruction, the network card determines whether the instruction is for a task offloaded to the local network card. If so, the network card starts to execute the corresponding task; otherwise, the network card sends the execution instruction to the worker node, and the worker node starts to execute the corresponding task.
- The execution device of each task monitors the received data and judges whether the data belongs to a task offloaded to it. For example, if the data contains the Request ID of a task offloaded to the network card, the data belongs to that task. If the data is an execution instruction, the network card executes the execution instruction. If the data is input data of the task, the network card processes the data using the operator and execution rules of the task. Otherwise, the network card determines that the data does not belong to an offloaded task and sends the data to the worker node for processing.
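- The decision made in step 608 can be summarized by the following minimal sketch. The packet is modelled as a dictionary, and the field names `request_id` and `kind` and the returned action labels are hypothetical.

```python
def dispatch(packet, offloaded_request_ids):
    """Return the action the network card takes for a received packet."""
    # Data that does not belong to an offloaded task goes to the worker node (step 610).
    if packet.get("request_id") not in offloaded_request_ids:
        return "forward_to_worker"
    # An execution instruction for an offloaded task starts that task (step 609).
    if packet.get("kind") == "execute":
        return "start_task"
    # Anything else is treated as input data and processed with the task's operator.
    return "process_with_operator"
```

- For example, with tasks 1, 2, 3, 5 and 6 offloaded, a packet carrying Request ID 4 is forwarded to the worker node in this sketch, which matches the handling of the non-offloadable task 4.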
- Step 609 the network card executes the corresponding task, and returns the result to the next-level node of the execution plan or the execution device of the next task.
- Step 610 the network card sends the data to the working node.
- Step 611: the worker node executes the corresponding task according to the data, and returns the result to the execution device of the next task or to the next-level node of the execution plan.
- steps 608 to 611 may be cyclically executed steps until the final query result, that is, the result of task 6 is obtained.
- PE0 corresponds to task 1
- PE1 corresponds to task 2
- PE4 corresponds to task 6.
- Task 4 is processed by the worker node 20a, and the flow of the worker node 20a executing tasks 1 to 6 is described as follows:
- After receiving the execution instruction of task 1, the network card of the worker node 20a determines the PE corresponding to task 1, that is, PE0, according to the above-mentioned first correspondence, and routes the execution instruction of task 1 to PE0. PE0 executes task 1, that is, it sends a read request to the corresponding storage node according to the shard information of task 1 (shard 1).
- the read request may be a read request in an existing implementation mechanism or a read request in other formats.
- PE0 can forward the read request of task 1 to the storage node corresponding to shard 1.
- After receiving the read request, the storage node returns the data of shard 1 corresponding to task 1 to the network card. As mentioned above, the data returned by the storage node carries the same Request ID as the read request.
- the filtering result can be represented by the bitmap corresponding to the filtering column, the order of each bit in the bitmap corresponds to each row in the read slice, and the different bit values on the bits indicate whether the row satisfies the filtering condition. See Table 9 below, assuming that Table 9 is a part of data in shard 1 of Table 1.
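- The bitmap representation of a filtering result can be illustrated with the following minimal sketch. The helper names, the sample rows and the filter condition are hypothetical and do not reproduce the data of Table 9.

```python
def filter_bitmap(rows, column, predicate):
    """Return one bit per row: 1 if the row satisfies the filter condition."""
    return [1 if predicate(row[column]) else 0 for row in rows]


def apply_bitmap(rows, bitmap):
    """Keep only the rows whose bit is set, preserving row order."""
    return [row for row, bit in zip(rows, bitmap) if bit]


# Hypothetical rows of a shard and a hypothetical filter condition (age < 30).
rows = [{"id": 1, "age": 25}, {"id": 2, "age": 41}, {"id": 3, "age": 18}]
bitmap = filter_bitmap(rows, "age", lambda age: age < 30)
```

- Because the bit order follows the row order of the shard that was read, the bitmap alone is enough for a downstream task to recover which rows passed the filter.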
- The output data of task 1 carries the request ID of the next task of task 1, that is, request ID 3, together with the filtered data of task 1.
- the network card continues to determine the PE corresponding to request ID3, that is, PE2, according to the first correspondence, and routes the output data of task 1 to PE2.
- After the task is completed, the network card can send indication information to the worker node to indicate that execution of the task is complete. After receiving the indication information, the worker node can send an end command to the network card, and the network card releases the resources used to process the offloadable task, such as the corresponding network card resources (e.g. the PE) and memory resources.
- the worker node 20a sends the end command shown in Table 10 below to the network card.
- the network card receives the end command and releases the PE and memory resources for processing task 1.
- the freed resources can be used to handle other offloadable tasks that are offloaded to the NIC.
- The network card can also decide by itself when to release the PE. For example, when the storage node sends the last data packet of task 1 to the network card, that data packet carries an identifier indicating that it is the last data packet. After the network card determines that the PE has processed the last data packet of task 1, it releases the corresponding resources used for processing task 1. Similar cases below are not described again.
- The central node sends the execution instruction of task 2 to the network card of the worker node 20a, and the network card determines the PE corresponding to request ID 2, that is, PE1, according to the first correspondence. Based on the first setting instruction of task 2 shown in Table 5, PE1 can obtain the complete Table 2 according to the storage information of Table 2 (including the IP address of the storage node, the storage location and other information), and send the read Table 2 to the worker node 20b, the worker node 20c, the worker node 20d and the worker node 20e respectively.
- Table 2 is also routed to the PE corresponding to the next task of task 2, that is, PE2, which corresponds to task 3.
- PE2 executes task 3, that is, PE2 processes the output data of task 1 and task 2 based on the operator and execution rule corresponding to task 3, and obtains the output data of task 3.
- PE2 sends the output data of task 3 (carrying request ID 4) to the execution device of task 4. Since task 4 is a non-offloadable task and its execution device is the worker node itself, the network card can send the output data of task 3 to the processor of the worker node 20a for processing.
- The worker node 20a sends the output data of task 4 (carrying request ID 5) to the execution device of task 5. Specifically, the worker node 20a sends the output data of task 4 to the network card, the network card determines the PE corresponding to request ID 5, namely PE3, according to the first correspondence, and routes the output data of task 4 to PE3, and so on, until the worker node 20a obtains the final query result.
- Other worker nodes, such as the worker node 20b, also need to send the output data packets of task 5 to the network card of the worker node 20a.
- PE4 executes task 6 to obtain the final query result.
- The last data packet can also carry an end identifier, which indicates whether the data packet is the last data packet of the current request ID, so that the receiving end can determine whether data transmission on the sending worker node is complete.
- the format of the above-mentioned second setting instruction is only an example.
- the second setting instruction may contain more or less content than the above-mentioned examples, which are not limited in this embodiment of the present application.
- the second setting instruction may include padding data, so that the length of the second setting instruction is a preset length.
- The central node can also, according to the task scheduling plan, send the offload command of an offloadable task directly to each execution device set to execute the task, such as the network card of a worker node or a forwarding device.
- FIG. 10 is a schematic flowchart corresponding to another data query method provided in this embodiment.
- Forwarding devices in the network, such as switches and routers, can also be set as execution devices of offloadable tasks.
- The way of setting the offloadable task in the network card of the worker node is the same as in the embodiment shown in FIG. 6. That is, for steps 1001 to 1006 and steps 1013 to 1016 in this embodiment, refer to steps 601 to 606 and steps 608 to 611 in FIG. 6 respectively. These steps are not repeated here, and only the differences are described below.
- The following description takes a switch as an example of the forwarding device.
- This embodiment is also described by taking the SQL statement in the embodiment of FIG. 6 as an example.
- Table 11 shows another task scheduling plan generated for the SQL statement in the above step 602.
- The execution devices of the offloadable tasks in this task scheduling plan include switches.
- For example, the execution device of task 6 is the switch 40.

| Task ID | Offloadable task | Execution device | Storage information |
| --- | --- | --- | --- |
| task 1 | Yes | Switch 30 | NULL |
| task 2 | Yes | The network card of the worker node 20a | Storage information of Table 2 |
| task 3 | Yes | From the network card of the worker node 20a to the network card of the worker node 20e | NULL |
| task 4 | No | Worker node 20a to worker node 20e | NULL |
| task 5 | Yes | From the network card of the worker node 20a to the network card of the worker node 20e | NULL |
| task 6 | Yes | Switch 40 | NULL |

- Table 11 is only an example, and does not constitute a limitation on the task scheduling plan of the present application.
- When the execution device of the offloadable task determined in step 1003 is a switch, refer to the description of step 1007 and step 1008 for how the central node sets the offloadable task.
- Step 1007 The central node sends a setting instruction of the offloadable task to the switch, and the switch is set as an execution device of the offloadable task.
- The switch includes at least two kinds of ports, namely a data port and a control port. When the switch receives a data packet through the data port, the data packet needs to be forwarded, and the switch forwards it to the device corresponding to its destination IP address. When the switch receives a data packet through the control port, the data packet is one that the central node uses to configure the switch, for example a task setting instruction sent by the central node, and the switch configures itself according to the data packet.
- the central node will send the setting instruction of task 1 to the switch 30; the central node will send the setting instruction of task 6 to the switch 40.
- the central node may send the first setting instruction of the offloadable task to the control port of the switch, so as to instruct the switch to set the information of the offloadable task according to the first setting instruction.
- the central node may send the second setting instruction (unloading command) of the offloadable task to the control port of the switch, so as to instruct the switch to set the information of the offloadable task according to the offloading command.
- The following description takes the case where the setting instruction is the offload command as an example.
- Step 1008 The switch sets the information of offloadable tasks.
- When the switch 30 receives the offload command of task 1 through its control port, it records the information of task 1 (including the operator information of task 1, request ID 1, and the like) according to the offload command.
- Similarly, when the switch 40 receives the offload command of task 6 through its control port, it records the information of task 6 (including the operator information of task 6, request ID 6, and the like) according to the offload command.
- A switch that is set to process an offloadable task monitors whether the Request ID of each received data packet belongs to a task offloaded to the switch itself. If so, the switch processes the data; otherwise, the switch forwards the data to the destination IP address of the data.
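- The switch behaviour described in steps 1007 and 1008, together with the Request ID check above, can be modelled by the following minimal sketch. The class `OffloadSwitch`, the port labels and the packet fields are hypothetical names chosen for this example.

```python
class OffloadSwitch:
    """Toy model of a switch that can execute offloaded tasks."""

    def __init__(self):
        self.tasks = {}   # Request ID -> operator information of an offloaded task

    def receive(self, port, packet):
        """Return an (action, detail) pair for a packet arriving on a port."""
        if port == "control":
            # Packets on the control port configure the switch (offload command).
            self.tasks[packet["request_id"]] = packet["operator_info"]
            return ("configured", packet["request_id"])
        if packet.get("request_id") in self.tasks:
            # Data of an offloaded task is processed with the recorded operator.
            return ("process", self.tasks[packet["request_id"]])
        # All other data is forwarded to its destination IP address.
        return ("forward", packet["dst_ip"])
```

- In this sketch, once the offload command of task 1 has arrived on the control port, a data packet carrying request ID 1 is processed locally, while a packet carrying any other request ID is forwarded unchanged.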
- Step 1009a the central node sends execution instructions of task 1 and task 2.
- the execution instruction here can be the startup command above.
- the data packets sent by the central node, storage node or working node are first routed to the switches in their corresponding network segments. That is, the execution command sent by the central node will also be routed to the switch first.
- Step 1009b the switch receives data from other nodes.
- Other nodes can be worker nodes, central nodes, storage nodes, or other forwarding devices.
- Step 1010: the switch judges whether the received data belongs to a task offloaded to the switch. If so, step 1011 is executed; otherwise, step 1012a is executed.
- the data received by the switch includes setting instructions, execution instructions, fragmented data in Table 1 or Table 2 sent by storage nodes, or output data obtained by other working nodes performing tasks.
- For the specific operations of step 1010, refer to the operations of the network card in step 608, which are not repeated here.
- For example, the switch first receives the execution instructions of task 1 and task 2 sent by the central node and determines whether task 1 and task 2 are tasks offloaded to the switch. If not, the switch forwards the execution instructions of task 1 and task 2 to the devices corresponding to their respective destination IP addresses.
- step 1010 may also be a cyclically executed step until the query system obtains the final query result.
- Step 1011 the switch executes the task, and returns the execution result of the task to the next-level node in the execution plan.
- Taking task 1 as an example, first consider the following configuration: (worker node 20a, shard 1), (worker node 20b, shard 2), (worker node 20c, shard 3), (worker node 20d, shard 4), (worker node 20e, shard 5).
- The central node sends a first execution instruction of task 1 to the worker node 20a, instructing the worker node 20a to read the data of shard 1. Similarly, it sends a second execution instruction of task 1 to the worker node 20b, instructing the worker node 20b to read the data of shard 2, and so on.
- The worker node 20a receives the first execution instruction and sends a read request (carrying request ID 1) to the storage node corresponding to shard 1 according to the shard information of shard 1. With reference to FIG. 4, the transmission path of the read request is worker node 20a → switch 10 → switch 40 → switch 30 → storage node.
- the storage node responds to the read request and sends the feedback data packet of the fragment 1 (including the request ID 1), and the destination IP address of the feedback data packet is the working node 20a.
- The feedback data packet is first routed to the switch 30, and the switch 30 detects whether the feedback data packet is data of a task offloaded to the switch 30, that is, data of task 1. If so, the switch 30 executes task 1 on the feedback data packet; otherwise, the switch 30 sends the data packet to the destination IP address of the data packet.
- Here, the switch 30 determines that the data packet belongs to task 1. According to the operator information of task 1 and the execution rules of the filter operator, it uses the filter operator to perform a filtering operation on the data in the data packet and obtains the filtering result, that is, the output data of task 1.
- The switch 30 encapsulates the output data in an output data packet. As described above, the output data packet carries the request ID of the next task of task 1, namely request ID 3. According to the destination IP address carried in the data packet of fragment 1 received from the storage node, the switch 30 sends the output data packet to that destination IP address, that is, to the worker node 20a.
- the interaction between the switch 30 and any other working node will not be repeated here.
- the above method of setting operator information by executing instructions is only an example.
- The correspondence between each worker node and its fragment information can also be sent to the switch 30 through the setting instruction of task 1, such as the offload command, and the switch 30 then distributes the filtering results of each fragment of Table 1, achieving the same effect as the above example.
- Step 1012a the switch forwards the data to the network card of the corresponding working node.
- Step 1012b the network card of the working node receives the data of the working node.
- step 1012b is an optional step, not a mandatory step, and the step 1012b and the step 1012a are not strictly limited in timing.
- Subsequent steps S1013 to S1016 are the same as steps S608 to S611 in FIG. 6 , and are not repeated here.
- The switch 40 that executes task 6 is the next-level node of the worker nodes 20a to 20e, and the worker nodes 20a to 20e each send the output data of their respective task 5 (carrying request ID 6) to the switch 40. After receiving the data, the switch 40 determines that task 6 needs to be executed on the data, and processes the data according to the operator and execution rule corresponding to task 6 to obtain the final query result. The switch 40 then sends the query result to the central node, and the central node returns the query result to the client.
- the offloadable task is offloaded to the network device for processing, which reduces the burden on the processor of the working node, and further reduces the amount of data transmitted on the network.
- Although offloading task processing to the network card of the worker node or to a forwarding device can reduce the burden on the processor of the worker node and the amount of data transmitted on the network, it may also affect execution efficiency. Therefore, in another embodiment of the present invention, an offload policy can further be set for the offload tasks preset in the central node. Whether a task can be offloaded is judged according to the set offload policy, and only tasks that satisfy the offload policy are set as offloadable tasks.
- The data source can analyze the stored data table to obtain data distribution information of the data table. The data distribution information includes the total amount n of data in a certain column of the data table, and information indicating how the data of that column is distributed over different intervals. For example, the data distribution of the age column in a table can be the number of persons aged 1-10 (denoted as a), the number of persons aged 11-20 (denoted as b), and the number of persons aged 21-30 (denoted as c).
- The central node may request the data distribution information of the data table to be queried from the data source, and correspondingly, the storage node sends the requested data distribution information to the central node. Based on the data distribution information, the central node can roughly calculate the selectivity of a filter operator, the cardinality of the aggregation column of an aggregation operator, and so on. For example, if the execution rule of a filter operator selects persons aged 1-30, the selectivity of the filter operator is (a+b+c)/n. This data distribution is only a rough statistic, so the selectivity or aggregation-column cardinality derived from it is an estimate rather than an exact value. This is not repeated below.
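- The selectivity estimate (a+b+c)/n can be computed from interval counts as in the following minimal sketch. The function name, the histogram layout and the sample counts are assumptions made for illustration.

```python
def estimate_selectivity(histogram, total, low, high):
    """Estimate the fraction of rows whose value lies in [low, high].

    histogram maps (interval_start, interval_end) to a row count; only
    intervals that lie entirely inside [low, high] are counted.
    """
    matched = sum(
        count
        for (start, end), count in histogram.items()
        if start >= low and end <= high
    )
    return matched / total


# Hypothetical distribution of an age column with n = 200 rows.
age_histogram = {(1, 10): 30, (11, 20): 50, (21, 30): 20, (31, 40): 100}
selectivity = estimate_selectivity(age_histogram, 200, 1, 30)
```

- With these sample counts, a = 30, b = 50 and c = 20, so the estimate is (30 + 50 + 20) / 200. A filter range that cuts through an interval would need interpolation, which this sketch deliberately leaves out.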
- The offload policy set for the filter operator may be: when the selectivity of the filter operator is low, for example lower than a first preset value, the task to which the filter operator belongs is an offloadable task.
- The central node can offload the task to the network card of the storage node or to a switch for processing, which reduces the amount of data transmitted on the network. For example, when the storage node and the worker node are deployed on different devices, the central node can also offload task 1 to the network card of the storage node for processing.
- The storage node then only needs to send 10 rows of data to the worker node and does not need to send all of the read Table 1 to the worker node. The amount of data sent by the storage node to the worker node, that is, the amount of data transmitted in the network, is therefore relatively small, which reduces the burden on the CPU and also avoids occupying a large amount of network bandwidth.
- Alternatively, task 1 can be offloaded to the network card of the worker node, and the network card of the worker node performs the filtering. The CPU of the worker node is not required to perform the task, which reduces the burden on the CPU, and the network card does not need to send a large amount of data to the CPU, which also reduces data interaction within the worker node.
- The offload policy set for the aggregation operator can be: if the cardinality of the columns to be aggregated is relatively small, for example, it does not exceed a second preset value, the task to which the aggregation operator belongs is determined to be an offloadable task.
- For example, the sale column is the column that needs to be aggregated. If commodity A has 10 rows of sales records, the sale values of these 10 rows need to be aggregated, and the 10 rows can be understood as the cardinality of the columns that need to be aggregated. If the second preset value is 100, it can be determined that the task executed by using the aggregation operator is an offloadable task.
- An offloadable task of the aggregation operator type can be offloaded to the NIC of the worker node or to the switch for processing. For example, when the cardinality of the columns to be aggregated in the aggregation task is relatively small, the task can be offloaded to the switch for processing, which avoids occupying too many computing resources of the switch and also reduces the amount of data transmitted in the network.
- the offloading strategy of the distinct operator is similar to that of the aggregation operator. Specifically, the offloading strategy of the distinct operator is based on the cardinality of the distinct column. For example, if the cardinality of the distinct column does not exceed the third preset value, the task to which the distinct operator belongs is an offloadable task.
- the difference between the offloading strategy of the distinct operator and that of the aggregation operator is that the aggregation operator performs grouping based on the group by column and then performs operations such as sum, min, max, count, or avg on the grouped data, whereas the distinct operator only needs to group the distinct column.
- offloadable tasks of the distinct operator type can be offloaded to the NIC of the worker node, or can be offloaded to the switch for processing.
- the dynamic filter operator refers to the filtering of the large table by the small table when the two tables are combined in the join operator.
- the following takes task 3 as an example, and combines the two scenarios shown in FIG. 4 and FIG. 5 to introduce the execution flow of the dynamic filter operator of the join operator of task 3 in these two scenarios.
- the central node allocates a shard for reading Table 1 to the worker nodes.
- each worker node needs to know the value of the ID column in each row of Table 2, so the central node may leave Table 2 undivided and let each worker node read the complete Table 2 separately.
- the central node can also divide Table 2 into different shards, and let each worker node read one or more shards of Table 2. After that, each worker node can send the ID column of the shards it read to the central node, and the central node combines the ID columns returned by the worker nodes to obtain the complete ID column of Table 2.
- Scenario 1: the storage node and the worker node are integrated, that is, the storage node and the worker node are deployed on the same physical node, as shown in Figure 5.
- the dynamic filter operator is more suitable to be offloaded to the network card of the worker node, that is, the network card of the worker node is the execution device of the dynamic filter operator.
- the process of using the join operator to combine Table 1 and Table 2 includes the following steps:
- Each worker node assigned to task 2 reads a shard of Table 2 and creates a Bloom filter (BF) on the column in the ON condition of the data of that shard, obtaining the BF column.
- the BF column may contain duplicate ID values, or the ID column may be deduplicated by using the distinct operator, which is not limited in this embodiment of the present application.
- each of the worker nodes 20a to 20e reads one shard.
- the specific process may be as follows: worker node 20a reads shard 11, creates a BF on the column in the ON condition of the data read from shard 11, and sends the obtained BF to the central node 100; worker node 20b reads shard 12, creates a BF on the column in the ON condition of the data read from shard 12, and sends the obtained BF to the central node 100; worker node 20c reads shard 13, creates a BF on the column in the ON condition of the data read from shard 13, and sends the obtained BF to the central node 100; and so on.
- the central node merges the BFs sent by each working node, and the ID column of the complete table 2 is obtained after the merge. Specifically, the central node 100 receives the BF columns of the shards of each table 2 from the working node 20a to the working node 20e, and merges the BF columns of the shards to obtain the BF column of the complete table 2, that is, the ID column.
- the central node sends the obtained complete ID column of Table 2 to the network card of the working node.
- the central node can send the BF column of the complete Table 2 to each execution device of task 3. For example, if the execution devices of task 3 are the NICs of worker nodes 20a to 20e, the central node can send the complete ID column of Table 2 to the NICs of worker nodes 20a, 20b, 20c, 20d, and 20e.
- the central node 100 may send the BF column of the complete Table 2 to the one or more switches.
- the description will be given by taking the execution device of task 3 as a network card as an example.
- step 4) and step 1) to step 3) are parallel tasks, and the steps here do not represent a timing relationship.
- Scenario 2: storage nodes and worker nodes are deployed separately, that is, storage nodes and worker nodes are deployed on different physical nodes, as shown in Figure 4.
- the switch can filter the passing data to reduce the amount of data transmitted on the network, that is, the switch is the execution device for this dynamic filter operator.
- the storage node sends the data of task 2 read by the working node to the switch 30, and filters the data of task 1 read by the working node and sends it to the switch 30;
- the worker node uses the data filtered by the switch 30 to perform a join.
- the above steps are the dynamic filtering process of the join operator.
- the dynamic filtering in the join operator can be offloaded to the network card of the worker node, or to a switch on the path between the central node and the worker node.
- the offload strategy is to determine the selectivity when filtering large tables.
- the central node may also determine the execution device according to the preset priority of each device corresponding to the offloadable task
- the execution device may be the device with the highest priority among the devices corresponding to the offloadable task.
- the corresponding devices are prioritized as follows: NIC of the storage node, switch of the rack where the storage node is located, switch of the rack where the working node is located, core switch, NIC of the working node, etc. Then, based on this ranking, it can be determined that the network card of the storage node can be the execution device of task 1.
- the above method of determining the execution device is only an example, and it can also be determined in combination with the priority and the load status of the device.
- although the network card of the storage node has the highest priority, if its performance is relatively low or its load is relatively high, task 1 may not be offloaded to the network card of the storage node, and whether the next device can be used as the execution device is determined in priority order.
- the processing method of offloading the offloadable task to the network card of the storage node please refer to the specific process introduction of the worker node offloading the offloadable task to the local network card of the worker node above, and the description will not be repeated below.
- the worker node can also selectively offload some offloadable tasks to the network card based on the preset offloading strategy.
- the preset offloading strategy on the worker node can be formulated based on the load balancing principle and similar considerations, which are not elaborated here.
- the embodiment of the present application further provides a device for executing the function performed by the central node in FIG. 6 or FIG. 10 in the above method embodiment.
- the device includes a generating unit 1101, a processing unit 1102, and a communication unit 1103.
- the generating unit 1101 is configured to receive the query request sent by the client, and parse the query request input by the user into multiple tasks. For a specific implementation manner, please refer to the descriptions of steps 601 and 602 in FIG. 6 and steps 1001 and 1002 in FIG. 10 , which will not be repeated here.
- the processing unit 1102 is configured to determine an offloadable task among the multiple tasks, and determine an execution device of the offloadable task.
- the network device may be a network card of a working node or a forwarding device, and the forwarding device includes a switch and a router. Please refer to the description of step 603 in FIG. 6 and step 1003 in FIG. 10 for the specific method of determining the offloadable task and the execution device of the offloadable task, which will not be repeated here.
- the communication unit 1103 is also used to send the setting instruction of each task, so as to set the task executed by it on each execution device.
- the specific implementation can refer to the relevant description of step S604.
- FIG. 10 For example, please refer to the related descriptions of steps S1004 and S1007.
- the communication unit 1103 can send the execution command of the query request after setting the corresponding task on each execution device by sending the setting command of each task, and the execution command is used to trigger the execution device to execute the set task. Please refer to the descriptions in step S607a in FIG. 6 and step 1009a in FIG. 10 , which will not be repeated here.
- the embodiment of the present application further provides a network device for performing the functions performed by the network device (switch, router, or network card of the working node) in FIG. 6 or FIG. 10 in the above method embodiment, As shown in FIG. 12 , the device includes a communication unit 1201 and a processing unit 1202 .
- the communication unit 1201 is configured to receive a setting instruction sent by the central node, where the setting instruction is used to set the network device to perform a task when performing a query request on the network device.
- the network device is a network card
- the network device is a forwarding device
- the network device is a forwarding device
- step 1004 in FIG. 10 will not be repeated this time.
- the processing unit 1202 is configured to set the task according to the setting instruction, and execute the task on the data flowing through the network device.
- the network device is a network card
- the relevant descriptions of steps S608 and 609 for specific implementation
- the relevant descriptions of steps S1013 and 1014 for the embodiment of FIG. 10
- the network device is a forwarding device, for the embodiment of FIG. 10
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Computer And Data Communications (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
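This application provides a system, method, and apparatus for data query using a network device. A central node of the system is connected to worker nodes through a network device, for example a network card, a switch, or a router. The central node generates multiple tasks from a query request input by a user. When assigning execution devices to the tasks, the central node configures the network device as the execution device for some tasks and worker nodes as the execution device for others, and then sends configuration instructions to the configured network device and worker nodes so that the tasks configured for them are set on them. After configuration, because all data transmitted between the worker nodes and the central node passes through the network device, the network device performs the set tasks on the passing data, which reduces the computation load on the worker nodes, lightens the burden on their processors, and accelerates data processing.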
Description
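This application relates to the field of computer technologies, and in particular to a system, method, and apparatus for data query using a network device.
In today's information age, with the rapid development of computers and information technology, the volume of generated data is also growing rapidly. The amount of data stored in a database can reach hundreds of TB (terabytes; 1 TB = 1024 GB) or even tens to hundreds of PB (petabytes; 1 PB = 1024 TB). This data comes from many sources, in huge quantities and varied forms, so quickly finding target data in a database has become very important.
As data volume grows, the main current approach to maintaining query efficiency is to add hardware resources for data processing, for example by increasing the processing capability of the processor (such as a central processing unit (CPU)) and the memory capacity of each node in the query system. However, this raises product cost. Moreover, the room for growth in processor capability is limited, so query efficiency sometimes cannot be improved by strengthening the processor.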
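Summary
This application provides a data query method and apparatus to accelerate data processing without enhancing CPU performance and/or memory capacity.
In a first aspect, this application provides a data query system. The system includes a central node, worker nodes, and a network device, and the central node is connected to the worker nodes through the network device. The central node generates multiple tasks from a query request input by a user. When assigning execution devices to the tasks, the central node configures the network device as the execution device for some tasks and worker nodes as the execution device for others, and then sends configuration instructions to configure the corresponding tasks on the network device and the worker nodes. Once the network device and the worker nodes have set the tasks configured for them, the preconfigured task is performed on data as it flows through the network device.
With this design, the central node can assign some tasks to the network device. When data flows through the network device, the device performs its preconfigured task before forwarding the data to other execution devices. Compared with related-art solutions in which all tasks are processed by worker nodes, this reduces the computation load on the worker nodes and the burden on their processors, so data processing can be accelerated without increasing the processing capability of the worker nodes' processors.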
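In a possible implementation, after generating the multiple tasks, the central node is configured to search the tasks for an offloadable task and set the network device as the execution device of the offloadable task, where an offloadable task is a preset task to be offloaded to the network device for execution.
With this design, offloadable tasks suitable for the network device can be preconfigured, so they can be found quickly and conveniently among the multiple tasks.
In a possible implementation, the central node is configured to send the setting instruction of the offloadable task to the network device, and the network device is configured to set the offloadable task according to the setting instruction.
In a possible implementation, the network device is a network card of a worker node or a forwarding device; the forwarding device may be a switch or a router.
In a possible implementation, the forwarding device includes a data port and a control port. The central node is configured to send the setting instruction of the offloadable task to the forwarding device through the control port of the forwarding device, and to send setting instructions of tasks whose execution device is a network card or a worker node to the forwarding device through the data port. Correspondingly, when the forwarding device receives a setting instruction through the control port, it sets the offloadable task indicated in that instruction; when it receives a setting instruction through the data port, it forwards that instruction.
With this design, the forwarding device can use the data port to quickly identify packets that need forwarding and forward them to the corresponding device without parsing them, which reduces transmission latency. It can also use the control port to identify setting instructions that the central node sends to the forwarding device itself, which avoids mistaken forwarding and missed configuration.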
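In a possible implementation, when the network device that executes the offloadable task is the network card of a worker node, the central node is configured to send the setting instruction of the offloadable task to that worker node, and the worker node sets the offloadable task on its network card according to the setting instruction.
With this design, the worker node sets the offloadable task on the network card according to the setting instruction. When an offload policy is integrated on the worker node, the worker node can also decide whether to offload to the network card based on factors such as the actual load of the network card, so the way the worker node controls the network card to execute offloadable tasks is more flexible.
In a possible implementation, the setting instruction of the offloadable task includes an offloadable flag. After receiving the setting instruction, the worker node sets the offloadable task on its network card when it determines that the setting instruction includes the offloadable flag.
In a possible implementation, after receiving a data packet, the network device performs the offloadable task on the packet when it determines that the packet includes the identifier of an offloadable task executed by that network device.
With this design, the network device can use the identifier of the offloadable task to monitor the packets of that task and perform the offloadable task on them. No separate execution instruction is needed, and the offloadable task executed by the network device can be identified quickly and accurately, which saves overhead and accelerates data processing.
In a possible implementation, after determining the offloadable task, the central node is further configured to send the setting instruction of the offloadable task when it determines that the offloadable task complies with the offload policy corresponding to that task.
With this design, practical factors such as the network environment can be used to further determine whether the offloadable task is suitable for offloading to the network device, which further improves data query efficiency.
In a possible implementation, a task indicates an operation to be performed and an operand, where the operand is the data on which the operation is performed. The setting instruction of a task may include a task identifier and operator information. The task identifier uniquely identifies one task within one query request. The operator information includes an operator identifier, which uniquely identifies one kind of operator. An operation may be completed by running one or more operators, and an operator is run to perform, on the operand, the operation indicated by the task.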
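In a possible implementation, an offloadable task is a task for which all operators required to complete it are offloadable operators. The offloadable operators may be preset; for example, they include the filter operator, the aggregation operator, the distinct (non-null unique) operator, the TopN (top N values) operator, and the Join (table join) operator. Alternatively, an offloadable operator is a preset operator that satisfies its corresponding offload policy. For example, the preset operators and their offload policies include: the filter operator, whose policy is that the selectivity of executing the filter operator on the filter column is not lower than a preset threshold (for example, a first preset value); the aggregation operator, whose policy is that, when the operator is executed on the aggregation column, the cardinality of the aggregated data on that column does not exceed a second preset value; and the distinct operator, whose policy is that the cardinality of the data on the column to be deduplicated does not exceed a third preset value. The first, second, and third preset values may be all the same, partly the same, or all different.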
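The per-operator offload policies in this implementation can be illustrated with a minimal Python sketch. The threshold values, the `OperatorStats` structure, and the set of operators treated as offloadable without a policy check are assumptions made for illustration; the application only names a first, second, and third preset value and does not fix them.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical thresholds: the text only calls them the first, second,
# and third preset values and gives no concrete numbers for all of them.
FIRST_PRESET = 0.5    # minimum filter selectivity (assumed)
SECOND_PRESET = 100   # maximum cardinality of the aggregated column (assumed)
THIRD_PRESET = 100    # maximum cardinality of the distinct column (assumed)

# Operators assumed offloadable with no extra policy check in this sketch.
ALWAYS_OFFLOADABLE = {"tablescan", "topn", "join"}


@dataclass
class OperatorStats:
    name: str                 # "filter", "aggregation", "distinct", ...
    selectivity: float = 0.0  # only meaningful for filter
    cardinality: int = 0      # only meaningful for aggregation / distinct


def operator_is_offloadable(op: OperatorStats) -> bool:
    """Apply the offload policy that corresponds to the operator type."""
    if op.name == "filter":
        return op.selectivity >= FIRST_PRESET
    if op.name == "aggregation":
        return op.cardinality <= SECOND_PRESET
    if op.name == "distinct":
        return op.cardinality <= THIRD_PRESET
    return op.name in ALWAYS_OFFLOADABLE


def task_is_offloadable(operators: list[OperatorStats]) -> bool:
    """A task is offloadable only if every operator it needs is offloadable."""
    return all(operator_is_offloadable(op) for op in operators)
```

A task that mixes an offloadable filter with a non-offloadable operator such as group by is rejected as a whole, which mirrors the rule that all operators of an offloadable task must be offloadable.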
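In a second aspect, this application provides a data query method that can be applied to a central node connected to worker nodes through a network device. The method includes: the central node generates multiple tasks from a query request input by a user; when assigning execution devices to the tasks, the central node configures the network device as the execution device for some tasks and worker nodes as the execution device for others, and then sends setting instructions to configure the corresponding tasks on the network device and the worker nodes.
In a possible implementation, after a worker node or the network device finishes setting the task indicated by the setting instruction, it sends a feedback response to the central node indicating that configuration of the delivered task is complete. The central node can then send an execution instruction for the query request, which triggers the execution devices to execute the tasks that were set.
In a possible implementation, determining the execution device of each of the multiple tasks includes: after generating the multiple tasks, the central node searches them for an offloadable task and sets the network device as the execution device of the offloadable task, where an offloadable task is a preset task to be offloaded to the network device for execution.
In a possible implementation, when the execution device of the offloadable task is determined to be a network device, the central node sends the setting instruction of the offloadable task to that network device.
In a possible implementation, the network device may be a network card of a worker node or a forwarding device, and the forwarding device may be a switch or a router.
In a possible implementation, when the network device that executes the offloadable task is determined to be the network card of a worker node, the central node sends the setting instruction of the offloadable task to that worker node, and the worker node controls the setting of the offloadable task in the network card.
In a possible implementation, when the central node determines that the network device that executes the offloadable task is a forwarding device, it sends the setting instruction of the offloadable task to the forwarding device.
In a possible implementation, the setting instruction carries an offloadable flag.
In a possible implementation, the offloadable task may be preset. After the central node identifies an offloadable task among the multiple tasks, and when it determines that the task complies with the corresponding offload policy, it sends the setting instruction of the offloadable task to the network device.
In a possible implementation, for an offloadable task that can be offloaded to multiple devices, the central node may also determine the execution device according to the preset priority of each device corresponding to the offloadable task.
In a possible implementation, when determining the execution device of the offloadable task, the central node may also use both the preset priority of each device corresponding to the offloadable task and the load status of each device.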
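The priority-and-load selection described in the last two implementations can be sketched as follows. The priority order follows the example ranking given elsewhere in this document; the device names, the load representation (a fraction between 0 and 1), and the `max_load` threshold are hypothetical.

```python
from typing import Optional

# Example priority order from the description, highest priority first.
PRIORITY = [
    "storage_node_nic",
    "storage_rack_switch",
    "worker_rack_switch",
    "core_switch",
    "worker_node_nic",
]


def choose_execution_device(candidates, load, max_load=0.8) -> Optional[str]:
    """Return the highest-priority candidate whose load is acceptable.

    candidates: set of device names the task could be offloaded to.
    load: mapping of device name to current load in [0, 1]; missing means idle.
    max_load: assumed cut-off above which a device is skipped.
    """
    for device in PRIORITY:
        if device in candidates and load.get(device, 0.0) <= max_load:
            return device
    return None  # no suitable network device: the worker node runs the task
```

When the storage node's NIC is heavily loaded, the function falls through to the next device in priority order, which is the behaviour described for task 1.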
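For the beneficial effects of the second aspect, refer to the description of the beneficial effects of the method performed by the central node in the first aspect; details are not repeated here.
In a third aspect, this application provides a data query method that can be applied to a network device used to connect a central node and worker nodes. The method includes: the network device receives a setting instruction sent by the central node, sets the corresponding task according to the setting instruction, and performs the task on data packets flowing through the network device.
In a possible implementation, the network device may be a network card of a worker node or a forwarding device, for example a switch or a router.
In a possible implementation, the forwarding device includes a data port and a control port. The forwarding device may receive a setting instruction from the control port; data received through the control port is configured by the central node for the forwarding device itself, and the forwarding device sets the offloadable task according to the setting instruction received through the control port. The forwarding device may also receive a setting instruction from the data port; data received through the data port is configured by the central node for devices other than the forwarding device, and the forwarding device forwards the data received from the data port.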
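The control-port and data-port behaviour of the forwarding device can be modelled with a short Python sketch. The message fields (`request_id`, `operator_info`, `dst`) and the in-memory `forwarded` list standing in for real packet forwarding are assumptions for illustration only.

```python
class ForwardingDevice:
    """Toy model of a switch or router with one control port and one data port."""

    def __init__(self):
        self.offloaded_tasks = {}  # request_id -> operator info set on this device
        self.forwarded = []        # (destination, message) pairs; stand-in for forwarding

    def receive(self, port: str, message: dict) -> str:
        if port == "control":
            # Instructions arriving on the control port are meant for this device.
            self.offloaded_tasks[message["request_id"]] = message["operator_info"]
            return "configured"
        if port == "data":
            # Anything arriving on the data port is for another device: forward it
            # without inspecting the payload.
            self.forwarded.append((message["dst"], message))
            return "forwarded"
        raise ValueError(f"unknown port: {port}")
```

Because the decision depends only on the ingress port, the device never needs to parse a packet to learn whether a setting instruction is addressed to it.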
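In a possible implementation, after receiving a data packet, the network device performs the offloadable task based on the packet when it determines that the packet includes the identifier of an offloadable task executed by the network device.
For the beneficial effects of the third aspect, refer to the description of the beneficial effects of the method performed by the network device in the first aspect; details are not repeated here.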
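In a fourth aspect, an embodiment of this application further provides a data query interface, including a query command input area, a task display area, and an execution device display area.
The query command input area is used to receive a query request input by a user.
The task display area is used to display the multiple tasks that are generated from the query request and that execute the query request.
The execution device display area is used to display the execution device of each task, where the execution devices include worker nodes and network devices.
In a possible implementation, the query command input area, the task display area, and the execution device display area are displayed on the same interface.
In a possible implementation, the query command input area, the task display area, and the execution device display area are displayed on different interfaces.
In a fifth aspect, an embodiment of this application further provides a data query interaction method that can be applied to a central node acting as the server of a client. The method includes: a user inputs a query request on the client, and the client forwards the query request to the central node; correspondingly, the central node receives the query request and generates multiple tasks based on it. Further, the central node generates an execution plan for the query request that includes information about the execution device of each task. The central node may assign a task to a worker node or to a network device; that is, an execution device may be a worker node or a network device. The central node may display the execution plan of the multiple tasks locally, or send the execution plan to the client; correspondingly, after receiving the execution plan, the client can display it, including the multiple tasks and the execution device of each task.
In a possible implementation, the client displays the multiple tasks as a tree according to the execution plan.
In a possible implementation, the execution progress of the multiple tasks is displayed.
With this design, the user can see the execution plan and the query progress of the query request at a glance, which improves user engagement and experience.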
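In a sixth aspect, an embodiment of this application further provides a central device. The device includes multiple functional units that can perform the functions of the steps in the method of the second aspect. These functional units may be implemented in hardware or in software. In a possible design, the device includes a detection unit and a processing unit.
In a seventh aspect, an embodiment of this application further provides a network device. The device includes multiple functional units that can perform the functions of the steps in the method of the third aspect. These functional units may be implemented in hardware or in software. In a possible design, the device includes a detection unit and a processing unit.
In an eighth aspect, an embodiment of this application further provides a central device that includes a processor, a memory, and a transceiver. The memory stores program instructions, and the processor runs the program instructions in the memory and communicates with other devices through the transceiver to implement the method provided in the second aspect.
In a ninth aspect, an embodiment of this application further provides a network device that includes at least one processor and an interface circuit. The processor communicates with other apparatuses through the interface circuit to implement the method provided in the third aspect.
The processor may be a field programmable gate array (FPGA), a data processing unit (DPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a system on chip (SoC).
In a tenth aspect, this application further provides a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to perform the method provided in the second aspect or the method provided in the third aspect.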
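FIG. 1 is a schematic diagram of a system architecture according to an embodiment of this application;
FIG. 2 is a schematic diagram of a query system architecture according to an embodiment of this application;
FIG. 3 is a schematic diagram of the internal structure of a worker node according to an embodiment of this application;
FIG. 4 is a schematic diagram of a network architecture according to an embodiment of this application;
FIG. 5 is a schematic diagram of another network architecture according to an embodiment of this application;
FIG. 6 is a schematic diagram corresponding to a data query method according to an embodiment of this application;
FIG. 7 is a schematic diagram of an execution plan interface according to an embodiment of this application;
FIG. 8 is a schematic diagram of another execution plan interface according to an embodiment of this application;
FIG. 9 is a schematic diagram of network card resource allocation according to an embodiment of this application;
FIG. 10 is a schematic flowchart corresponding to another data query method according to an embodiment of this application;
FIG. 11 is a schematic structural diagram of a device according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of an apparatus of a network device according to an embodiment of this application.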
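To make the objectives, technical solutions, and advantages of the embodiments of this application clearer, the embodiments are described in further detail below with reference to the accompanying drawings.
The network architectures and service scenarios described in the embodiments of the present invention are intended to explain the technical solutions of the embodiments more clearly and do not limit them. A person of ordinary skill in the art will appreciate that, as network architectures evolve and new service scenarios emerge, the technical solutions provided in the embodiments are equally applicable to similar technical problems.
FIG. 1 is a schematic diagram of a system architecture to which the embodiments of this application may apply. The system includes a client 10, a query system 20, and a data source 30.
The client 10 is a computing device on the user side, such as a desktop or laptop computer. At the hardware level, the client 10 has a processor and memory (not shown in FIG. 1). At the software level, a client program runs on the client 10. The client program receives query requests triggered by the user and interacts with the query system 20, for example by sending it the query request. Correspondingly, a server program runs on the query system 20 to interact with the client program, for example to receive the query request sent by the client 10. The query system 20 also obtains from the data source 30 the raw data that the query request asks to query, and computes on or processes that raw data to obtain the query result (or target data). The query system 20 then returns the query result to the client 10.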
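The data source 30 may be a database or a database server. In this embodiment it refers to a data source that the query system can analyze, for example a MySQL, Oracle, or Hive data source. The data may be stored as HDFS (Hadoop Distributed File System) files, ORC (Optimized Row Columnar) files, or CSV (comma-separated values) files, or as semi-structured data such as XML (eXtensible Markup Language) or JSON (JavaScript Object Notation). These are only examples; the embodiments of this application do not limit the data source or the data storage format. The data source may use distributed storage; correspondingly, at the hardware level, the data source may include one or more storage nodes, where a storage node may be a storage server, a desktop computer, the controller of a storage array, a hard disk, or the like.
To improve query efficiency, the query system may use a massively parallel processing (MPP) architecture, for example the Presto query engine. Presto is an open-source MPP SQL (structured query language) query engine, that is, a distributed SQL query engine, used to query large data sets distributed across one or more different data sources and suited to interactive analytical queries. Specifically, an MPP architecture distributes tasks in parallel to multiple servers or nodes, each of which executes its tasks in parallel. For example, suppose a student information table contains each student's name, age, student number, and so on, and a user triggers a query request for the student named "Xiao Ming". In a query system with an MPP architecture, multiple nodes can each query a subset of the rows of the table, which shortens the query time, reduces the total query duration, and improves query efficiency. It should be understood that the more nodes participate in the query, the shorter the query time required for the same query request.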
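The query system provided in the embodiments of this application is described in detail below, using the MPP architecture as an example.
Referring to FIG. 2, the query system in this embodiment mainly includes a central node cluster and a worker node cluster. As shown in FIG. 2, the central node cluster includes one or more central nodes (FIG. 2 shows only two central nodes, 100 and 101, but this application does not limit the number of central nodes). The worker node cluster includes one or more worker nodes (FIG. 2 shows three worker nodes, 20a, 20b, and 20c, but this application is not limited to three worker nodes).
The central node receives the query request sent by the client and parses it into one or more tasks, and then delivers the tasks in parallel to multiple worker nodes, which can process their assigned tasks in parallel. It should be understood that the central node may distribute tasks in parallel to some or all of the worker nodes in the query system, and that the tasks assigned to the worker nodes may be all the same, partly the same, or all different; the embodiments of this application do not limit this. Note that the central node may be a node elected by the worker nodes from among themselves to take on the central node role, or it may be a dedicated device. In addition, when the query system has multiple central nodes, a query request sent by a client is routed to any one of them. In this way, the multiple central nodes can respond to multiple query requests at the same time, whether those requests come from multiple clients or from one client.
The worker node receives the tasks sent by the central node and executes them. For example, the tasks include obtaining the data to be queried from the data source and performing various computations on the obtained data. Because the tasks can be processed in parallel by the worker nodes, the results of parallel processing are finally aggregated and returned to the client.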
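Still referring to FIG. 2, in hardware terms the central node and the worker nodes each include at least a processor, a memory, and a network card. The connections among these components and how they work are described next, using worker node 20a as an example.
FIG. 3 is a schematic diagram of the internal structure of worker node 20a. As shown in FIG. 3, worker node 20a mainly includes a processor 201, a memory 202, and a network card 203, which communicate with one another over a communication bus.
The processor 201 may be a central processing unit (CPU) and can be used to compute on or process data. The memory 202 is a device for storing data and includes main memory and hard disks. Main memory can be read and written at any time at high speed and serves as temporary data storage for running programs. Main memory includes at least two types of memory, for example random access memory (RAM) and read-only memory (ROM). Compared with main memory, hard disks read and write data more slowly and are usually used for persistent storage. Hard disk types include at least solid-state drives (SSD), mechanical hard disk drives (HDD), and other types of hard disk. In general, data on a hard disk must first be read into main memory, from which the processor 201 or the computing unit 221 obtains it. The memory resources of the processor 201 and those of the computing unit 221 may be shared or independent of each other; the embodiments of this application do not limit this.
The network card 203 implements data exchange and data processing. At the hardware level, the network card 203 includes at least a communication unit 220 and a computing unit 221 (FIG. 3 shows one computing unit as an example, but this application does not limit this). The communication unit 220 provides efficient network transmission and is used to receive data input from external devices or to send data output by this device. The computing unit 221 includes, but is not limited to, a field programmable gate array (FPGA), a data processing unit (DPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a system on chip (SoC). This embodiment uses an FPGA as an example. An FPGA has the generality and programmability of a CPU but is more specialized, and can run efficiently on network packets, storage requests, or analytics requests. An FPGA differs from a CPU in its higher degree of parallelism, which is needed to handle a large number of requests.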
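In this application, a worker node or central node may be deployed on at least one physical node. For example, a worker node and a central node may be deployed on the same server; as another example, a central node and a worker node may be deployed on two independent servers. Other combinations are not listed here. In addition, a storage node may also be an independent apparatus, such as a storage server. Note that the above nodes may be deployed on physical machines or on virtual machines; this application does not limit this.
In practical applications, the system shown in FIG. 1 also includes forwarding devices, such as switches or routers. For ease of description, a switch is used as the example below. A switch can be used to forward data. In this embodiment, any two nodes, for example the central node and a worker node, the central node and a storage node, two worker nodes, or a worker node and a storage node, can be interconnected to compute collaboratively.
FIG. 4 is a schematic diagram of the physical architecture of the query system in a practical application scenario according to an embodiment of this application. In the query system shown in FIG. 4, the worker node cluster includes worker nodes 20d and 20e in addition to worker nodes 20a, 20b, and 20c shown in FIG. 2; the central node cluster includes only central node 100; the data source includes storage nodes 30a, 30b, and 30c; and the forwarding devices include switch 10, switch 20, switch 30, and switch 40.
As shown in FIG. 4, the central node, worker nodes, and storage nodes are all independent physical machines. Worker node 20a, worker node 20b, central node 100, and switch 10 are installed in rack 1; worker node 20c, worker node 20d, worker node 20e, and switch 20 are installed in rack 2; storage node 30a, storage node 30b, storage node 30c, and switch 30 are installed in rack 3.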
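Nodes in the same rack can interact through the switch in that rack. For example, in rack 1, central node 100 can exchange data with any other node in rack 1, such as worker node 20a, through switch 10. For example, in the embodiments of this application, the data exchanged between the central node and a worker node includes at least a header part and a data part, where the header part includes a source IP address and a destination IP address, and the data part is the data to be transmitted. For example, when central node 100 sends data to worker node 20a, the source IP address in the data is the IP address of central node 100 and the destination IP address is the IP address of worker node 20a. Specifically, when central node 100 sends the data, the data is first routed to switch 10, and switch 10 forwards it to worker node 20a according to the destination IP address carried in the data.
FIG. 4 also includes switch 40, which is the core switch of the system. Relative to the core switch, switch 10, switch 20, and switch 30 may each be called the top-of-rack (ToR) switch of their respective racks. The core switch can be used for data exchange between nodes in different racks. For example, nodes in rack 1 and nodes in rack 3 can exchange data through switch 10, core switch 40, and switch 30. For example, when worker node 20a sends data to storage node 30a, the transmission path is as follows: the data is first routed to switch 10; switch 10 finds that the destination IP address is not in the same network segment as its own IP address and forwards the data to core switch 40; core switch 40 forwards the data to switch 30, which is in the same network segment as the destination IP address; and switch 30 forwards the data to storage node 30a, which corresponds to the destination IP address. The installation location of switch 40 is not limited; for example, it may be installed in any rack in FIG. 4.
In addition to forwarding data, the switches in the embodiments of this application also have computing and data processing capabilities; for example, they are programmable switches.
Note that the system architecture shown in FIG. 4 is only an example. FIG. 5 is a schematic diagram of another system architecture according to an embodiment of this application, in which storage nodes may also be deployed on worker nodes; for example, the data of the data source may be stored on the hard disks of worker nodes. The embodiments of this application do not limit the system architecture or the deployment form of the nodes.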
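In general, to solve the problem described in the background, this application provides a data query method. The central node receives a query request sent by a client and parses it into one or more tasks. The central node can then offload some of these tasks to a network device for processing, for example the network card of a worker node or a forwarding device. Compared with assigning the tasks to worker nodes, this reduces the computation load on the worker nodes and the burden on their CPUs, so data processing speed can be improved without adding hardware resources to the worker nodes.
The data query method provided in the embodiments of this application is described in detail below with reference to specific drawings and embodiments.
FIG. 6 is a schematic flowchart of the data query method according to an embodiment of this application. In this embodiment, the central node identifies the offloadable tasks among multiple tasks and instructs worker nodes to offload them to their network cards for processing. The method can be applied to the system architecture shown in FIG. 4 or FIG. 5 and mainly includes the following steps:
Step 601: the client sends a query request to the central node; correspondingly, the central node receives the query request sent by the client.
The query request is triggered by the user on the client. For example, if the query system is an MPP SQL engine, the query request may be an SQL statement. The query request is described below using an SQL statement as an example; in practice, the embodiments of this application do not limit the statement form of the query request.
First, consider the following two tables, which hold the raw data queried by the user's query request. Table 1 (named factTbl) is a merchant's sales record and records the merchant's transactions. Table 2 (named dimTbl) is a product name table and records the identifiers and names of the products sold by the merchant. It should be understood that Table 1 and Table 2 show only part of the data.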
Table 1. factTbl
Id (product identifier) | Sale (sales amount, yuan) | Data (date) |
1 | 100 | 2020.01.01 |
2 | 150 | 2020.01.01 |
1 | 100 | 2020.01.02 |
3 | 200 | 2020.01.02 |
… | … | … |
Table 2. dimTbl
Id (product identifier) | Name (product name) |
1 | Cup A |
2 | Cup B |
3 | Cup C |
4 | Cup D |
… | … |
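Suppose the query request asks for the total sales of each product in Table 2 on 2020/01/02 according to Table 1. For example, the SQL statement corresponding to this query request is as follows:
SELECT dimTbl.name, sum(factTbl.sale)
FROM factTbl JOIN dimTbl
ON factTbl.ID = dimTbl.ID
WHERE factTbl.day = '20200102'
GROUP BY dimTbl.name;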
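To make the intent of this query concrete, the following sketch runs it on the sample rows of Table 1 and Table 2 using Python's built-in `sqlite3` module. This is only a single-node illustration and is not the distributed engine described in this application. The date column is named `day` to match the statement, product names are written in English, and an `ORDER BY` clause is added only so that the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE factTbl (id INTEGER, sale INTEGER, day TEXT);
CREATE TABLE dimTbl (id INTEGER, name TEXT);
INSERT INTO factTbl VALUES (1, 100, '20200101'), (2, 150, '20200101'),
                           (1, 100, '20200102'), (3, 200, '20200102');
INSERT INTO dimTbl VALUES (1, 'Cup A'), (2, 'Cup B'), (3, 'Cup C'), (4, 'Cup D');
""")

QUERY = """
SELECT dimTbl.name, sum(factTbl.sale)
FROM factTbl JOIN dimTbl
ON factTbl.ID = dimTbl.ID
WHERE factTbl.day = '20200102'
GROUP BY dimTbl.name
ORDER BY dimTbl.name
"""

rows = conn.execute(QUERY).fetchall()
print(rows)  # [('Cup A', 100), ('Cup C', 200)]
```

Only products 1 and 3 have sales on 2020/01/02 in the sample data, so only Cup A and Cup C appear in the result.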
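Step 602: the central node parses the query request into one or more tasks.
In this embodiment, the query request can be split into one or more tasks. Specifically, a task includes some or all of the following information: information about the data to be operated on, operator information, and operation rules. The information about the data to be operated on indicates that data, which is the object of the operation. The operator information includes the identifier of an operator, which indicates the operator; each kind of operator represents one kind of operation. The operation rule is the rule for performing the operation and can also be understood as the rule of the operator. For example, "WHERE factTbl.day = '20200102'" in the SQL statement above can be parsed into a filter task in which the data to be operated on is Table 1 (factTbl), the operator identifier is that of the filter operator, the operation represented by the filter operator is filtering, and the rule of the filter operator in this task is data=20200102.
Some operators that may be used in the embodiments of this application are explained below.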
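1. tablescan (sequential table scan) operator
The tablescan operator represents a scan operation. It reads all rows in all pages of a table in the order in which the rows are stored in the database.
2. Filter operator
The Filter operator represents a filter operation. It filters the filter column of a table according to the operation rule (or filter condition) and returns the rows that satisfy the condition. The filter column is the column or columns to be filtered. For example, if the filter condition selects the data in Table 1 with Data=2020/01/02, the Data column of Table 1 is the filter column. Specifically, when task 1 is executed, each row of Table 1 can be filtered against the filter condition as it is read, or a batch of rows can be read first and then filtered.
3. Join (table join) operator
The Join operator represents a table join operation. It recombines two tables according to a condition on one or more columns. It is mostly used to filter the data of a large table (a table with a relatively large amount of data, such as Table 1) by one or more items of data in a small table (a table with a relatively small amount of data, such as Table 2), and it can also combine the data of the small table with the data of the large table.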
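For example, the flow of using the Join operator includes: maintaining a Bloom filter (BF) on the column in the ON condition of the small table to be joined (for example, the ID column of Table 2), and then scanning the large table. While the large table is scanned, the value of the column in the ON condition of each scanned row (for example, the ID column of Table 1) is matched against the BF; if the value is not in the BF, the row is discarded, and if it is, the row is kept. The Join operator can be used to combine certain columns of the small table with certain columns of the large table, for example to combine Table 1 with the name column of Table 2 based on equal ID values in Table 1 and Table 2.
Specifically, join operators include the broadcast join and the hash join. Suppose the data to be joined includes Table 2. In a broadcast join, one worker node reads the complete Table 2 and then broadcasts it to every other worker node that executes the join operator. In a hash join, multiple worker nodes each read one or more shards of Table 2 (shards are described in detail below) and then send the shards they read to the other worker nodes, so that each worker node can assemble the complete Table 2 from the shards read by the other worker nodes and then execute the join based on Table 2.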
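The Bloom-filter-based join flow described above can be sketched in Python as follows. The `BloomFilter` class, its size, and its hash construction are illustrative choices and are not taken from this application; a real network device would use a hardware-friendly hash. Because a Bloom filter can produce false positives but never false negatives, the final join step still checks the ID against the small table.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; one byte per bit position for simplicity."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)

    def _positions(self, value):
        # Derive num_hashes positions from SHA-256 with a per-hash seed prefix.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value) -> bool:
        return all(self.bits[pos] for pos in self._positions(value))


# Small table (Table 2) and large table (Table 1, plus one row with an unknown ID).
dim_tbl = [(1, "Cup A"), (2, "Cup B"), (3, "Cup C"), (4, "Cup D")]
fact_tbl = [(1, 100, "20200101"), (2, 150, "20200101"),
            (1, 100, "20200102"), (3, 200, "20200102"), (9, 50, "20200102")]

# Build the BF on the ON-condition column of the small table.
bf = BloomFilter()
for row in dim_tbl:
    bf.add(row[0])

# Dynamic filtering: drop large-table rows whose ID is definitely not in Table 2.
kept = [row for row in fact_tbl if bf.might_contain(row[0])]

# Exact join on the surviving rows (removes any Bloom-filter false positives).
names = dict(dim_tbl)
joined = [(names[r[0]], r[1], r[2]) for r in kept if r[0] in names]
```

The dynamic-filtering step is the part that the application proposes to run on a network card or switch, so that rows such as the one with ID 9 are usually discarded before they reach the worker node's CPU.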
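4. group by (grouping) operator
The group by operator represents a grouping operation. It groups rows by a condition, for example by product name.
5. aggregation operator
The aggregation operator represents an aggregation operation and mainly includes the Sum, Min, Max, count, and AVG aggregation operators. The Sum operator sums the values to be aggregated; the Min operator maintains the minimum of the values to be aggregated; the Max operator maintains the maximum of the values to be aggregated; the count operator counts the number of values to be aggregated; and the AVG operator maintains the mean of the cumulative sum of the values to be aggregated. For example, the execution flow of the aggregation operator is: first group by the group by column, for example group by dimTbl.name, that is, group by the name column of Table 2; then perform sum, min, max, count, or avg on the grouped data. In this embodiment the operation is sum.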
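6. distinct (non-null unique) operator
The distinct operator represents a deduplication operation. It selects the non-null unique values of a column, or in other words removes duplicate data. Specifically, deduplication is performed over the rows that have data (are non-null) in the distinct column. For example, to determine how many kinds of products Table 2 contains, the name column of Table 2 is the distinct column. The name column is scanned row by row; a name that has not appeared before is recorded, and if that name appears again it is not recorded again. Each product name in the name column is thus recorded only once, which gives the number of kinds of products in Table 2.
7. TopN operator
The TopN operator maintains the N largest values. Specifically, it maintains the current N largest values; when a new value arrives that is greater than the smallest of the current N largest values, the new value replaces that smallest value.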
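The TopN behaviour described above maps naturally onto a min-heap of size N. The following sketch uses Python's standard `heapq` module; the `TopN` class name and its methods are hypothetical and only illustrate the replacement rule.

```python
import heapq


class TopN:
    """Maintain the N largest values seen so far using a min-heap."""

    def __init__(self, n: int):
        if n < 1:
            raise ValueError("n must be at least 1")
        self.n = n
        self._heap = []  # _heap[0] is always the smallest of the current top N

    def push(self, value):
        if len(self._heap) < self.n:
            heapq.heappush(self._heap, value)
        elif value > self._heap[0]:
            # New value beats the smallest of the current top N: replace it.
            heapq.heapreplace(self._heap, value)

    def result(self):
        return sorted(self._heap, reverse=True)


top = TopN(3)
for sale in [100, 150, 100, 200, 50]:
    top.push(sale)
print(top.result())  # [200, 150, 100]
```

Each incoming value costs at most one heap operation, so the state kept on the execution device is bounded by N regardless of how many rows flow through it.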
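Tasks are explained next, continuing with the SQL statement in step 601. It should be understood that an SQL statement is executed as a sequence of operations (or steps) in a certain order, and the combination of these operations or steps is called an execution plan. An execution plan can also be used to represent the complete execution process of an SQL statement. At the software level, for example, after receiving the SQL statement sent by the client, the central node can parse its syntax, generate an execution plan for it, and then derive one or more tasks from the execution plan. For example, a task may include one or more operations of the SQL statement; that is, a task may be executed using one or more operators.
For example, the execution plan of the SQL statement above includes: (1) scan Table 2 and read all its rows; (2) scan Table 1 and read all its rows; (3) select the rows of Table 1 whose date is 2020/01/02; (4) select the rows of Table 1 whose ID value equals an ID value in Table 2, and combine the rows of Table 1 dated 2020/01/02 with the name column of Table 2 on equal ID values; (5) based on the combined rows, group by name to obtain multiple groups of products and compute the total sales of each group.
Further, taking into account the worker nodes to which tasks are assigned, the execution plan can be divided into multiple stages, where one stage may include one or more tasks. For example, stages can be divided according to whether interaction between nodes is required: the tasks in one stage do not depend on the results of other nodes.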
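FIG. 7 is a schematic diagram of an execution plan generated by parsing the SQL statement above. After the user inputs the query statement, the execution plan shown in FIG. 7 can be generated. Specifically, the user can input the query statement in the query command input area (not shown in FIG. 7) of the data query interface of the client. The execution plan shown in FIG. 7 can be displayed to the user on the display interface of the central node or the client. The execution plan may be displayed directly after the user inputs the query instruction, or displayed on the interface when the user wants to view it and inputs an instruction to display the execution plan. Further, the execution plan also shows the execution device of each task, for example a network card, a worker node, a router, or a switch. The execution device may be displayed together with the task, or displayed when the user clicks the task. The interface where the user inputs the query instruction and the interface that displays the execution plan may be the same interface or different interfaces.
As shown in FIG. 7, the execution plan includes stage 1, stage 2, stage 3, and stage 4. Stage 3 and stage 4 are parallel and can be executed simultaneously. Stage 2 is the next stage of stage 3 (or stage 4); correspondingly, stage 3 (or stage 4) is the previous stage of stage 2. Stage 1 is the next stage of stage 2; correspondingly, stage 2 is the previous stage of stage 1, and so on.
Based on the execution plan of the SQL statement above, the following tasks can be derived: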
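Task 1: scan Table 1 and filter out the rows of Table 1 dated 2020/01/02. Task 1 can be completed using the tablescan operator and the Filter operator. The tablescan operator performs the scan. The Filter operator performs the filtering, with Table 1 as the data to be operated on and the filter condition being data=2020/01/02 in the data column of Table 1.
Task 2: read Table 2. Task 2 can be completed using the tablescan operator.
Task 3: join Table 1 and Table 2. Specifically, task 3 selects, from the rows of Table 1 dated 2020/01/02, the rows whose ID value equals an ID value in Table 2, and combines them with the name column of Table 2 on equal ID values. Task 3 can be completed using the Join operator.
Task 4: grouping task. Based on the result of task 3, group by product name. Task 4 can be completed using the group by operator.
Task 5: partial aggregation. Based on the grouping result of task 4, sum the sales of each group of products to obtain the total sales of each group. Task 5 can be completed using the aggregation operator. It should be understood that each worker node is assigned one or more shards of Table 1 (shards are described in detail below and are not the focus here); that is, each worker node executing task 5 summarizes the sales of a group of products based on only part of the data of Table 1. Task 5 can therefore be understood as partial aggregation.
Task 6: final aggregation, that is, determining the final query result based on all partial aggregation results. In the example above, the data to be operated on is the result of task 5 from each worker node assigned task 5, the operation is summation, and the execution rule is to sum the sales of the same product across the task 5 results of all worker nodes executing task 5 to obtain the final query result.
Still referring to FIG. 7, the logical relationship between tasks is, for example, as follows: task 3 is the next task of task 1 and task 2, and task 1 and task 2 are each a previous task of task 3; task 4 is the next task of task 3, and correspondingly task 3 is the previous task of task 4. That is, the output data of task 1 and task 2 is the input data of task 3, the output data of task 3 is the input data of task 4, and so on. Correspondingly, the node executing the next task is the next-level node of the current node; for example, the node executing task 3 is the next-level node of the node executing task 1, and so on.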
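The relationship between task 5 (partial aggregation) and task 6 (final aggregation) can be sketched in Python as follows. The function names and the per-worker input rows are hypothetical; each input row is assumed to be a (product name, sale) pair produced by the earlier join and grouping tasks.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Task 5 on one worker: sum sales per product name over that worker's shard."""
    sums = defaultdict(int)
    for name, sale in rows:
        sums[name] += sale
    return dict(sums)


def final_aggregate(partials):
    """Task 6: merge the partial sums produced by all workers."""
    total = defaultdict(int)
    for partial in partials:
        for name, sale in partial.items():
            total[name] += sale
    return dict(total)


# Two workers, each holding joined rows from a different shard of Table 1.
worker_a = partial_aggregate([("Cup A", 100), ("Cup C", 200)])
worker_b = partial_aggregate([("Cup A", 50), ("Cup A", 25)])
result = final_aggregate([worker_a, worker_b])
print(result)  # {'Cup A': 175, 'Cup C': 200}
```

Summation is associative, so summing partial sums gives the same answer as summing all rows in one place. The same two-level structure works for min, max, and count, while an average would need to carry both a sum and a count in each partial result.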
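Tasks are described in detail below.
In the embodiments of this application, different tasks have different task identifiers (request IDs). A task identifier uniquely identifies a task; among the multiple tasks belonging to one query request, every task has a different task identifier. For the execution device of a task, the input data is the data to be computed by this task and can be recognized by the task identifier: data that contains this task's identifier is input data. The result of executing the task on the input data is the task's output data. The task identifier of the output data is that of the next task; specifically, the data packet carrying the output data also carries the task identifier of the next task.
A single task can be executed using one or more operators. When multiple operators are needed, the order of the operators in the task indicates their execution order. The execution device of the task performs the corresponding operations in operator order. No task identifier is needed to pass results between operators: the execution device can directly apply the next operator to the result of the current operator. For example, in task 1 the tablescan operator is followed by the filter operator, so the execution device of task 1 first reads Table 1 and then uses the filter operator to filter Table 1 according to its filter condition. The result of the last operator in a task is the output data of that task.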
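The way operators are chained inside one task, and the way the output is tagged with the next task's identifier, can be sketched as follows. The function names and the dictionary-based packet are hypothetical simplifications of the data packets described above.

```python
def run_task(operators, input_rows):
    """Apply the task's operators in order; the last result is the task's output."""
    data = input_rows
    for op in operators:
        data = op(data)  # results pass directly between operators, no task ID needed
    return data


def tablescan(rows):
    return list(rows)


def filter_by_date(rows):
    return [r for r in rows if r[2] == "20200102"]


def make_packet(next_request_id, rows):
    """Output data carries the identifier of the NEXT task, not of this one."""
    return {"request_id": next_request_id, "rows": rows}


fact_tbl = [(1, 100, "20200101"), (2, 150, "20200101"),
            (1, 100, "20200102"), (3, 200, "20200102")]

# Task 1 = tablescan followed by filter; its output is the input of task 3.
task1_output = run_task([tablescan, filter_by_date], fact_tbl)
packet = make_packet(3, task1_output)
```

The execution device of task 3 would recognise this packet as its input because the packet carries request ID 3.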
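Step 603: the central node determines the offloadable tasks among the one or more tasks and determines the execution device of each offloadable task.
To achieve parallel processing, the central node then generates a task scheduling plan that assigns the tasks above to multiple execution devices, which execute one or more tasks in parallel. For example, task 1 is assigned to multiple worker nodes, and each worker node reads one or more shards of Table 1. Sharding means dividing the data to be queried into shards of equal size; for example, if Table 1 contains 10000 rows and every 2000 consecutive rows form one shard, Table 1 is divided into 5 shards. This is what is meant by parallel processing, and it improves task execution efficiency.
In the embodiments of this application, tasks include offloadable tasks and non-offloadable tasks, and different types of tasks may have different execution devices. For example, non-offloadable tasks can be processed by worker nodes, and offloadable tasks can be offloaded to a network device, for example the network card of a worker node. This reduces the workload of the worker node as well as the computation load and burden on its CPU.
For example, in this embodiment an offloadable task may be a task that contains offloadable operators, and the offloadable operators may be preset or predefined by protocol. Offloadable operators include, but are not limited to, the tablescan, filter, join, aggregation, TopN, and distinct operators. Note that these are only examples; the embodiments of this application do not limit the type or number of offloadable operators. In addition, if a task contains multiple operators and some of them are not offloadable operators, the task can be defined as a non-offloadable task. In practice, work that requires a non-offloadable operator can be defined as a separate task. That is, all operators involved in an offloadable task in this embodiment are offloadable operators.
For example, based on the offloadable operators above, among task 1 to task 6, task 1 contains only offloadable operators, so task 1 is an offloadable task and its execution device may be the network card of a worker node. The operator used by task 4 is a non-offloadable operator, so task 4 is a non-offloadable task and its execution device may be the worker node itself.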
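How shards are generated is described next.
It should be understood that not every task needs to fetch data from the data source, so not every task needs shard information. For a task that needs to read a table, such as a task that uses the tablescan operator, the central node can also assign the shards to be read to each execution device when assigning execution devices to the task. Taking Table 1 as an example, shard information for Table 1 can be generated as follows: the central node obtains the storage information of Table 1 from the data source, for example which storage nodes store Table 1, how much of Table 1 each storage node stores, where the data of Table 1 is stored on each storage node, and the IP address of each storage node. The central node generates the shard information of Table 1 based on this storage information. The shard information of each shard includes the IP address of the storage node where the shard is located, the storage location, and other information.
For example, suppose Table 1 contains 10000 rows, of which rows 1 to 4000 are stored on storage node 1, rows 4001 to 8000 on storage node 2, and rows 8001 to 10000 on storage node 3. With 2000 rows per shard, Table 1 can be divided into 5 shards, shard 1 to shard 5. Correspondingly, the shard information of shard 1 includes, but is not limited to, some or all of the following: the identifier of shard 1, the IP address of storage node 1, and the storage location (the address space on storage node 1 where rows 1 to 2000 of Table 1 are stored, which can be expressed, for example, as the start address of that address space and the length of rows 1 to 2000). The shard information of shard 2 includes, but is not limited to, some or all of the following: the identifier of shard 2, the IP address of storage node 1, and the storage location (the address space on storage node 1 where rows 2001 to 4000 of Table 1 are stored, which can be expressed, for example, as the start address of that address space and the length of rows 2001 to 4000). The remaining shards follow the same pattern and are not described one by one. These are only examples; Table 1 may also be stored on a single storage node, and the embodiments of this application do not limit this.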
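The shard generation in this example can be sketched in Python as follows. The `Shard` structure, the IP addresses, and the use of row ranges in place of byte addresses are assumptions for illustration; the sketch also assumes that a shard never spans two storage nodes.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    shard_id: int
    node_ip: str     # IP address of the storage node holding the shard
    first_row: int   # 1-based, inclusive
    last_row: int    # inclusive


def make_shards(placement, shard_rows=2000):
    """Split a table into shards of at most shard_rows rows.

    placement: list of (node_ip, first_row, last_row) tuples describing which
    rows of the table each storage node holds.
    """
    shards = []
    for node_ip, first, last in placement:
        start = first
        while start <= last:
            end = min(start + shard_rows - 1, last)
            shards.append(Shard(len(shards) + 1, node_ip, start, end))
            start = end + 1
    return shards


# Hypothetical addresses for storage nodes 1 to 3 holding the 10000 rows of Table 1.
placement = [("10.0.0.1", 1, 4000), ("10.0.0.2", 4001, 8000), ("10.0.0.3", 8001, 10000)]
shards = make_shards(placement)
for s in shards:
    print(s)
```

With the placement from the example, this yields five shards: two on storage node 1, two on storage node 2, and one on storage node 3.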
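How the task scheduling plan is generated is described next.
For example, the central node generates the task scheduling plan based on information about the multiple tasks contained in the SQL statement (or the execution plan above), information about the worker nodes, shard information, and so on.
The information about a task includes the task identifier, or the task identifier and an offload flag. The offload flag indicates whether the task corresponding to the task identifier carried in the first setting instruction is an offloadable task. For example, the offload flag may be 1 bit, where a bit value of 1 indicates an offloadable task and 0 indicates a non-offloadable task. As another example, the offload flag is a fixed value that is carried by offloadable tasks and not carried by non-offloadable tasks.
The information about the worker nodes includes the number of worker nodes, their addresses (for example, IP address and port), and their identifiers. A worker node identifier may be globally unique, meaning that the worker node it indicates is unique in the query system and that every worker node and the central node know what the identifier means. The identifier may be the worker node's IP address, device identifier, or device name, or a unique identifier generated by the central node for each worker node in the query system. The information about a switch includes its address (for example, IP address and port), whether it is capable of processing offloadable tasks, and its identifier. For shard information, see the description above; details are not repeated here.
Specifically, the task scheduling plan includes some or all of the following: task identifier, offload flag, identifier of the worker node to which the task is assigned, shard information corresponding to the task, and so on. Table 3 is a specific example of a task scheduling plan provided by an embodiment of this application for the SQL statement above, using the case in which all tasks are assigned to worker nodes. Suppose worker node 20a is assigned to read the complete Table 2 and then broadcasts Table 2 to every other worker node.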
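Table 3
For example, based on Table 3, the central node assigns task 1 to task 4 to worker nodes 20a to 20e, and assigns worker node 20a to read shard 1, worker node 20b to read shard 2, worker node 20c to read shard 3, worker node 20d to read shard 4, and worker node 20e to read shard 5. In this way, when each worker node executes task 1, it can read part of the rows of Table 1 in parallel without interfering with the others, and execute the subsequent tasks on the data it read until task 5 produces the partial aggregation result. Finally, the node executing task 6 (worker node 20a) summarizes the partial aggregation results of worker nodes 20a to 20e to obtain the final query result.
In one implementation, the user can also check the execution progress of the query request at any time while it is being executed. As shown in FIG. 8, after any task in the execution plan is selected, the detailed execution information of that task can be displayed, for example which nodes the task is assigned to, whether it is an offloadable task, the execution device of the offloadable task, information about the node or execution device (assuming the IP addresses of worker node 20a to worker node 20e are 76.75.70.14 to 76.75.70.18), and the execution status (for example, not started, in progress, or completed). Note that the interfaces shown in FIG. 7 and FIG. 8 are only illustrative; other display forms are possible, and the embodiments of this application do not limit this.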
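Step 604: the central node sends the first setting instruction of the offloadable task to a worker node whose network card is set as the execution device of the offloadable task.
According to the task scheduling plan, the central node sends the first setting instruction of each task to the worker node that is set to process that task.
For example, the central node can generate a first setting instruction for each worker node at task granularity based on the task scheduling plan, and send the first setting instruction of the offloadable task to the worker node. Correspondingly, the worker node receives the first setting instruction sent by the central node. For example, the first setting instruction includes, but is not limited to, some or all of the following: task identifier, offload flag, and operator information. Table 4 shows an example format of the first setting instruction provided in this embodiment.
Table 4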
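The operator information includes, but is not limited to, some or all of the following:
operator identifier, operator execution rule, operator input information, and operator output information. The operator input information indicates the input data required to execute the task and information about the node where the input data is located, such as address information, storage information, and table name; for example, the shard information of Table 1 above is the input information of task 1. The operator output information includes the task identifier of the next task after the task corresponding to the Request ID, information about the next-level node, and so on.
Similarly, the first setting instruction of a non-offloadable task may resemble that of an offloadable task. The difference is that the offload flag of an offloadable task indicates that the task is offloadable while that of a non-offloadable task indicates that it is not; alternatively, only the first setting instruction of an offloadable task carries the offload flag and that of a non-offloadable task does not, which distinguishes the two.
Table 5, using worker node 20a as an example, lists specific examples of the first setting instructions of the tasks sent to worker node 20a. The execution rules of the operators are as described above and are not repeated in Table 5. Suppose a Flags value of 1 indicates an offloadable task and a Flags value of 0 indicates a non-offloadable task.
Table 5
Note that the format of the first setting instruction above is only an example. In practice, the first setting instruction may contain more or less information than Table 5; the embodiments of this application do not limit this. For example, the central node may leave it to each worker node to determine, based on preset operators, whether a task is offloadable, in which case the first setting instruction need not include the offload flag. As another example, the first setting instruction may also contain padding data (magic bytes), which may be data of known bits, such as 0 or 1, so that the first setting instruction has a preset length.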
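Step 605: the worker node determines whether a received task is an offloadable task. If so, step 606 is performed; otherwise, the task is processed by the worker node.
The following description uses one worker node as an example.
With reference to Table 5: in one implementation, for any first setting instruction it receives, the worker node can determine from the offload flag carried in the instruction whether the task is offloadable; if so, it sets the information of the offloadable task in the network card, and the network card subsequently processes the task. In another implementation, the worker node can distinguish offloadable tasks from non-offloadable tasks by whether the offload flag is carried at all. These implementations assume that the central node identifies offloadable tasks. If, as described above, the central node does not identify offloadable tasks, the first setting instruction carries no offload flag regardless of whether the task is offloadable. In this case the worker node can identify offloadable tasks based on the preset offloadable operators; the embodiments of this application do not limit this.
Step 606: the worker node offloads the offloadable task to its own network card, that is, it sets the information of the offloadable task in the network card.
Specifically, when setting the information of the offloadable task in the network card, the worker node can send a second setting instruction of the task to the network card, and the network card obtains and records the information of the offloadable task according to the second setting instruction.
For example, the second setting instruction may include a header and a data part, where the header may include a control command and a task identifier, and the data part may contain the operator information of the task. Table 6 shows a format of the second setting instruction provided in this embodiment.
Table 6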
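Command indicates the command type, that is, which operation to perform. For example, Command may be, but is not limited to, one of the following types: offload command (init command), read command (read command), and end command (end command). The offload command instructs the device to offload the task corresponding to the Request ID. The execution instruction instructs the device to start the tablescan task and read the input data of the task corresponding to the Request ID. In general, reading the data to be queried is the starting point of SQL execution, so this command can be called either a read command or an execution instruction. The end command is used after a task has finished to instruct the execution device to release the resources used to process the task corresponding to the Request ID; it can also be understood as indicating that the task has ended and the resources allocated to it can be released. For ease of description, a second setting instruction whose command is init command is called an offload command below, one whose command is read command is called an execution instruction, and one whose command is end command is called an end command. The payload contains the operator information of the task, which was described earlier and is not repeated here.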
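The three command types and the header-plus-payload layout of the second setting instruction can be sketched in Python as follows. The class names, the `task_table` dictionary, and the return strings are hypothetical; the actual field layout of Table 6 is not reproduced here.

```python
from dataclasses import dataclass
from enum import Enum


class Command(Enum):
    INIT = "init command"   # offload the task identified by request_id
    READ = "read command"   # start the task and read its input data
    END = "end command"     # release the resources reserved for the task


@dataclass
class SecondSettingInstruction:
    command: Command   # header: command type
    request_id: int    # header: task identifier
    payload: dict      # data part: operator information (may be empty)


def handle(instruction, task_table):
    """Apply one instruction to task_table (request_id -> task state)."""
    rid = instruction.request_id
    if instruction.command is Command.INIT:
        task_table[rid] = {"operators": instruction.payload, "running": False}
        return "offloaded"
    if instruction.command is Command.READ:
        task_table[rid]["running"] = True
        return "started"
    if instruction.command is Command.END:
        task_table.pop(rid, None)
        return "released"
    raise ValueError(instruction.command)


tasks = {}
handle(SecondSettingInstruction(Command.INIT, 1, {"ops": ["tablescan", "filter"]}), tasks)
handle(SecondSettingInstruction(Command.READ, 1, {}), tasks)
```

In this sketch the network card only needs the command type and the Request ID in the header to decide what to do; the payload is consulted only when a task is first set up.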
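The flow in which a worker node sets the information of an offloadable task in its network card is described next and may include the following:
Taking task 1 as an example, after worker node 20a determines that task 1 is an offloadable task, it sends the offload command of task 1 to its own network card, as shown in Table 7.
Table 7
For example, after receiving the offload command, the network card first checks its packet header. If the header contains init command, the network card determines that task 1 (Request ID 1) is an offloadable task and allocates (or reserves) network card resources for task 1; these resources are configured to process task 1.
Network card resources are described below. As shown in FIG. 9, in this embodiment the network card resources used to process offloadable tasks (for example, including computing units) and the memory resources can be divided into multiple portions, each of which can be called a processing engine (PE). One PE can be configured to process one offloadable task.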
继续参考图9,以任务1为例,在网卡中设置可卸载任务的信息可以包括如下过程:网卡接收到任务1的卸载命令后,如果有空闲的PE,则网卡将任务1分配到一个空闲的PE上,该PE记录任务1的Request ID和算子信息等。相应的,网卡记录该PE与该PE处理的可卸载任务第一对应关系,以记录PE被分配执行哪个任务,具体的,第一对应关系包括PE标识和可卸载任务的Request ID。后续,当网卡接收到包含相同Request ID的数据包后,可以根据第一对应关系确定该Request ID对应的PE,并将该数据包路由至对应的PE处理。例如,在执行任务1时,工作节点20a的网卡向分片1对应的存储节点发送用于读取分片1的读请求,该读请求携带任务1的Request ID,存储节点向网卡发送任务1的反馈数据包,并且这些反馈数据包中也携带任务1的Request ID,这样网卡便可以根据该第一对应关系,确定存储节点返回的数据包对应的PE,并将该数据包路由至确定的PE上,PE根据记录的算子信息使用对应的算子对该数据包进行处理。
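It should be understood that the number of PEs is limited. By way of example, a circular queue may also be provided in the NIC. In one implementation, the number of offloadable tasks the circular queue can hold equals the number of PEs. When a new offloadable task arrives and the circular queue is not full, the task is placed in the circular queue and an idle PE is allocated to it. When the circular queue is full, the NIC sends a response to the device that sent the offload command; the response indicates that the NIC cannot process the offloadable task and may also include the reason, for example that the NIC has no resources left for processing offloadable tasks. For instance, if the circular queue is already full when the processor of worker node 20a sends the offload command of task 1 to the local NIC, the NIC determines that no idle PE is available and sends a response to the processor of worker node 20a indicating that it cannot execute task 1. Worker node 20a can then execute task 1 itself rather than wait for a PE, which reduces latency and speeds up task processing. In another possible implementation, the NIC places every offloadable task it receives into the circular queue; if there are more offloadable tasks than PEs, then whenever a PE becomes idle, the NIC selects from the circular queue an offloadable task that has not yet been allocated a PE and allocates the idle PE to it.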
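The PE allocation and the first correspondence can be sketched as below. The sketch models only the first implementation (capacity equal to the number of PEs) and replaces the circular queue with a plain list of idle PEs for brevity; the class `NicOffloadTable` and its method names are hypothetical.

```python
class NicOffloadTable:
    """Tracks which PE handles which offloaded task (the first correspondence)."""

    def __init__(self, pe_count):
        self.free_pes = list(range(pe_count))   # idle PE identifiers
        self.pe_by_request = {}                 # Request ID -> PE identifier
        self.operators = {}                     # Request ID -> operator information

    def offload(self, request_id, operator_info):
        """Handle an offload command; return the PE id, or None if no PE is idle."""
        if not self.free_pes:
            return None   # the NIC would answer that it cannot process the task
        pe = self.free_pes.pop(0)
        self.pe_by_request[request_id] = pe
        self.operators[request_id] = operator_info
        return pe

    def route(self, request_id):
        """Look up the PE for a packet that carries this Request ID."""
        return self.pe_by_request.get(request_id)

    def end(self, request_id):
        """Handle an end command: release the PE so other tasks can use it."""
        pe = self.pe_by_request.pop(request_id, None)
        self.operators.pop(request_id, None)
        if pe is not None:
            self.free_pes.append(pe)
```

When `offload` returns `None`, the caller (the worker node's processor in the example above) would execute the task itself instead of waiting for a PE.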
上述是以任务1为例,介绍了将可卸载任务卸载至网卡的过程,类似的,工作节点20a依照相同的方式将任务2、任务3、任务5、任务6分别卸载至工作节点20a自身的网卡上。应理解的是,其他工作节点的卸载流程与工作节点20a类似,这里不再重复说明。
步骤607a,中心节点向工作节点的网卡发送执行指令,对应的,工作节点的网卡接收执行指令。
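In this application, some tasks can be executed only after an execution instruction is received, for example tasks that use the tablescan operator, namely task 1 and task 2 in the above example. For such a task, the central node may send the task's execution instruction to trigger the execution device to execute the task.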
示例性地,这里的执行指令可以是上述的读命令,中心节点向工作节点发送任务1和任务2的读命令。第一种实施方式,一个读命令中可以携带多个任务的Request ID,例如该读命令的Request ID中携带任务1的Request ID和任务2的Request ID。也就是说,任务2和任务1的读命令可以是同一个。另一种实施方式,每个任务的读命令是独立的,例如,任务1的读命令中仅携带任务1的Request ID,同理,任务2的读命令中仅携带任务2的Request ID。本申请实施例对此不做限定。
为便于描述,如下以第二种实施方式为例进行描述。以任务1的读命令为例,参见表8,为本申请实施例针对任务1提供的一种读命令的具体示例。
表8
Command | Request ID | Payload |
Read command | 1 | 分片1的分片信息 |
需要说明的是,表8仅为举例,如果任务1的卸载命令中携带了任务1的分片信息,则任务1的读命令中可以不重复携带任务1的分片信息,以此减少需要传输的数据量,避免重复传输,节省网络资源。或者,不论卸载命令中是否包含分片信息,读命令中均可以携带分片信息,并且执行设备以读命令中的分片信息为准,以达到动态、灵活调整分片信息的作用,提高数据命中率。
可选的,当工作节点根据第一设置指令设置完成任务的信息后,可以向中心节点发送完成响应,中心节点接收到完成响应后再发送执行指令。或者,中心节点也可以直接发送执行指令,当工作节点根据第一设置指令设置完成任务的信息后,直接启动执行对应的任务,以实现后续任务的自动执行。
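Step 607b: the NIC of the worker node receives data from its own worker node or from other nodes. As above, the other nodes here may be other worker nodes, the central node, a storage node, or a forwarding device. For example, for worker node 20a, the NIC receives the results of task 5 executed by the other worker nodes. It should be noted that step 607b is optional and need not be performed, and that there is no strict ordering between step 607b and step 607a.
Step 608: the NIC determines whether the received data belongs to a task offloaded to this NIC. If so, step 609 is performed; otherwise, step 610 is performed.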
后续,任务被启动执行后,网卡监控接收到的数据是否为设置自身处理的可卸载任务的输入数据,如果是,则对该数据进行运算,否则,将该数据转发至工作节点。
这里的数据包括各种设置指令和待运算的数据。结合步骤607a,网卡接收到执行指令时,判断接收到该执行指令是否为卸载至本网卡的任务的,如果是,则网卡启动执行对应的任务;否则,网卡将该执行指令发送至工作节点,由工作节点启动执行对应的任务。
同理,各任务的执行设备对接收到的数据进行监测,并判断该数据是否为卸载至本网卡的任务的数据,例如,若该数据包含本网卡的任务的request ID,则该数据为该任务的数据。若该数据为执行指令,则网卡执行该执行指令。若该数据为任务的输入数据,则网卡使用该任务的算子和执行规则对该数据进行处理。如果不是,则确定该数据不是该任务的,将该数据发送至工作节点进行处理。
步骤609,网卡执行对应的任务,并返回结果至执行计划的下一级节点或下一个任务的执行设备。
步骤610,网卡将该数据发送至工作节点。
步骤611,工作节点根据该数据执行对应的任务,并将结果返回给执行计划的下一个任务的执行设备或下一级节点。
需要说明的是,步骤608至步骤611可能为循环执行的步骤,直至得到最终的查询结果,即任务6的结果为止。
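For example, with reference to Table 3 and taking worker node 20a as an example, assume that worker node 20a offloads task 1, task 2, task 3, task 5, and task 6 to PE0 through PE4 of the local NIC, respectively. The first correspondence is then: PE0 corresponds to task 1, PE1 to task 2, PE2 to task 3, PE3 to task 5, and PE4 to task 6. Task 4 is processed by worker node 20a itself. The flow in which worker node 20a executes task 1 through task 6 is described below: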
工作节点20a的网卡接收到该任务1的执行指令后,根据上述第一对应关系确定任务1对应的PE,即PE0;将该任务1的执行指令路由到PE0上,PE0执行任务1:即根据任务1的(分片1的)分片信息向对应的存储节点发送读请求。示例性地,该读请求可以是现有实现机制中的读请求也可以是其他格式的读请求。具体的,PE0可以将任务1的读请求转发到分片1对应的存储节点,存储节点接收到该读命令后,向网卡返回任务1对应的分片1的数据,如前所述,存储节点返回的数据中包含与读请求相同的Request ID。
网卡从存储节点1接收到任务1的数据包后,根据第一对应关系,将该数据包发送至对应的PE0上。PE0接收到数据包后,根据任务1的算子信息确定tablescan算子的下一个算子为filter算子,并基于filter算子的执行规则(data=2020/01/02)对该数据包中携带的分片1中的过滤列(data列)进行过滤。
具体的,过滤结果可以以过滤列对应的bitmap来表示,bitmap中的每一位顺序对应于读取的分片中的每一行,并通过比特位上的不同比特值来表示该行是否满足过滤条件。参见如下表9,假设表9为表1的分片1中的一部分数据。
表9
Id | sale | Data |
… | … | 2020.01.02 |
… | … | 2020.01.01 |
… | … | 2020.01.01 |
… | … | 2020.01.01 |
… | … | 2020.01.02 |
PE0判断读取的数据列是否为过滤列,如果是过滤列,则使用过滤条件对读取的过滤列上的每一行的数据进行过滤,根据任务1的过滤条件Data=2020/01/02,可以确定,Data为过滤列,假设比特值为1表示满足过滤条件,比特值为0表示不满足过滤条件,则表9所示的过滤列对应的bitmap为10001。
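The NIC may also store a second correspondence between the Request ID and the bitmap, for example Request ID = 1 and bitmap = 10001. From the bitmap of task 1 it can then be quickly determined that only the first and fifth rows of Table 9 satisfy the condition. In this way only the filtered data needs to be sent to the next-level node, which reduces the amount of data transmitted; and because the worker node does not need to execute the offloadable task, the worker node's workload and processing time are reduced as well.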
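The bitmap for Table 9 can be reproduced with a few lines. This is a minimal sketch: the function name `filter_bitmap` is hypothetical, and the bitmap is kept as a string of `1` and `0` characters purely for readability.

```python
def filter_bitmap(column_values, predicate):
    """Return one bit per row: '1' if the row satisfies the filter, else '0'."""
    return "".join("1" if predicate(value) else "0" for value in column_values)


# Data column of Table 9, in row order.
data_column = ["2020.01.02", "2020.01.01", "2020.01.01", "2020.01.01", "2020.01.02"]

bitmap = filter_bitmap(data_column, lambda value: value == "2020.01.02")

# 1-based numbers of the rows that pass the filter.
selected_rows = [row for row, bit in enumerate(bitmap, start=1) if bit == "1"]
```

Here `bitmap` evaluates to `"10001"` and `selected_rows` to `[1, 5]`, matching the first and fifth rows of Table 9.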
对于任务1的输出数据,其中该输出数据携带任务1的下一个任务的request ID,即request ID 3,以及任务1过滤后的数据。网卡继续根据第一对应关系,确定request ID3对应的PE,即PE2,将任务1的输出数据路由至PE2。
可选的,当任务执行完成后,网卡可以向工作节点发送指示信息,该指示信息用于指示该任务执行完成,工作节点接收到该指示信息后,工作节点可以向网卡发送结束命令,以使网卡释放处理该可卸载任务的相应的网卡资源(例如PE)、内存资源等。
如下以任务1的结束命令为例,请参见表10,为本申请实施例针对任务1提供的一种结束命令的具体示例。
表10
Command | Request ID | Payload |
end command | 1 | NULL |
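For example, after task 1 has finished on the NIC, worker node 20a sends the end command shown in Table 10 above to the NIC. On receiving the end command, the NIC releases the PE, memory resources, and so on used to process task 1; the released resources can be used for other offloadable tasks offloaded to the NIC. Alternatively, the NIC may decide on its own when to release the PE. For example, when the storage node sends the last data packet of task 1 to the NIC, that packet carries an identifier marking it as the last one; once the NIC determines that the PE has finished processing the last data packet of task 1, it releases the resources used for task 1. Similar points below are not described again.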
后续,类似的,中心节点向工作节点20a的网卡发送任务2的执行指令,网卡根据第一对应关系确定request ID2对应的PE,即PE1,PE1可以基于表5所示的任务2的第一设置指令,根据表2的存储信息(包括存储节点的IP地址、存储位置等信息)获取完整的表2,并将读取的表2分别发送给工作节点20b、工作节点20c、工作节点20d和工作节点20e。同时,把表2路由至任务2的下一个任务,即任务3对应的PE2。
当任务3的输入数据全部到达时,PE2执行任务3,即PE2基于任务3对应的算子和执行规则对任务1和任务2的输出数据进行处理,得到任务3的输出数据。
PE2将任务3的输出数据(包含request ID 4)发送至任务4的执行设备,由于任务4为不可卸载任务,任务4的执行设备为工作节点本身,因此,该网卡可以将任务3的输出数据发送至工作节点20a的处理器,由工作节点20a来处理。
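Subsequently, worker node 20a sends the output data of task 4 (containing request ID 5) to the execution device of task 5. Specifically, worker node 20a sends the output data of task 4 to the NIC; the NIC determines, according to the first correspondence, that the PE corresponding to request ID 5 is PE3, and routes the output data of task 4 to PE3. The process continues in the same way until worker node 20a obtains the final query result.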
需要说明的是,其他工作节点,例如工作节点20b,还需要将任务5的输出数据包发送至工作节点20a的网卡,当工作节点20a至工作节点20e上的任务5的输出数据包全部到达后,PE4执行任务6,得到最终的查询结果。其中,输出数据分为多个数据包传输时,最后一个数据包中还可以携带结束标识,通过结束标识来指示该数据包是否为当前request ID的最后一个数据包,接收端以此来判断对端工作节点上的数据是否传输完毕。
需要说明的是,上述第二设置指令的格式仅为举例,实际上,第二设置指令可以包含比上文所举的示例中更多或更少的内容,本申请实施例对此不做限定。例如,第二设置指令可以包括填充数据,以使第二设置指令的长度为预设长度。
另一种可实施的方式,对于可卸载任务,中心节点还可以根据任务调度计划,直接将可卸载任务的卸载命令发送被设置执行该任务的各执行设备,例如工作节点的网卡或转发设备。下面提供本实施例的另一种数据查询方法。
请参见图10,图10为本实施例提供的另一种数据查询方法所对应的流程示意图。在本发明实施例中,除了第一实施例中所描述的工作节点的网卡可以作为可卸载任务的执行设备外,网络中的转发设备,例如交换机及路由器也可以被设置为可卸载任务的执行设备。对于在工作节点的网卡中设置可卸载任务的方式与图6所示的实施例相同,即本实施例中的步骤1001至步骤1006、步骤1013至步骤1016可分别参考图6中步骤601至步骤606、步骤608至步骤611的相关描述,在此不再赘述,以下仅就不同之处进行说明。
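The following focuses on how an offloadable task is set when its execution device is a forwarding device. For ease of description, the forwarding device is taken to be a switch.
This embodiment again uses the SQL statement from the embodiment of FIG. 6 as an example. Table 11 below shows another task scheduling plan generated for the SQL statement in step 602 above. The execution devices of the offloadable tasks in this plan include switches: for example, the execution device of task 1 is switch 30, and the execution device of task 6 is switch 40.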
表11
任务标识 | 可卸载任务 | 执行设备 | 存储信息 |
任务1 | 是 | 交换机30 | NULL |
任务2 | 是 | 工作节点20a的网卡 | 表2的存储信息 |
任务3 | 是 | 工作节点20a的网卡至工作节点20e的网卡 | NULL |
任务4 | 否 | 工作节点20a至工作节点20e | NULL |
任务5 | 是 | 工作节点20a的网卡至工作节点20e的网卡 | NULL |
任务6 | 是 | 交换机40 | NULL |
应理解表11仅为一种示例,并不构成对本申请任务调度计划的限定。
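When the execution device of an offloadable task determined in step 1003 is a switch, the central node sets the offloadable task as described in step 1007 and step 1008.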
步骤1007:中心节点将可卸载任务的设置指令发送至交换机,该交换机被设置为可卸载任务的执行设备。
实际上,中心节点、存储节点或工作节点发出的数据包均会首先路由到各自对应的网段的交换机上。为了区分要转发的数据包和配置给交换机的数据包,在本实施例中,交换机包含至少两个端口,分别为数据端口和控制端口,当交换机通过数据端口接收到数据包时,则表示该数据包为需要转发的数据包,交换机根据数据包的目标IP地址将其转发至目的IP地址对应的设备上。若交换机通过控制端口接收到数据包,则说明该数据包为中心节点配置给交换机的数据包,需要交换机根据该数据包进行配置,例如中心节点给交换机发送的任务的设置指令。
结合表11所示,中心节点会将任务1的设置指令发送至交换机30;中心节点将任务6的设置指令发送至交换机40。
示例性地,中心节点可以将可卸载任务的第一设置指令发送至交换机的控制端口,以指示交换机根据该第一设置指令设置可卸载任务的信息。再示例性地,中心节点可以将可卸载任务的第二设置指令(卸载命令)发送至交换机的控制端口,以指示交换机根据该卸载命令设置可卸载任务的信息。如下以设置指令为卸载命令为例进行描述。
步骤1008:交换机设置可卸载任务的信息。
对应的,交换机30通过自身的控制端口接收到任务1的卸载命令时,根据该卸载命令记录任务1的信息(包括任务1的算子信息和request ID 1等)。同理,交换机40通过控制端口接收到任务6的卸载命令时,根据该卸载命令记录任务6的信息(包括任务6的算子信息和request ID 6等)等。后续,被设置处理可卸载任务的交换机监控接收到的每一个数据包的Request ID是否为卸载至本地的任务的,如果是,则由交换机对该数据进行处理;否则,交换机将该数据转发至该数据的目的IP地址处。
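The split between the control port and the data port can be sketched as follows. The class `OffloadSwitch`, the packet fields `request_id` and `dst_ip`, and the list used as a stand-in for the real forwarding path are assumptions made for illustration.

```python
class OffloadSwitch:
    """Switch that records offloaded tasks and checks each data packet against them."""

    def __init__(self):
        self.tasks = {}       # Request ID -> operator information
        self.forwarded = []   # stand-in for the real forwarding path

    def on_control_port(self, request_id, operator_info):
        """A packet on the control port configures the switch (offload command)."""
        self.tasks[request_id] = operator_info

    def on_data_port(self, packet):
        """A packet on the data port is processed locally or forwarded."""
        if packet["request_id"] in self.tasks:
            return "processed"   # the switch applies the recorded operators
        self.forwarded.append((packet["dst_ip"], packet))
        return "forwarded"
```

For the plan in Table 11, switch 30 would record Request ID 1 through its control port and then process packets carrying Request ID 1, while packets for other tasks continue to their destination IP address unchanged.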
步骤1009a,中心节点发送任务1、任务2的执行指令。
The execution instruction here may be the read command described above.
如前所述,中心节点、存储节点或工作节点发出的数据包均会首先路由到各自对应的网段的交换机上。也即中心节点发送的执行指令也会首先被路由至交换机上。
步骤1009b,交换机接收其他节点的数据。
其他节点可以是工作节点、中心节点、存储节点或其他转发设备。
步骤1010,交换机判断接收到的数据是否为卸载至交换机的任务的,如果是,则执行步骤1011,否则,执行步骤1012a。
交换机接收到的数据包括设置指令、执行指令、存储节点发送的表1或表2的分片数据,或其他工作节点执行任务得到的输出数据。其中,步骤1010可以参考步骤608中网卡执行具体操作,此处不再赘述。
应理解,这里交换机会首先接收到中心节点发送的任务1、任务2的执行指令,交换机判断任务1和任务2是否为卸载至交换机的任务,如果不是,则将任务1、任务2的执行指令分别转发至目的IP地址对应的设备处。
应理解,步骤1010也可能为循环执行的步骤,直至该查询系统得到最终的查询结果。
步骤1011,交换机执行该任务,并返回任务的执行结果至执行计划中的下一级节点。
下面对交换机执行任务的流程进行简要说明:
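Taking task 1 as an example, first consider the following configuration: (worker node 20a, fragment 1), (worker node 20b, fragment 2), (worker node 20c, fragment 3), (worker node 20d, fragment 4), (worker node 20e, fragment 5).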
中心节点基于上述配置,分别向工作节点20a发送任务1的第一执行指令,通过第一执行指令指示工作节点20a读取分片1的数据;类似的,向工作节点20b发送任务1的第二执行指令,通过第二执行指令指示工作节点20b读取分片2的数据,依次类推。
以工作节点20a为例,工作节点20a接收到第一执行指令,根据分片1的分片信息向分片1对应的存储节点发送读请求(包含request ID 1),请结合图4理解,该读请求的传输路径为工作节点20a→交换机10→交换机40→交换机30→存储节点。
对应的,存储节点响应该读请求,发送该分片1的反馈数据包(包含request ID 1),该反馈数据包的目的IP地址为工作节点20a。同理,存储节点发出后,该数据包首先被路由到交换机30上,交换机30检测该反馈数据包是否为卸载至交换机30的任务的数据,即任务1的数据,如果是,则交换机30基于该反馈数据包执行任务1,否则,交换机30将该数据包发送至该数据包对应的目的IP地址处。
显然,交换机30接收到存储节点发出的分片1的数据包后,确定该数据包为任务1的数据包,随后,根据任务1的算子信息,基于filter算子的执行规则,使用filter算子对该数据包中的数据执行过滤操作,得到过滤结果也即任务1的输出数据,交换机30将该输出数据封装于输出数据包中,如前所述,该输出数据包中携带任务1的下一个任务的request ID,即request ID3,并根据从存储节点接收到的分片1的数据包中携带的目的IP地址,将该输出数据包发送至该目的IP地址处,即工作节点20a。
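The interaction between switch 30 and any other worker node is similar and is not repeated here. It should be noted that the above way of setting operator information through execution instructions is only an example. In this embodiment, the correspondence between each worker node and the fragment information may instead be sent to switch 30 in the setting instruction of task 1, for example the offload command, and switch 30 then distributes the filtering results of the fragments of table 1, achieving the same effect as the above example.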
步骤1012a,交换机转发数据至对应的工作节点的网卡。
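Step 1012b: the NIC of the worker node receives data from its own worker node. It should be noted that step 1012b is optional and need not be performed, and that there is no strict ordering between step 1012b and step 1012a. The subsequent steps 1013 to 1016 are the same as steps 608 to 611 in FIG. 6 and are not repeated here.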
对于表11所示的执行流程,其中,对于任务1及任务5的执行过程可以参见上文相关描述,此处不再赘述。执行任务6的交换机40为工作节点20a至工作节点20e的下一级节点,工作节点20a至工作节点20e分别将各自的任务5的输出数据(携带request ID 6)发送至交换机40,如果交换机40接收到所述数据,判断需要对所述数据执行任务6,则,根据任务6对应的算子和执行规则对该数据进行处理,得到最终的查询结果。之后,交换机40将得到的查询结果发送至中心节点,中心节点将该查询结果返回给客户端。
通过上述设计,将可卸载任务卸载至网络设备处理,减轻了工作节点的处理器的负担,进一步地,还可以减少在网传输的数据量。
需要说明的是,上述确定可卸载任务的方式仅为举例,本申请实施例对此不作限定。对于适合卸载至网卡或转发设备处理的任务,卸载至工作节点的网卡或者转发设备处理,可以减轻工作节点的处理器的负担,减少在网传输的数据量,但是在一些场景中,将任务卸载至网卡或交换机处理,可能会影响执行效率,所以在本发明另一种实施方式中,对于中心节点中预设的卸载任务,可进一步设置卸载策略,根据所设置卸载策略判断该卸载任务是否可以进行卸载,对于满足所述卸载策略的任务,才设置为可卸载任务。
为便于理解下列任务的卸载策略,首先说明一点:
数据源可以对存储的数据表进行分析,得到数据表的数据分布信息,其中,数据分布信息包括数据表中某一列数据的数据总量n,以及用于指示数据表中某一列数据在不同区间的数据分布信息,例如,某人员登记表中包含年龄列,则该表中年龄列的数据分布可以是年龄段1-10岁的人员数量(记为数量a),年龄段11-20岁的人员数量(记为数量b),年龄段21-30岁的人员数量(记为数量c)。
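The central node may send the data source a request for the data distribution information of the data table to be queried, and the storage node correspondingly sends the requested data distribution information to the central node. Based on that information, the central node can roughly calculate the selectivity of a filter operator, the cardinality of the aggregation column of an aggregation operator, and so on. For example, if the execution rule of a filter operator is persons aged 1 to 30, the selectivity of that filter operator is (a+b+c)/n. Of course, the data distribution is only a rough statistic, so the selectivity or aggregation-column cardinality derived from it is not exact. This is not repeated where the topic comes up below.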
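The estimate (a+b+c)/n can be sketched as a sum over histogram buckets. The function name, the bucket representation, and the counts below (a=120, b=300, c=80, n=1000) are made-up values for illustration only.

```python
def estimate_selectivity(bucket_counts, total_rows, low, high):
    """Sum the counts of buckets that lie fully inside [low, high], divided by n."""
    matched = sum(
        count
        for (bucket_low, bucket_high), count in bucket_counts.items()
        if bucket_low >= low and bucket_high <= high
    )
    return matched / total_rows


# Hypothetical age distribution: a = 120, b = 300, c = 80, plus one older bucket.
age_buckets = {(1, 10): 120, (11, 20): 300, (21, 30): 80, (31, 40): 500}

# Filter rule "age between 1 and 30": (a + b + c) / n
selectivity = estimate_selectivity(age_buckets, 1000, 1, 30)
```

Because only whole buckets are counted, a filter range that cuts through a bucket would be underestimated, which is one reason the result is a rough figure rather than an exact one.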
下面针对具体的任务分别进行举例说明。
1,filter算子;
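When the central node sets the filter operator as an offloadable operator, the offload policy set for the filter operator may be: when the selectivity of the filter operator is low, for example below a first preset value, the task to which the filter operator belongs is an offloadable task. The selectivity may be determined from the ratio of the rows that satisfy the filter condition to all rows of the original data to be queried. Taking task 1 as an example, assume that table 1 has 10000 rows, of which only 10 have the date 2020/01/02; the selectivity is then 10/10000*100%=0.1%. If the first preset value is 1%, the selectivity of task 1 is below the first preset value, so task 1 is determined to be an offloadable task. It should be understood that the first preset value is only an example; all specific values given in this embodiment are examples and are not limiting.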
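The filter offload policy amounts to a single comparison. The sketch below uses the numbers from the example above (10 matching rows out of 10000, first preset value 1%); the function names are hypothetical.

```python
import math


def filter_selectivity(matching_rows, total_rows):
    """Fraction of rows that satisfy the filter condition."""
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return matching_rows / total_rows


def filter_task_is_offloadable(matching_rows, total_rows, first_preset_value=0.01):
    """Offload only when the selectivity is below the first preset value."""
    return filter_selectivity(matching_rows, total_rows) < first_preset_value


task1_selectivity = filter_selectivity(10, 10000)   # 0.1 % of table 1
task1_offloadable = filter_task_is_offloadable(10, 10000)
```

With these inputs the selectivity is 0.1%, which is below 1%, so `task1_offloadable` is `True`.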
如果选择率比较低,中心节点可以将该任务卸载至存储节点的网卡或交换机处理,这样可以减少在网传输的数据量。例如,存储节点和工作节点部署在不同的设备上时,中心节点还可以将任务1卸载至存储节点的网卡处理,存储节点仅需要向工作节点发送10行数据,并不需要将读取到的全部表1发送给工作节点,这样存储节点发送给工作节点的数据量就比较少,即网络中传输的数据量少,从而在减轻CPU的负担的基础上,还可以避免占用大量的网络带宽。或者,任务1还可以卸载至工作节点的网卡上处理,由工作节点的网卡执行过滤,不需要工作节点的CPU来执行该任务,从而减轻了CPU的负担,另外,网卡也不需要向CPU发送大量的数据,也减少了工作节点内的数据交互。
2,aggregation算子;
在中心节点将aggregation算子设置为可卸载算子时,则相应的为aggregation算子设置的卸载策略可以为:如果需要聚合的列的基数比较少,例如不超过第二预设值,则确定该aggregation算子所属的任务为可卸载任务。
举例来说,在计算表1中商品A的销售额时,sale列则为需要聚合的列,如果商品A有10行销售记录,则需要对该10行的sale值进行聚合,则该10行可以理解为需要聚合的列的基数。比如,若第二预设值为100时,则可以确定使用该aggregation算子执行的任务为可卸载任务。
对于aggregation算子类型的可卸载任务,可以卸载至工作节点的网卡,也可以卸载至交换机处理,例如,当aggregation任务需要聚合的列的基数比较少时,可以卸载至交换机来处理,这样即不会占用交换机较多计算资源,也可以减少网络中传输的数据量。
3,distinct算子
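The offload policy of the distinct operator is similar to that of the aggregation operator. Specifically, it is based on the cardinality of the distinct column: for example, if the cardinality of the distinct column does not exceed a third preset value, the task to which the distinct operator belongs is an offloadable task. The two operators differ in that the aggregation operator groups by the group by column and then performs operations such as sum, min, max, count, or avg on the grouped data, whereas the distinct operator only needs to group by the distinct column.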
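The difference between the two operators, and a cardinality check, can be sketched as follows. The sketch assumes that cardinality means the number of distinct values in the column (the usual database sense), and the sample rows and function names are made up for illustration.

```python
from collections import defaultdict


def aggregate_sum(rows, group_col, value_col):
    """aggregation: group by group_col, then sum value_col within each group."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    return dict(totals)


def distinct_values(rows, col):
    """distinct: only the grouping step, no per-group computation."""
    return list(dict.fromkeys(row[col] for row in rows))


def within_cardinality_limit(rows, col, preset_value):
    """Offload check: the column's cardinality must not exceed the preset value."""
    return len({row[col] for row in rows}) <= preset_value


sample_rows = [
    {"product": "A", "sale": 10},
    {"product": "B", "sale": 5},
    {"product": "A", "sale": 7},
]
```

Both operators keep state proportional to the number of groups, which is why a small cardinality keeps the memory needed on a NIC or switch small.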
对于distinct算子类型的可卸载任务,可以卸载至工作节点的网卡,也可以卸载至交换机处理。
4,动态过滤算子
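A dynamic filter operator refers to the filtering of the large table by the small table when two tables are joined by a join operator.
Taking task 3 as an example, the following describes, for the two scenarios shown in FIG. 4 and FIG. 5, how the dynamic filter operator of the join operator of task 3 is executed in each scenario.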
如前所述,中心节点为工作节点分配了读取表1的一个分片,各工作节点执行任务1的流程参见前文的描述,这里不再重复说明。
由于表2需要与表1进行联合,即需要根据表2中的ID列上的值对表1中的ID列上的值进行筛选,查询表1中ID列上的值是否存在于表2中ID列中,因此每个工作节点均需要知道表2的每一行中的ID列的值,那么中心节点可以不将表2划分为不同的分片,让每个工作节点分别读取完整的表2。当然,为了提高查询效率,中心节点也可以将表2划分为不同的分片,让每个工作节点读取表2的一个或多个分片,之后,各工作节点可以将读取到的分片中的ID列发送给中心节点,由中心节点综合各工作节点返回的ID列,得到完整的表2中的ID列。
场景一:存储节点和工作节点融合的场景,即存储节点和工作节点部署在同一物理节点上,如图5所示的场景。
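In this scenario, the dynamic filter operator is better offloaded to the NIC of the worker node, that is, the NIC of the worker node is the execution device of the dynamic filter operator. By way of example, with reference to FIG. 5, the flow of joining table 1 and table 2 with the join operator in this scenario includes the following steps:
1) Each worker node assigned task 2 reads one fragment of table 2, creates a BF on the column used in the ON condition of the fragment's data, and sends the resulting BF column to the central node. The BF column may contain duplicate ID values, or the ID column may be deduplicated with a distinct operator; this is not limited in the embodiments of this application.
Assume that table 2 also has five fragments, named fragment 11 through fragment 15, and that worker nodes 20a through 20e each read one of them. In this example, worker node 20a reads fragment 11, creates a BF on the ON-condition column of the data read from fragment 11, and sends the resulting BF to central node 100; worker node 20b reads fragment 12, creates a BF on the ON-condition column of the data read from fragment 12, and sends the resulting BF to central node 100; worker node 20c reads fragment 13, creates a BF on the ON-condition column of the data read from fragment 13, and sends the resulting BF to central node 100; and so on.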
2)中心节点合并各工作节点发送的BF,合并后得到完整表2的ID列。具体的,中心节点100接收来自工作节点20a至工作节点20e的各表2的分片的BF列,对各分片的BF列进行合并,得到完整表2的BF列,即ID列。
3)中心节点将得到的完整的表2的ID列发送给工作节点的网卡。
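Because the output of task 2 is the input data of task 3, the central node may send the BF column of the complete table 2 to each execution device of task 3. For example, if the execution devices of task 3 are the NICs of worker nodes 20a through 20e, the central node may send the ID column of the complete table 2 to the NIC of worker node 20a, the NIC of worker node 20b, the NIC of worker node 20c, the NIC of worker node 20d, and the NIC of worker node 20e.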
需要说明的是,若该场景下任务3的执行设备为一个或多个交换机,则中心节点100可以将完整表2的BF列发送至该一个或多个交换机。下文为方便描述,以任务3的执行设备为网卡为例进行描述。
4)工作节点的网卡执行任务1,读取表1的一个或多个分片,并对读取的数据进行过滤得到data=20200102的行。
应理解,步骤4)与步骤1)至步骤3)是并行任务,这里的步骤不表示时序关系。
5)工作节点的网卡根据表2的BF列,再次过滤data=20200102的行中ID存在于表2的ID列中的行。
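Steps 1) to 5) can be sketched as below, assuming that BF denotes a Bloom filter. The class `BloomFilter`, its size and hash count, the SHA-256-based hashing, and the sample IDs are all assumptions made for illustration; this application does not specify how the BF is built or merged.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter; a Python int serves as the bit array."""

    def __init__(self, size_bits=1024, hash_count=3):
        self.size = size_bits
        self.hash_count = hash_count
        self.bits = 0

    def _positions(self, value):
        # Derive hash_count bit positions from seeded SHA-256 digests.
        for seed in range(self.hash_count):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def __contains__(self, value):
        return all(self.bits >> pos & 1 for pos in self._positions(value))

    def merge(self, other):
        """Combine per-fragment filters (step 2) by OR-ing their bit arrays."""
        if (self.size, self.hash_count) != (other.size, other.hash_count):
            raise ValueError("filters must share size and hash count")
        merged = BloomFilter(self.size, self.hash_count)
        merged.bits = self.bits | other.bits
        return merged


# Step 1: each worker node builds a BF over the ID column of its table 2 fragment.
fragment_11_bf, fragment_12_bf = BloomFilter(), BloomFilter()
for id_value in (1, 2):
    fragment_11_bf.add(id_value)
fragment_12_bf.add(3)

# Steps 2 and 3: the central node merges the BFs and sends the result to the NICs.
table2_bf = fragment_11_bf.merge(fragment_12_bf)

# Steps 4 and 5: the NIC keeps only date-filtered table 1 rows whose ID may be in table 2.
table1_rows = [{"id": 2}, {"id": 3}, {"id": 7}]
kept_rows = [row for row in table1_rows if row["id"] in table2_bf]
```

A Bloom filter never drops a row whose ID is in table 2, but it may occasionally keep a row whose ID is not, so the join that follows still has to check the IDs exactly.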
场景二:存储节点和工作节点分离部署的场景,即存储节点和工作节点分别部署在不同的物理节点上,如图4所示的场景。
该场景中,由于工作节点要与存储节点交互数据必须要经过两者之间的交换机,因此,该场景下,可以由该交换机对经过的数据进行过滤,减少在网传输的数据量,即交换机为该动态过滤算子的执行设备。
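By way of example, with reference to FIG. 4, assume that task 1 is offloaded to the NIC of the storage node and that the dynamic filtering in task 3 is offloaded to switch 30. In the scenario shown in FIG. 4, the flow of joining table 1 and table 2 with the join operator includes the following steps:
1) The worker node sends the read requests of task 2 and task 1 to the storage node.
2) The storage node sends the task 2 data read by the worker node to switch 30, and filters the task 1 data read by the worker node before sending it to switch 30.
3) Switch 30 creates a BF from the task 2 data returned by the storage nodes to obtain the BF column of the complete table 2, and uses that BF column to filter the already filtered table 1 data again, obtaining the rows with data=20200102 whose ID column value appears in the BF column of table 2.
4) Switch 30 sends those rows, that is, the rows with data=20200102 whose ID appears in the BF column of table 2, to the corresponding worker node.
5) The worker node performs the join on the data filtered by switch 30.
The above steps constitute the dynamic filtering process of the join operator. In the above scenarios, the dynamic filtering in the join operator may be offloaded to the NIC of the worker node, or to a switch on the path of the interaction between the central node and the worker node; the offload policy is determined by the selectivity when filtering the large table.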
在本实施例中,对于能够卸载到多个设备的可卸载任务,中心节点在确定该可卸载任务的执行设备时,还可以按照预设的可卸载任务对应的各设备的优先级来确定执行设备,例如,该执行设备可以为该可卸载任务对应设备中优先级最高的设备。比如,包括filter算子的可卸载任务,对应的设备的优先级排序为:存储节点的网卡、存储节点所在架子的交换机、工作节点所在架子的交换机、核心交换机、工作节点的网卡等。则基于此排序,可以确定存储节点的网卡可以是该任务1的执行设备。
需要说明的是,上述确定执行设备的方式仅为举例,也可以结合优先级和设备的负载状况来确定,例如,上述可卸载任务中虽然存储节点的网卡的优先级最高,但是若存储节点的性能比较低或负载比较高时,也可以不将任务1卸载至存储节点的网卡,并按照优先级顺次判断下一个设备是否可以作为执行设备。另外,将可卸载任务卸载至存储节点的网卡处理的方式可以参见上文工作节点将可卸载任务卸载至工作节点的本地网卡的具体流程介绍,下文不再重复说明。
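Combining priority with load can be sketched as a walk down the priority list. The load figures and the 0.8 threshold below are made-up values, and representing load as a single number between 0 and 1 is an assumption for illustration.

```python
def choose_execution_device(candidates, max_load=0.8):
    """Return the highest-priority device whose load is acceptable, else None.

    candidates: list of (device_name, load) ordered from highest to lowest priority.
    """
    for device_name, load in candidates:
        if load <= max_load:
            return device_name
    return None   # no network device qualifies; the worker node runs the task


# Priority order for a task containing a filter operator, with hypothetical loads.
filter_task_candidates = [
    ("storage node NIC", 0.95),
    ("storage rack switch", 0.40),
    ("worker rack switch", 0.30),
    ("core switch", 0.20),
    ("worker node NIC", 0.10),
]

chosen_device = choose_execution_device(filter_task_candidates)
```

Here the storage node NIC has the highest priority but is too heavily loaded, so the storage rack switch is chosen instead.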
当然,为了减轻网卡的负载,工作节点也可以基于预设的卸载策略选择性地将部分可卸载任务卸载至网卡,例如工作节点上预设的卸载策略可以是基于负载均衡原则制定的等等,这里不做重点说明。
基于与方法实施例同一发明构思,本申请实施例还提供了一种设备,用于执行上述方法实施例中图6或图10中中心节点执行的功能,如图11所示,该设备包括生成单元1101、处理单元1102和通信单元1103。
生成单元1101,用于接收客户端发送的查询请求,并将用户输入的查询请求解析为多个任务。具体实现方式请参见图6中的步骤601、602及图10中的步骤1001及1002的描述,此处不再赘述。
处理单元1102,用于确定多个任务中的可卸载任务,并确定可卸载任务的执行设备。所述网络设备可以是工作节点的网卡或者转发设备,转发设备包括:交换机、路由器。确定可卸载任务及可卸载任务的执行设备的具体方法请参考图6中的步骤603及图10中的步骤1003的描述,此处不再赘述。
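The communication unit 1103 is configured to send the setting instruction of each task, so as to set on each execution device the task it is to execute. For the embodiment of FIG. 6, see the description of step 604; for the embodiment of FIG. 10, see the description of steps 1004 and 1007.
After setting the corresponding task on each execution device by sending the setting instruction of each task, the communication unit 1103 may send the execution instruction of the query request, which triggers the execution devices to execute the tasks that have been set. For details, see the description of step 607a in FIG. 6 and step 1009a in FIG. 10, which is not repeated here.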
基于与方法实施例同一发明构思,本申请实施例还提供了一种网络设备,用于执行上述方法实施例图6或图10中网络设备(交换机、路由器或工作节点的网卡)执行的功能,如图12所示,该设备包括通信单元1201和处理单元1202。
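The communication unit 1201 is configured to receive the setting instruction sent by the central node; the setting instruction is used to set, on the network device, the task that the network device needs to execute when the query request is executed. When the network device is a NIC, see the description of steps 607a and 610 in FIG. 6 and steps 1012a and 1015 in FIG. 10. When the network device is a forwarding device, see the description of steps 1009a and 1012a in FIG. 10, as well as the description of step 1004 in FIG. 10; details are not repeated here.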
处理单元1202,用于根据设置指令设置任务,以及对流经网络设备的数据执行所述任务。当网络设备为网卡时,对于图6中的实施例,具体实现可参考步骤S608、步骤609的相关描述,对于图10的实施例,可参考步骤S1013及1014的相关描述。当网络设备为转发设备时,对于图10的实施例,可参考步骤S1010及1011的相关描述。
尽管结合具体特征及其实施例对本申请进行了描述,显而易见的,在不脱离本申请的精神和范围的情况下,可对其进行各种修改和组合。相应地,本说明书和附图仅仅是所附权利要求所界定的本申请的示例性说明,且视为已覆盖本申请范围内的任意和所有修改、变化、组合或等同物。显然,本领域的技术人员可以对本申请进行各种改动和变型而不脱离本申请的范围。这样,倘若本申请的这些修改和变型属于本申请权利要求及其等同技术的范围之内,则本申请也意图包括这些改动和变型在内。
Claims (27)
- 一种数据查询系统,包括中心节点、工作节点及网络设备,所述中心节点通过所述网络设备连接至所述工作节点,其特征在于:所述中心节点用于根据查询请求生成执行所述查询请求的多个任务,确定所述多个任务中的每个任务的执行设备,所述多个任务的执行设备包括所述工作节点及所述网络设备,发送所述每个任务的设置指令,所述设置指令用于在每个执行设备上设置其所执行的任务;所述工作节点及所述网络设备,用于执行所设置的任务。
- 如权利要求1所述的系统,其特征在于,所述中心节点在用于确定所述多个任务中的每个任务的执行设备时具体用于,查找所述多个任务中的可卸载任务,确定所述可卸载任务的执行设备为所述网络设备,所述可卸载任务为预设的卸载至所述网络设备执行的任务。
- 如权利要求2所述的系统,其特征在于,所述中心节点用于将所述可卸载任务的设置指令发送至所述网络设备;所述网络设备用于根据所述设置指令设置所述可卸载任务。
- 如权利要求1-3任意一项所述的系统,其特征在于,所述网络设备为工作节点的网卡或者转发设备,所述转发设备包括:交换机、路由器。
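- The system according to claim 4, wherein the forwarding device comprises a data port and a control port; the central node is configured to send, through the control port, to the forwarding device the setting instruction of an offloadable task whose execution device is the forwarding device, and to send, through the data port, to the forwarding device the setting instruction of a task whose execution device is the NIC or the worker node; and the forwarding device is configured to set the offloadable task according to the setting instruction received from the control port and to forward the setting instruction received from the data port.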
- 如权利要求4所述的系统,其特征在于,当执行所述可卸载任务的网络设备为所述工作节点的网卡时,所述中心节点用于将所述可卸载任务的设置指令发送至所述工作节点;所述工作节点根据所述设置指令在所述工作节点的网卡上设置所述可卸载任务。
- 如权利要求6所述的系统,其特征在于,所述可卸载任务的设置指令包括可卸载标记;所述工作节点在接收到所述设置指令后,在确定所述设置指令中包括所述可卸载标记时,在所述工作节点的网卡上设置所述可卸载任务。
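- The system according to any one of claims 2 to 7, wherein after receiving a data packet, the network device executes the offloadable task based on the data packet when it determines that the data packet includes the identifier of the offloadable task executed by the network device.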
- 如权利要求3-7任意一项所述的系统,其特征在于,所述中心节点在用于发送所述可卸载任务的设置指令至所述网络设备时具体用于:在确定所述可卸载任务后,在确定所述可卸载任务符合所述可卸载任务对应的卸载策略时,发送所述可卸载任务的设置指令至所述网络设备。
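- A data query method, applied to a central node, wherein the central node is connected to a worker node through a network device, the method comprising: generating, according to a query request, a plurality of tasks for executing the query request; determining an execution device for each of the plurality of tasks, the execution devices of the plurality of tasks comprising the worker node and the network device; and sending a setting instruction of each task, the setting instruction being used to set on each execution device the task it is to execute.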
- 如权利要求10所述的方法,其特征在于,所述确定所述多个任务中的每个任务的执行设备包括:查找所述多个任务中的可卸载任务,确定所述可卸载任务的执行设备为所述网络设备,所述可卸载任务为预设的卸载至所述网络设备执行的任务。
- 如权利要求11所述的方法,其特征在于,所述可卸载任务的执行设备为所述网络设备时,所述发送所述每个任务的设置指令,包括:发送所述可卸载任务的设置指令至所述网络设备。
- 如权利要求11所述的方法,其特征在于,当确定执行所述可卸载任务的网络设备为所述工作节点的网卡时,所述发送每个任务的设置指令包括:将所述可卸载任务的设置指令发送至所述工作节点,指示所述工作节点为所述工作节点的网卡设置所述可卸载任务。
- 如权利要求13所述的方法,其特征在于,所述设置指令中包括可卸载标记,所述可卸载标记用于标记所述设置指令所设置的任务为可卸载任务。
- 如权利要求12-14任意一项所述的方法,其特征在于,所述发送所述可卸载任务的设置指令至所述网络设备包括:确定所述可卸载任务后,在确定所述可卸载任务符合所述可卸载任务对应的卸载策略时,发送所述可卸载任务的设置指令至所述网络设备。
- 一种数据查询方法,应用于网络设备,所述网络设备用于连接中心节点及工作节点,其特征在于,所述方法包括:接收设置指令,所述设置指令用于在所述网络设备上设置执行查询请求时需要所述网络设备执行的任务;根据所述设置指令设置所述任务;对流经所述网络设备的数据执行所述任务。
- 如权利要求16所述的方法,其特征在于,所述网络设备为工作节点的网卡或者转发设备,所述转发设备包括:交换机、路由器。
- 如权利要求17所述的方法,其特征在于,所述转发设备包括数据端口及控制端口;所述方法还包括:根据从所述控制端口接收的设置指令,在所述转发设备上设置所述设置指令所指示的可卸载任务;转发从所述数据端口接收的设置指令。
- 如权利要求16-18任一项所述的方法,其特征在于,所述对流经所述网络设备的数据执行所述任务包括:在接收到数据包之后,当确定所述数据包中包括网络设备所执行的所述可卸载任务的标识时,则基于所述数据包执行所述可卸载任务。
- 一种数据查询界面,其特征在于,包括:查询命令输入区,用于接收用户输入的查询请求;任务显示区,用于显示根据所述查询请求生成的执行所述查询请求的多个任务;执行设备显示区,用于显示每个任务的执行设备,所述执行设备包括工作节点及网络设备。
- 如权利要求20所述的界面,其特征在于;所述查询命令输入区、任务显示区、及执行设备显示区在同一界面显示。
- 如权利要求20所述的界面,其特征在于;所述查询命令输入区、任务显示区、及执行设备显示区在不同界面显示。
- A central node, comprising at least one unit configured to perform the steps of the method according to any one of claims 10 to 16.
- 一种网络设备,其特征在于,该设备包括用于执行权利要求17-20任意一项所述方法的步骤的至少一个单元。
- 一种中心节点,其特征在于,包括存储器及处理器;所述存储器存储有程序指令,所述处理器运行所述程序指令,以执行权利要求10-15任一所述的方法。
- 一种网络设备,其特征在于,该设备包括处理单元和存储单元,所述存储单元中存储有可执行代码,所述处理单元执行所述可执行代码以实现权利要求16-19任意一项所述的方法。
- 一种网络设备,其特征在于,该设备包括:通信接口,用于与中心节点及工作节点进行数据传输;处理单元,用于对所述通信接口接收的数据进行处理,以执行权利要求16-19任意一项所述的方法。
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP21914419.3A EP4246340A4 (en) | 2020-12-29 | 2021-12-28 | SYSTEM, METHOD AND APPARATUS FOR QUERYING DATA USING A NETWORK DEVICE |
US18/342,547 US20230342399A1 (en) | 2020-12-29 | 2023-06-27 | System, method, and apparatus for data query using network device |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011588814.3A CN114756730A (zh) | 2020-12-29 | 2020-12-29 | 一种使用网络设备进行数据查询的系统、方法、及装置 |
CN202011588814.3 | 2020-12-29 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/342,547 Continuation US20230342399A1 (en) | 2020-12-29 | 2023-06-27 | System, method, and apparatus for data query using network device |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022143685A1 true WO2022143685A1 (zh) | 2022-07-07 |
Family
ID=82259084
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/142146 WO2022143685A1 (zh) | 2020-12-29 | 2021-12-28 | 一种使用网络设备进行数据查询的系统、方法、及装置 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20230342399A1 (zh) |
EP (1) | EP4246340A4 (zh) |
CN (1) | CN114756730A (zh) |
WO (1) | WO2022143685A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115665073A (zh) * | 2022-12-06 | 2023-01-31 | 江苏为是科技有限公司 | 报文处理方法及装置 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116149856A (zh) * | 2023-01-09 | 2023-05-23 | 中科驭数(北京)科技有限公司 | 算子计算方法、装置、设备及介质 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102227718A (zh) * | 2008-11-26 | 2011-10-26 | 微软公司 | 用于远程桌面协议的硬件加速 |
CN103310011A (zh) * | 2013-07-02 | 2013-09-18 | 曙光信息产业(北京)有限公司 | 集群数据库系统环境下的数据查询解析方法 |
US20160285780A1 (en) * | 2013-03-18 | 2016-09-29 | Nederlandse Organisatie Voor Toegepast- Natuurwetenschappelijk Onderzoek Tno | Allocating Resources Between Network Nodes for Providing a Network Node Function |
CN108958744A (zh) * | 2018-06-21 | 2018-12-07 | 北京京东金融科技控股有限公司 | 大数据分布式集群的部署方法、装置、介质及电子设备 |
US20190354521A1 (en) * | 2018-05-18 | 2019-11-21 | Vitesse Data, Inc. | Concurrent Data Processing in a Relational Database Management System Using On-Board and Off-Board Processors |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9659057B2 (en) * | 2013-04-15 | 2017-05-23 | Vmware, Inc. | Fault tolerant distributed query processing using query operator motion |
US11281706B2 (en) * | 2016-09-26 | 2022-03-22 | Splunk Inc. | Multi-layer partition allocation for query execution |
- 2020-12-29: CN, CN202011588814.3A, patent/CN114756730A/zh, active (Pending)
- 2021-12-28: WO, PCT/CN2021/142146, patent/WO2022143685A1/zh, unknown
- 2021-12-28: EP, EP21914419.3A, patent/EP4246340A4/en, active (Pending)
- 2023-06-27: US, US18/342,547, patent/US20230342399A1/en, active (Pending)
Non-Patent Citations (1)
Title |
---|
See also references of EP4246340A4 * |
Also Published As
Publication number | Publication date |
---|---|
US20230342399A1 (en) | 2023-10-26 |
EP4246340A4 (en) | 2024-04-10 |
CN114756730A (zh) | 2022-07-15 |
EP4246340A1 (en) | 2023-09-20 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 21914419 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2021914419 Country of ref document: EP Effective date: 20230614 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |