CN111625696A - Distributed scheduling method, computing node and system of multi-source data analysis engine - Google Patents


Info

Publication number
CN111625696A
CN111625696A
Authority
CN
China
Prior art keywords
query task
sub
result set
intermediate result
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010734766.8A
Other languages
Chinese (zh)
Other versions
CN111625696B (en)
Inventor
李一哲
程度
张福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shengxin Network Technology Co ltd
Original Assignee
Beijing Shengxin Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shengxin Network Technology Co ltd filed Critical Beijing Shengxin Network Technology Co ltd
Priority to CN202010734766.8A priority Critical patent/CN111625696B/en
Publication of CN111625696A publication Critical patent/CN111625696A/en
Application granted granted Critical
Publication of CN111625696B publication Critical patent/CN111625696B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • G06F16/90335Query processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a distributed scheduling method for a multi-source data analysis engine, a computing node device, a readable storage medium, a computing device, and a distributed scheduling system for the multi-source data analysis engine, which save a large amount of communication cost and improve the engine's distributed processing efficiency. The method comprises the following steps: the computing node of the multi-source data analysis engine with the highest scheduling index receives a query task; the computing node determines a sub-query task of the query task that involves an intermediate result set, and determines the storage node of that intermediate result set; the computing node calculates a first time cost of migrating the intermediate result set to the local node and a second time cost of having the storage node execute the sub-query task; and, according to the comparison of the first and second time costs, the computing node selects whether it migrates the intermediate result set locally and executes the sub-query task, or the storage node executes the sub-query task.

Description

Distributed scheduling method, computing node and system of multi-source data analysis engine
Technical Field
The invention relates to the technical field of databases, in particular to a distributed scheduling method of a multi-source data analysis engine, a computing node device, a readable storage medium, computing equipment and a distributed scheduling system of the multi-source data analysis engine.
Background
A multi-source data analysis engine provides a query language for interfacing with various data sources such as Elasticsearch, MongoDB, MySQL, and the like. Taking the Qingteng Structured Query Language (QSL) engine as an example, it also provides pipeline syntax for filtering query results, enabling continuous analysis of the data. The engine's nodes store the intermediate result sets produced during computation in a local SQLite3 database. The engine has no distributed capability and can only invoke the computing resources of a single node, which greatly limits improvement of its computing power and its concurrent query capability.
To improve the efficiency of distributed data processing, Hadoop introduced YARN, a second-generation job scheduling and cluster resource management system, which adopts a layered architecture comprising the ResourceManager, ApplicationMaster, and NodeManager.
The ResourceManager is the core of the YARN hierarchy: it controls the entire cluster and manages the allocation of applications to node computing resources. The ApplicationMaster manages each instance of an application running within YARN; it is responsible for negotiating resources from the ResourceManager and, through the NodeManager, monitoring container execution and resource usage (CPU, memory, and so on). The NodeManager provides per-node services, from overseeing the lifecycle management of containers to monitoring resources and tracking node health.
For data management, YARN continues to use the HDFS (Hadoop Distributed File System) layer from the first-generation system: the NameNode stores metadata, and DataNodes manage the data distributed across the nodes of the cluster.
YARN solves the problems of the first-generation Hadoop job scheduling system, such as poor scalability, a single point of failure, and being limited to the MR (MapReduce) computing framework. For data management, however, it still uses the same HDFS as the first generation. In massive-data scenarios above the PB (Petabyte) level this design plays an irreplaceable role as the only current solution, but at the TB (Terabyte) level there remains room for optimization.
At TB-level data volumes, the intermediate result set of a computation can be stored entirely on a single node. If the block-based distributed storage scheme of HDFS is used in this case, a large amount of communication with DataNodes occurs during computation. In TB-level computation the communication cost is often greater than the computation itself, so TB-level computation has room for optimization.
Disclosure of Invention
To this end, the present invention provides a distributed scheduling method of a multi-source data analysis engine, a computing node device, a readable storage medium, a computing device, and a distributed scheduling system of a multi-source data analysis engine, in an effort to solve or at least alleviate at least one of the problems presented above.
According to one aspect of the invention, a distributed scheduling method of a multi-source data analysis engine is provided, which comprises the following steps:
receiving a query task by a computing node with the highest scheduling index of the multi-source data analysis engine;
the computing node determining a sub-query task of the query task that includes an intermediate result set, and determining a storage node of the intermediate result set;
the compute node computing a first time cost to migrate the intermediate result set to local and computing a second time cost to execute the sub-query task by the storage node;
and the computing node selects whether the computing node migrates the intermediate result set to the local and executes the sub-query task or the storage node executes the sub-query task according to the comparison result of the first time overhead and the second time overhead.
Optionally, determining a scheduling index of the multi-source data analysis engine includes:
obtaining the disk margin, the CPU margin, the memory margin and the query process margin of each computing node of the multi-source data analysis engine;
and determining the scheduling index of each computing node according to the disk margin, the CPU margin, the memory margin and the query process margin.
Optionally, the computing node calculates a first time overhead of migrating the intermediate result set to a local, including:
the computing node adopts an exponential weighted average method to recursively compute a first time cost for transferring the intermediate result set to the local; wherein a default value for the first time overhead is preconfigured.
Optionally, the determining, by the computing node, whether to migrate the intermediate result set to the local and execute the sub-query task or to execute the sub-query task by the storage node according to a comparison result between the first time cost and the second time cost includes:
the computing node calculating a ratio of the first time cost and the second time cost;
when the ratio is larger than a preset first threshold, calculating a resource margin of the storage node, comparing the resource margin with a preset second threshold, if the resource margin is larger than the preset second threshold, indicating the storage node to execute the sub-query task, and if the resource margin is not larger than the preset second threshold, permanently migrating the intermediate result set to the local and executing the sub-query task;
and when the ratio is not greater than a preset first threshold value, migrating the intermediate result set to the local and executing the sub-query task.
Optionally, comparing the resource margin with a preset second threshold includes:
comparing the disk margin of the storage node with a preset second threshold; or,
comparing the CPU margin of the storage node with a preset second threshold; or,
comparing the memory margin of the storage node with a preset second threshold; or,
comparing the query process margin of the storage node with a preset second threshold.
According to another aspect of the present invention, there is provided a computing node apparatus of a multi-source data analysis engine, including:
the query task receiving unit is used for receiving a query task;
the query task analysis unit is used for determining a sub-query task comprising an intermediate result set of the query task and determining a storage node of the intermediate result set;
the cost calculation unit is used for calculating a first time cost for migrating the intermediate result set to the local and calculating a second time cost for executing the sub-query task by the storage node;
and the sub-query task allocation unit is used for selecting whether the computing node migrates the intermediate result set to the local and executes the sub-query task or the storage node executes the sub-query task according to the comparison result of the first time overhead and the second time overhead.
According to yet another aspect of the present invention, there is provided a readable storage medium having executable instructions thereon that, when executed, cause a computer to perform the distributed scheduling method of the multi-source data analysis engine described above.
According to yet another aspect of the present invention, there is provided a computing device comprising: one or more processors; a memory; and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors to perform the distributed scheduling method of the multi-source data analytics engine described above.
According to another aspect of the present invention, a distributed scheduling system of a multi-source data analysis engine is provided, which includes a scheduling server and a plurality of computing nodes;
the scheduling server is used for determining the scheduling indexes of the plurality of computing nodes and distributing the query task to the computing node with the highest scheduling index;
the computing node with the highest scheduling index in the plurality of computing nodes is used for receiving the query task; determining a sub-query task of the query task comprising an intermediate result set, and determining a storage node of the intermediate result set; calculating a first time cost of migrating the intermediate result set to a local site, and calculating a second time cost of executing the sub-query task by the storage node; and according to the comparison result of the first time cost and the second time cost, selecting whether the computing node migrates the intermediate result set to the local and executes the sub-query task or the storage node executes the sub-query task.
Optionally, when the computing node with the highest scheduling index is configured to determine, according to a comparison result between the first time cost and the second time cost, whether to migrate the intermediate result set to the local and execute the sub-query task by the computing node or to execute the sub-query task by the storage node, the computing node with the highest scheduling index is specifically configured to:
calculating a ratio of the first time overhead and the second time overhead;
when the ratio is larger than a preset first threshold, calculating a resource margin of the storage node, comparing the resource margin with a preset second threshold, if the resource margin is larger than the preset second threshold, indicating the storage node to execute the sub-query task, and if the resource margin is not larger than the preset second threshold, permanently migrating the intermediate result set to the local and executing the sub-query task;
and when the ratio is not greater than a preset first threshold value, migrating the intermediate result set to the local and executing the sub-query task.
According to the embodiment of the invention, the computing node of the multi-source data analysis engine with the highest scheduling index receives a query task; the computing node determines a sub-query task of the query task that involves an intermediate result set and determines the storage node of that intermediate result set; the computing node calculates a first time cost of migrating the intermediate result set locally and a second time cost of having the storage node execute the sub-query task; and, according to the comparison of the two costs, the computing node selects whether it migrates the intermediate result set locally and executes the sub-query task, or the storage node executes the sub-query task. The embodiment thus solves QSL's single point of failure and its inability to scale out. Moreover, in this distributed solution the intermediate result set is stored locally on the computing node that produced it (also called the storage node of the intermediate result set). When subsequent computation occurs, data resources are coordinated according to communication overhead and computation overhead: either the storage node performs the computation directly, or the currently scheduled computing node completes it. This saves a large amount of communication cost and improves the efficiency of the distributed system.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a block diagram of an exemplary computing device.
FIG. 2 is a flowchart illustrating a distributed scheduling method of a multi-source data analysis engine according to an embodiment of the present invention.
FIG. 3 is a block diagram of a distributed scheduling system of a multi-source data analysis engine according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a distributed scheduling method of a multi-source data analysis engine according to an embodiment of the present invention.
FIG. 5 is a block diagram of a compute node device of a multi-source data analytics engine according to an embodiment of the present invention.
FIG. 6 is a schematic structural diagram of a distributed scheduling system of a multi-source data analysis engine according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
FIG. 1 is a block diagram of an example computing device 100 arranged to implement a distributed scheduling method of a multi-source data analytics engine in accordance with the present invention. In a basic configuration 102, computing device 100 typically includes system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processor 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level one cache 110 and a level two cache 112, a processor core 114, and registers 116. The example processor core 114 may include an arithmetic logic unit (ALU), a floating point unit (FPU), a digital signal processing core (DSP core), or any combination thereof. An example memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, system memory 106 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. System memory 106 may include an operating system 120, one or more programs 122, and program data 124. In some implementations, the programs 122 may be arranged to be executed by the one or more processors 104 on the operating system using the program data 124.
Computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via the bus/interface controller 130. The example output device 142 includes a graphics processing unit 148 and an audio processing unit 150. They may be configured to facilitate communication with various external devices, such as a display terminal or speakers, via one or more a/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 158. An example communication device 146 may include a network controller 160, which may be arranged to facilitate communications with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or direct-wired connection, and various wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
Computing device 100 may be implemented as part of a small-form-factor portable (or mobile) electronic device such as a cellular telephone, a Personal Digital Assistant (PDA), a personal media player device, a wireless web-watch device, a personal headset device, an application-specific device, or a hybrid device that includes any of the above functions. Computing device 100 may also be implemented as a personal computer (including both desktop and notebook configurations), a server, or a cluster of multiple computers.
Among other things, one or more programs 122 of computing device 100 include instructions for performing a distributed scheduling method of a multi-source data analytics engine in accordance with the present invention.
FIG. 2 illustrates a flow diagram of a distributed scheduling method 200 for a multi-source data analysis engine, according to one embodiment of the present invention, the distributed scheduling method 200 for a multi-source data analysis engine beginning at step S210.
In step S210, the computing node with the highest scheduling index of the multi-source data analysis engine receives the query task.
The scheduling index of a computing node of the multi-source data analysis engine is determined as follows: obtain the disk margin, CPU margin, memory margin, and query process margin of each computing node of the engine, and determine each node's scheduling index from these margins. Specifically, because the disk margin, CPU margin, memory margin, and query process margin jointly affect the computing capacity available for scheduling, the scheduling index is obtained by taking the product of the four margins and normalizing the result. A higher scheduling index indicates more idle resources on the node; a lower index indicates fewer available resources. Computing tasks are preferentially distributed to nodes with high scheduling indexes.
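The index computation and node selection described above can be sketched as follows, assuming each margin has already been normalized to [0, 1] (function and variable names are illustrative, not from the patent):

```python
def scheduling_index(disk, cpu, memory, process):
    # The scheduling index is the product of the four normalized resource
    # margins; each argument is the fraction of that resource still free.
    return disk * cpu * memory * process

def pick_node(nodes):
    # Return the id of the node with the highest scheduling index.
    # `nodes` maps node id -> (disk, cpu, memory, process) margins.
    return max(nodes, key=lambda n: scheduling_index(*nodes[n]))

# Example: node "b" has the most idle resources overall, so it is chosen.
nodes = {
    "a": (0.5, 0.5, 0.5, 0.5),
    "b": (0.9, 0.8, 0.9, 0.9),
}
```

Because each factor lies in [0, 1], the product also lies in [0, 1] and drops sharply when any single resource is nearly exhausted, which matches the intent of scheduling to the most idle node.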
Subsequently, in step S220, the computing node determines a sub-query task of the query task that includes an intermediate result set, and determines a storage node of the intermediate result set.
During a query by the multi-source data analysis engine, each operation or calculation step on the data can be regarded as a sub-query of that query. The finer the granularity of sub-query resource control and optimization, the higher the query performance and resource utilization efficiency. The multi-source data analysis engine generates intermediate result sets while processing sub-query tasks: the computation is not performed entirely on in-memory objects and variables; instead, part of the data is temporarily stored on the computing node's local disk. This on-disk data can be regarded as a stage result of the query process and is called an intermediate result set. The storage node of an intermediate result set is itself a computing node.
Subsequently, in step S230, the compute node computes a first time cost of migrating the intermediate result set to the local and computes a second time cost of executing the sub-query task by the storage node.
The first time cost, migrating the intermediate result set to the local node, is obtained by dividing the data size of the intermediate result set by the network transmission speed. The second time cost, having the storage node execute the sub-query task, is obtained by parsing the query statement and recursively estimating the cost with an exponentially weighted moving average.
Subsequently, in step S240, the computing node selects whether to migrate the intermediate result set to the local and execute the sub-query task by the computing node or to execute the sub-query task by the storage node according to the comparison result between the first time cost and the second time cost.
Specifically, step S240 includes: the calculation node calculates the ratio of the first time cost to the second time cost; when the ratio is larger than a preset first threshold, calculating the resource margin of the storage node, comparing the resource margin with a preset second threshold, if the resource margin is larger than the preset second threshold, indicating the storage node to execute a sub-query task, and if the resource margin is not larger than the preset second threshold, permanently migrating the intermediate result set to the local and executing the sub-query task; and when the ratio is not greater than a preset first threshold value, migrating the intermediate result set to the local and executing the sub-query task.
Further, comparing the resource margin with a preset second threshold comprises: comparing the disk margin of the storage node with a preset second threshold; or comparing the CPU margin of the storage node with a preset second threshold; or comparing the memory margin of the storage node with a preset second threshold; or comparing the query process margin of the storage node with a preset second threshold.
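The selection logic of step S240 can be sketched as follows, using the first threshold of 0.5 and the resource-margin threshold of 0.1 from the embodiment described later (function and return-value names are illustrative, not from the patent):

```python
def schedule_subquery(first_cost, second_cost, resource_margin,
                      first_threshold=0.5, second_threshold=0.1):
    # first_cost:  time to migrate the intermediate result set here.
    # second_cost: time for the storage node to execute the sub-query.
    # resource_margin: normalized margin of the storage node's resource
    # being checked (disk, CPU, memory, or query process).
    ratio = first_cost / second_cost
    if ratio <= first_threshold:
        # Communication is cheap relative to computation: pull the data.
        return "migrate_and_execute_locally"
    if resource_margin > second_threshold:
        # Storage node has spare capacity: push the computation to it.
        return "storage_node_executes"
    # Storage node is overloaded: permanently move the result set here.
    return "permanently_migrate_and_execute_locally"
```

This mirrors the three outcomes in the text: migrate when the cost ratio is at or below the first threshold, delegate to the storage node when it has resources to spare, and otherwise permanently relocate the intermediate result set.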
Specific embodiments of the present invention are presented below in conjunction with a multisource data analysis engine QSL.
As shown in fig. 3, the distributed scheduling system provided by the present invention includes a resource monitoring module, a computation scheduling module, a grammar analysis module, and an information synchronization module.
First, the resource monitoring module.
A resource monitoring module is added to each QSL engine node. It quantitatively evaluates each node's hardware resources (memory, hard disk, bandwidth) and logical resources (number of query tasks), normalizes the results, and reports them uniformly to the computation scheduling module via intra-cluster Transmission Control Protocol (TCP) communication. The data reported by the resource monitoring module comprises:
NormDisk = Norm(disk);
NormCPU = Norm(cpu);
NormMemory = Norm(memory);
NormProcess = Norm(process)。
Here disk denotes the disk margin, cpu the CPU margin, memory the memory margin, and process the query process margin; Norm denotes the normalization of the node's resource information, representing the proportion of that resource remaining on the current node.
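The Norm(...) values can be computed as simple remaining-fraction normalizations; a minimal sketch (the clamping and the exact normalization formula are assumptions, since the patent does not specify them):

```python
def norm(remaining, capacity):
    # Fraction of the resource still free on the node, clamped to [0, 1].
    # `remaining` and `capacity` use the same unit (bytes, cores, slots).
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return max(0.0, min(1.0, remaining / capacity))

# A node with 40 GB free out of 100 GB would report NormDisk = 0.4.
```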
Second, the computation scheduling module.
The computation scheduling module computes scheduling indexes (which reflect the overall level of each node's computing resources and serve as the main basis of task scheduling) and issues query tasks to computing nodes via encrypted HyperText Transfer Protocol (HTTP) communication.
The scheduling index α evaluates the overall level of each node's remaining resources, and is calculated as follows:
α= NormDisk * NormCPU * NormMemory * NormProcess。
Here NormDisk denotes the remaining proportion of disk, NormCPU the remaining proportion of CPU, NormMemory the remaining proportion of memory, and NormProcess the remaining proportion of query processes. When the cluster receives a new query task, the scheduling module sorts the nodes by scheduling index α and distributes the query task to the node ranked first (highest α) for execution.
Third, the grammar analysis module.
The grammar analysis module of each node performs secondary scheduling for sub-queries of the query task that involve an intermediate result set. By calculating the query's communication-to-computation ratio, this scheduling decides whether to transmit the intermediate result set to the current node for computation, or to migrate the computation to the node where the intermediate result set resides.
The communication-to-computation ratio β evaluates the proportion between the communication cost and the computation cost of an intermediate-result-set sub-query task, and is calculated as follows:
β= Communication / Computation。
If β < 0.5, the intermediate result set is transmitted to the current node for computation; otherwise, the computation is migrated to the node where the intermediate result set resides.
Here Communication denotes the time overhead required for communication, in seconds. It is obtained by fetching the size (Size) of the intermediate result set from the information synchronization module and dividing by the cluster's network transmission speed (Speed):
Communication = Size(data) / Speed。
computation represents the time overhead required for Computation in seconds. The system is internally provided with a default value of the Computation of each grammar, namely an initial value of the Computation when 100MB is taken as a data reference, and the Computation is recursively updated by using exponential moving weighted average during actual Computation, wherein the formula is as follows:
Computation = ComputationVt= 0.5 * ComputationVt-1+ 0.5 * ComputationTt
wherein computationTtRepresenting the actual cost of the last calculation, computationVt-1For the last calculated exponentially moving weighted average cost, computationVtThe current exponentially moving weighted average cost is used as the latest value for Computation during the next communication Computation.
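This recursion is a standard exponentially weighted moving average with weight 0.5; a minimal sketch (the function name is illustrative):

```python
def update_computation(prev_avg, actual_cost, weight=0.5):
    # ComputationV(t) = weight * ComputationV(t-1)
    #                   + (1 - weight) * ComputationT(t)
    # prev_avg starts from the preconfigured per-syntax default value.
    return weight * prev_avg + (1.0 - weight) * actual_cost
```

Seeding `prev_avg` with the built-in default (the 100 MB reference value) gives a usable estimate before any real measurements exist, and each observed cost then pulls the estimate halfway toward the latest measurement.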
In addition, if the scheduling decision is to migrate the computation to the node where the intermediate result set resides, but the normalized value of any resource on that target node is below 0.1, the migration plan is cancelled and the intermediate result set is instead permanently copied to the current node for computation. The permanent copy consists of three operations: copying the intermediate result set to the current node, updating the intermediate result set's node information in the information synchronization module, and deleting the intermediate result set's data from the original node.
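The three-step permanent copy can be sketched with minimal in-memory stand-ins for the information synchronization module and per-node storage (all class and method names here are illustrative, not from the patent):

```python
class SyncModule:
    # Stand-in for the information synchronization module: tracks which
    # node currently holds each intermediate result set.
    def __init__(self):
        self.node_of = {}

    def storage_node(self, rid):
        return self.node_of[rid]

    def set_storage_node(self, rid, node):
        self.node_of[rid] = node


class NodeStorage:
    # Stand-in for per-node result-set storage, keyed by (node, rid).
    def __init__(self):
        self.data = {}

    def copy(self, rid, src, dst):
        self.data[(dst, rid)] = self.data[(src, rid)]

    def delete(self, rid, node):
        del self.data[(node, rid)]


def permanent_copy(rid, current_node, sync, storage):
    # 1. Copy the intermediate result set to the current node.
    # 2. Update its node information in the information sync module.
    # 3. Delete its data from the original node.
    origin = sync.storage_node(rid)
    storage.copy(rid, src=origin, dst=current_node)
    sync.set_storage_node(rid, current_node)
    storage.delete(rid, node=origin)
```

After `permanent_copy` runs, every node consulting the synchronization module sees the new location, so subsequent sub-queries route to the current node instead of the overloaded original one.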
The specific flow of scheduling is shown in fig. 4.
Fourth, the information synchronization module.
The information synchronization module of each node ensures that, during secondary scheduling, the node's state and the information of the intermediate result set are synchronized to the other nodes in a timely manner.
The information of the intermediate result set includes: the ID, size, storage node, creation time, and life cycle of the intermediate result set.
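A hypothetical sketch of the intermediate result set metadata that the information synchronization module would propagate; the field names mirror the list above but are otherwise assumptions:

```python
from dataclasses import dataclass, asdict

@dataclass
class IntermediateResultSetInfo:
    result_id: str      # ID of the intermediate result set
    size_mb: float      # size, also used for the Communication estimate
    storage_node: str   # node currently holding the result set
    created_at: float   # creation time (epoch seconds)
    ttl_seconds: int    # life cycle, after which the set may be reclaimed

info = IntermediateResultSetInfo("rs-001", 42.0, "node-3", 1690000000.0, 3600)
payload = asdict(info)  # e.g. the record broadcast to the other nodes
```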
QSL uses SQLite3 as its storage database. SQLite3 behaves like a conventional database, so complex query functionality in the sub-query process can be expressed with standard SQL; at the same time it behaves like a file: it has no external dependencies and is stored as a single file on disk, which facilitates subsequent data migration between nodes. The intermediate result set of QSL is limited to millions of rows by default, so the data volume is mostly on the order of MB and, even if this configuration is modified, theoretically does not exceed the GB level. In the data calculation process, by optimizing the storage of the intermediate result set and fine-grained computation scheduling, the invention significantly reduces the communication cost for data at and below the TB level and improves resource utilization.
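A small sketch of this single-file property using Python's standard `sqlite3` module; the table, columns, and file names are illustrative assumptions, and "migration" reduces to copying one file:

```python
import os
import shutil
import sqlite3
import tempfile

# Store an intermediate result set as a single SQLite file on disk.
path = os.path.join(tempfile.mkdtemp(), "intermediate_result.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE result (host TEXT, hits INTEGER)")
conn.executemany("INSERT INTO result VALUES (?, ?)",
                 [("10.0.0.1", 3), ("10.0.0.2", 7)])
conn.commit()

# Complex SQL is available directly on the intermediate result set.
total = conn.execute("SELECT SUM(hits) FROM result").fetchone()[0]
conn.close()

# Migrating the set between nodes is just a file copy, since the
# database has no external dependencies.
dest = path + ".migrated"
shutil.copyfile(path, dest)
```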
Referring to fig. 5, the present invention provides a compute node device of a multi-source data analysis engine, comprising:
a query task receiving unit 510, configured to receive a query task;
a query task parsing unit 520, configured to determine a sub-query task of the query task that includes an intermediate result set, and determine a storage node of the intermediate result set;
an overhead calculation unit 530, configured to calculate a first time overhead for migrating the intermediate result set to a local, and calculate a second time overhead for executing the sub-query task by the storage node;
and a sub-query task allocation unit 540, configured to select, according to a comparison result between the first time cost and the second time cost, whether the computing node migrates the intermediate result set to the local and executes the sub-query task, or whether the storage node executes the sub-query task.
Optionally, the sub-query task allocation unit 540 is specifically configured to: calculate a ratio of the first time overhead and the second time overhead; when the ratio is larger than a preset first threshold, calculate a resource margin of the storage node and compare the resource margin with a preset second threshold, and if the resource margin is larger than the preset second threshold, instruct the storage node to execute the sub-query task, or if the resource margin is not larger than the preset second threshold, permanently migrate the intermediate result set to the local and execute the sub-query task; and when the ratio is not greater than the preset first threshold, migrate the intermediate result set to the local and execute the sub-query task.
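The decision logic of this allocation unit can be sketched as below; the threshold defaults and return labels are assumptions for illustration, not values fixed by the patent:

```python
def allocate(first_cost: float, second_cost: float, resource_margin: float,
             first_threshold: float = 0.5, second_threshold: float = 0.1) -> str:
    """first_cost: time to migrate the intermediate result set locally;
    second_cost: time for the storage node to execute the sub-query;
    resource_margin: normalized resource margin of the storage node."""
    ratio = first_cost / second_cost
    if ratio > first_threshold:
        if resource_margin > second_threshold:
            return "storage_node_executes"
        # storage node too loaded: permanently migrate the set locally
        return "permanently_migrate_and_execute_locally"
    return "migrate_and_execute_locally"
```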
Optionally, the sub-query task allocation unit 540 is configured to compare the resource margin with a preset second threshold by: comparing the disk margin of the storage node with a preset second threshold; or comparing the CPU margin of the storage node with a preset second threshold; or comparing the memory margin of the storage node with a preset second threshold; or comparing the query process margin of the storage node with a preset second threshold.
Optionally, the cost calculating unit 530 is configured to, when calculating a first time cost for migrating the intermediate result set to the local, specifically:
recursively calculating a first time cost for transferring the intermediate result set to the local by adopting an exponential weighted average method; wherein a default value for the first time overhead is preconfigured.
Referring to fig. 6, the present invention further provides a distributed scheduling system of a multi-source data analysis engine, including a scheduling server 610 and a plurality of computing nodes 620;
the scheduling server 610 is configured to determine the scheduling indexes of the plurality of computing nodes, and allocate a query task to the computing node with the highest scheduling index;
the computing node with the highest scheduling index in the plurality of computing nodes 620 is used for receiving the query task; determining a sub-query task of the query task comprising an intermediate result set, and determining a storage node of the intermediate result set; calculating a first time cost of migrating the intermediate result set to a local site, and calculating a second time cost of executing the sub-query task by the storage node; and according to the comparison result of the first time cost and the second time cost, selecting whether the computing node migrates the intermediate result set to the local and executes the sub-query task or the storage node executes the sub-query task.
Optionally, when the computing node with the highest scheduling index is configured to determine, according to a comparison result between the first time cost and the second time cost, whether to migrate the intermediate result set to the local and execute the sub-query task by the computing node or to execute the sub-query task by the storage node, the computing node with the highest scheduling index is specifically configured to: calculating a ratio of the first time overhead and the second time overhead; when the ratio is larger than a preset first threshold, calculating a resource margin of the storage node, comparing the resource margin with a preset second threshold, if the resource margin is larger than the preset second threshold, indicating the storage node to execute the sub-query task, and if the resource margin is not larger than the preset second threshold, permanently migrating the intermediate result set to the local and executing the sub-query task; and when the ratio is not greater than a preset first threshold value, migrating the intermediate result set to the local and executing the sub-query task.
It should be understood that the various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. Wherein the memory is configured to store program code; the processor is configured to perform the various methods of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, computer readable media may comprise computer storage media and communication media. Computer storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embody computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of computer readable media.
It should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, this method of disclosure should not be construed as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules or units or components of the apparatus in the examples disclosed herein may be arranged in an apparatus as described in this embodiment, or alternatively may be located in one or more apparatuses different from the apparatus in this example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include some features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
Furthermore, some of the described embodiments are described herein as a method or combination of method elements that can be performed by a processor of a computer system or by other means of performing the described functions. A processor having the necessary instructions for carrying out the method or method elements thus forms a means for carrying out the method or method elements. Further, the elements of the apparatus embodiments described herein are examples of the following apparatus: the apparatus is used to implement the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified the use of the ordinal adjectives "first", "second", "third", etc., to describe a common object, merely indicate that different instances of like objects are being referred to, and are not intended to imply that the objects so described must be in a given sequence, either temporally, spatially, in ranking, or in any other manner.
While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having benefit of this description, will appreciate that other embodiments can be devised which do not depart from the scope of the invention as described herein. Furthermore, it should be noted that the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. The present invention is to be considered as illustrative and not restrictive in character, with the scope of the invention being indicated by the appended claims.

Claims (10)

1. A distributed scheduling method of a multi-source data analysis engine is characterized by comprising the following steps:
receiving a query task by a computing node with the highest scheduling index of the multi-source data analysis engine;
the computing node determining a sub-query task of the query task that includes an intermediate result set, and determining a storage node of the intermediate result set;
the compute node computing a first time cost to migrate the intermediate result set to local and computing a second time cost to execute the sub-query task by the storage node;
and the computing node selects whether the computing node migrates the intermediate result set to the local and executes the sub-query task or the storage node executes the sub-query task according to the comparison result of the first time overhead and the second time overhead.
2. The method of claim 1, wherein the computing node determining whether to migrate the intermediate result set locally and execute the sub-query task or to execute the sub-query task by the storage node based on a comparison of the first time cost and the second time cost comprises:
the computing node calculating a ratio of the first time cost and the second time cost;
when the ratio is larger than a preset first threshold, calculating a resource margin of the storage node, comparing the resource margin with a preset second threshold, if the resource margin is larger than the preset second threshold, indicating the storage node to execute the sub-query task, and if the resource margin is not larger than the preset second threshold, permanently migrating the intermediate result set to the local and executing the sub-query task;
and when the ratio is not greater than a preset first threshold value, migrating the intermediate result set to the local and executing the sub-query task.
3. The method of claim 2, wherein comparing the resource margin to a preset second threshold comprises:
comparing the disk margin of the storage node with a preset second threshold; or,
comparing the CPU margin of the storage node with a preset second threshold; or,
comparing the memory margin of the storage node with a preset second threshold; or,
comparing the query process margin of the storage node with a preset second threshold.
4. The method of claim 1, wherein determining a scheduling index for a multi-source data analytics engine comprises:
obtaining the disk margin, the CPU margin, the memory margin and the query process margin of each computing node of the multi-source data analysis engine;
and determining the scheduling index of each computing node according to the disk margin, the CPU margin, the memory margin and the query process margin.
5. The method of claim 1, wherein the compute node computes a first time cost of migrating the intermediate result set to local, comprising:
the computing node adopts an exponential weighted average method to recursively compute a first time cost for transferring the intermediate result set to the local; wherein a default value for the first time overhead is preconfigured.
6. A compute node apparatus of a multi-source data analytics engine, comprising:
the query task receiving unit is used for receiving a query task;
the query task analysis unit is used for determining a sub-query task comprising an intermediate result set of the query task and determining a storage node of the intermediate result set;
the cost calculation unit is used for calculating a first time cost for migrating the intermediate result set to the local and calculating a second time cost for executing the sub-query task by the storage node;
and the sub-query task allocation unit is used for selecting whether the computing node migrates the intermediate result set to the local and executes the sub-query task or the storage node executes the sub-query task according to the comparison result of the first time overhead and the second time overhead.
7. A readable storage medium having executable instructions thereon that, when executed, cause a computer to perform the method of any one of claims 1-5.
8. A computing device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors to perform the method recited in any of claims 1-5.
9. A distributed scheduling system of a multi-source data analysis engine is characterized by comprising a scheduling server and a plurality of computing nodes;
the scheduling server is used for determining the scheduling indexes of the plurality of computing nodes and distributing the query task to the computing node with the highest scheduling index;
the computing node with the highest scheduling index in the plurality of computing nodes is used for receiving the query task; determining a sub-query task of the query task comprising an intermediate result set, and determining a storage node of the intermediate result set; calculating a first time cost of migrating the intermediate result set to a local site, and calculating a second time cost of executing the sub-query task by the storage node; and according to the comparison result of the first time cost and the second time cost, selecting whether the computing node migrates the intermediate result set to the local and executes the sub-query task or the storage node executes the sub-query task.
10. The system according to claim 9, wherein the computing node with the highest scheduling index is configured to, when determining, according to a comparison result between the first time cost and the second time cost, whether to migrate the intermediate result set to the local by the computing node and execute the sub-query task, or to execute the sub-query task by the storage node, specifically:
calculating a ratio of the first time overhead and the second time overhead;
when the ratio is larger than a preset first threshold, calculating a resource margin of the storage node, comparing the resource margin with a preset second threshold, if the resource margin is larger than the preset second threshold, indicating the storage node to execute the sub-query task, and if the resource margin is not larger than the preset second threshold, permanently migrating the intermediate result set to the local and executing the sub-query task;
and when the ratio is not greater than a preset first threshold value, migrating the intermediate result set to the local and executing the sub-query task.
CN202010734766.8A 2020-07-28 2020-07-28 Distributed scheduling method, computing node and system of multi-source data analysis engine Active CN111625696B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010734766.8A CN111625696B (en) 2020-07-28 2020-07-28 Distributed scheduling method, computing node and system of multi-source data analysis engine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010734766.8A CN111625696B (en) 2020-07-28 2020-07-28 Distributed scheduling method, computing node and system of multi-source data analysis engine

Publications (2)

Publication Number Publication Date
CN111625696A true CN111625696A (en) 2020-09-04
CN111625696B CN111625696B (en) 2021-01-29

Family

ID=72260445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010734766.8A Active CN111625696B (en) 2020-07-28 2020-07-28 Distributed scheduling method, computing node and system of multi-source data analysis engine

Country Status (1)

Country Link
CN (1) CN111625696B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112511570A (en) * 2021-02-07 2021-03-16 浙江地芯引力科技有限公司 Internet of things data integrity checking system and method based on special chip
CN112948467A (en) * 2021-03-18 2021-06-11 北京中经惠众科技有限公司 Data processing method and device, computer equipment and storage medium
CN113836219A (en) * 2021-08-10 2021-12-24 浙江中控技术股份有限公司 Distributed data transfer scheduling system and method
WO2022110861A1 (en) * 2020-11-27 2022-06-02 苏州浪潮智能科技有限公司 Method and apparatus for data set caching in network training, device, and storage medium
CN114756629A (en) * 2022-06-16 2022-07-15 之江实验室 Multi-source heterogeneous data interaction analysis engine and method based on SQL
CN116679878A (en) * 2023-05-31 2023-09-01 珠海妙存科技有限公司 Flash memory data processing method and device, electronic equipment and readable storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408106A (en) * 2014-11-20 2015-03-11 浙江大学 Scheduling method for big data inquiry in distributed file system
CN104424287A (en) * 2013-08-30 2015-03-18 深圳市腾讯计算机系统有限公司 Query method and query device for data
CN104871154A (en) * 2012-11-30 2015-08-26 亚马逊技术有限公司 Optimizing data block size for deduplication
US20180365291A1 (en) * 2017-06-16 2018-12-20 Nec Laboratories America, Inc. Optimizations for a behavior analysis engine
CN110866046A (en) * 2019-10-28 2020-03-06 北京大学 Extensible distributed query method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104871154A (en) * 2012-11-30 2015-08-26 亚马逊技术有限公司 Optimizing data block size for deduplication
CN104424287A (en) * 2013-08-30 2015-03-18 深圳市腾讯计算机系统有限公司 Query method and query device for data
CN104408106A (en) * 2014-11-20 2015-03-11 浙江大学 Scheduling method for big data inquiry in distributed file system
US20180365291A1 (en) * 2017-06-16 2018-12-20 Nec Laboratories America, Inc. Optimizations for a behavior analysis engine
CN110866046A (en) * 2019-10-28 2020-03-06 北京大学 Extensible distributed query method and device

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022110861A1 (en) * 2020-11-27 2022-06-02 苏州浪潮智能科技有限公司 Method and apparatus for data set caching in network training, device, and storage medium
CN112511570A (en) * 2021-02-07 2021-03-16 浙江地芯引力科技有限公司 Internet of things data integrity checking system and method based on special chip
CN112948467A (en) * 2021-03-18 2021-06-11 北京中经惠众科技有限公司 Data processing method and device, computer equipment and storage medium
CN112948467B (en) * 2021-03-18 2023-10-10 北京中经惠众科技有限公司 Data processing method and device, computer equipment and storage medium
CN113836219A (en) * 2021-08-10 2021-12-24 浙江中控技术股份有限公司 Distributed data transfer scheduling system and method
CN114756629A (en) * 2022-06-16 2022-07-15 之江实验室 Multi-source heterogeneous data interaction analysis engine and method based on SQL
CN114756629B (en) * 2022-06-16 2022-10-21 之江实验室 Multi-source heterogeneous data interaction analysis engine and method based on SQL
CN116679878A (en) * 2023-05-31 2023-09-01 珠海妙存科技有限公司 Flash memory data processing method and device, electronic equipment and readable storage medium
CN116679878B (en) * 2023-05-31 2024-04-19 珠海妙存科技有限公司 Flash memory data processing method and device, electronic equipment and readable storage medium

Also Published As

Publication number Publication date
CN111625696B (en) 2021-01-29

Similar Documents

Publication Publication Date Title
CN111625696B (en) Distributed scheduling method, computing node and system of multi-source data analysis engine
US11146502B2 (en) Method and apparatus for allocating resource
US8949558B2 (en) Cost-aware replication of intermediate data in dataflows
US11475006B2 (en) Query and change propagation scheduling for heterogeneous database systems
US9176805B2 (en) Memory dump optimization in a system
TW201820165A (en) Server and cloud computing resource optimization method thereof for cloud big data computing architecture
Che et al. A deep reinforcement learning approach to the optimization of data center task scheduling
Dai et al. An improved task assignment scheme for Hadoop running in the clouds
US11294930B2 (en) Resource scaling for distributed database services
WO2024016596A1 (en) Container cluster scheduling method and apparatus, device, and storage medium
Wang et al. An efficient and non-intrusive GPU scheduling framework for deep learning training systems
CN113391765A (en) Data storage method, device, equipment and medium based on distributed storage system
CN112291335A (en) Optimized task scheduling method in mobile edge calculation
Xia et al. Efficient data placement and replication for QoS-aware approximate query evaluation of big data analytics
CN112099937A (en) Resource management method and device
US20220413906A1 (en) Method, device, and program product for managing multiple computing tasks based on batch
US10540217B2 (en) Message cache sizing
US10628279B2 (en) Memory management in multi-processor environments based on memory efficiency
US20210374048A1 (en) Method, electronic device, and computer program product for storage management
Shang et al. A strategy for scheduling reduce task based on intermediate data locality of the MapReduce
US10664309B2 (en) Use of concurrent time bucket generations for scalable scheduling of operations in a computer system
US20220343209A1 (en) Method, device, and computer program product for managing machine learning model
US11797282B2 (en) Optimizing services deployment in a cloud computing environment
Wang et al. On optimal budget-driven scheduling algorithms for MapReduce jobs in the hetereogeneous cloud
US20170060935A1 (en) Distributed systems and methods for database management and management systems thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant