CN116126937A - Job scheduling method, job scheduling device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116126937A
Authority
CN
China
Prior art keywords
job
job task
sql
task
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211710569.8A
Other languages
Chinese (zh)
Inventor
林佳龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dt Dream Technology Co Ltd
Original Assignee
Hangzhou Dt Dream Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dt Dream Technology Co Ltd
Priority to CN202211710569.8A
Publication of CN116126937A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a job scheduling method, a job scheduling device, electronic equipment and a storage medium. The method comprises: acquiring a first job task to be scheduled; analyzing the first job task to determine the operation type of the first job task according to the operation duration required by the first job task, wherein the operation type is either isolated operation or SQL operation; when the operation type is SQL operation, scheduling an SQL engine in the computing cluster to execute the first job task; and when the operation type is isolated operation, packaging the first job task into an independent APP and scheduling it to the computing cluster to run in an isolated mode independent of the SQL engine. The operation type is thus determined from the task duration, and tasks of different operation types are given different operation strategies, which solves the problem of a large blocked job disrupting the normal scheduling of other jobs, so that each job can be completed on time.

Description

Job scheduling method, job scheduling device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a job scheduling method, a job scheduling device, an electronic device, and a storage medium.
Background
With the continued development of the Internet, data volumes are growing explosively, and traditional data warehouses built on relational databases are increasingly unable to meet the computing requirements of big data. For building big-data warehouses, Spark is currently the mainstream technology for offline data processing.
In the related art, SQL jobs are submitted to an SQL computing engine through Spark SQL. Jobs at different scheduling levels then frequently compete for resources, so that low-level jobs cannot be completed within their normal scheduling period. In addition, some jobs have high computational complexity or a large data volume, so that a single blocked job affects the normal scheduling of other jobs.
Disclosure of Invention
The present invention aims to solve at least one of the technical problems in the related art to some extent.
Therefore, a first object of the present invention is to provide a job scheduling method, so as to solve the problem that the normal scheduling of other jobs is affected due to the blocking of large jobs by executing different operation strategies according to the operation types of different tasks, and ensure that each job can be completed effectively on time.
A second object of the present invention is to provide a job scheduling apparatus.
A third object of the present invention is to propose an electronic device.
A fourth object of the present invention is to propose a non-transitory computer readable storage medium.
A fifth object of the present invention is to propose a computer program product.
To achieve the above object, an embodiment of a first aspect of the present invention provides a job scheduling method, including: acquiring a first job task to be scheduled; analyzing the first job task to determine the operation type of the first job task according to the operation time length required by the first job task, wherein the operation type comprises isolation operation and SQL operation; under the condition that the operation type is SQL operation, the SQL engine in the computing cluster is scheduled to execute the first job task; and under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to the computing cluster to operate in an isolated mode independent of the SQL engine.
According to the job scheduling method, the first job task to be scheduled is obtained; analyzing the first job task to determine the operation type of the first job task according to the operation time required by the first job task, wherein the operation type comprises isolation operation and SQL operation; under the condition that the operation type is SQL operation, the SQL engine in the computing cluster is scheduled to execute a first job task; and under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to a computing cluster to operate in an isolated mode independent of the SQL engine. Therefore, the task operation type is determined according to the task duration, and different operation strategies are allocated to the tasks with different operation types, so that the problem that normal scheduling of other jobs is affected due to large job blocking is solved, and each job can be effectively completed on time.
To achieve the above object, an embodiment of a second aspect of the present invention provides a job scheduling apparatus, including: the first acquisition module is used for acquiring a first job task to be scheduled; the analysis module is used for analyzing the first job task to determine the operation type of the first job task according to the operation time length required by the first job task, wherein the operation type comprises isolation operation and SQL operation; the first scheduling module is used for scheduling SQL engines in the computing cluster to execute the first job task under the condition that the operation type is SQL operation; and the second scheduling module is used for packaging the first job task into an independent APP and scheduling the first job task to the computing cluster to run in an isolated mode independent of the SQL engine under the condition that the running type is isolated running.
The job scheduling device provided by the embodiment of the invention obtains the first job task to be scheduled; analyzing the first job task to determine the operation type of the first job task according to the operation time required by the first job task, wherein the operation type comprises isolation operation and SQL operation; under the condition that the operation type is SQL operation, the SQL engine in the computing cluster is scheduled to execute a first job task; and under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to a computing cluster to operate in an isolated mode independent of the SQL engine. Therefore, the task operation type is determined according to the task duration, and different operation strategies are allocated to the tasks with different operation types, so that the problem that normal scheduling of other jobs is affected due to large job blocking is solved, and each job can be effectively completed on time.
To achieve the above object, an embodiment of a third aspect of the present invention provides an electronic device, including: a processor; a memory for storing the processor-executable instructions; wherein the processor is configured to execute the job scheduling method proposed by the first aspect of the present invention.
To achieve the above object, an embodiment of a fourth aspect of the present invention provides a non-transitory computer-readable storage medium storing instructions which, when executed by a processor of an electronic device, enable the electronic device to perform the job scheduling method set forth in the first aspect of the present invention.
To achieve the above object, an embodiment of a fifth aspect of the present invention provides a computer program product whose instructions, when executed by a processor, perform the job scheduling method proposed by the first aspect of the present invention.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a flow chart of a job scheduling method according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating another job scheduling method according to an embodiment of the present invention;
FIG. 3 is a flow chart of a job scheduling method according to an embodiment of the present invention;
FIG. 4 is a flow chart of a job scheduling method according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a job scheduling device according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
The job scheduling method, apparatus, electronic device, and storage medium of the embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 is a flow chart of a job scheduling method according to an embodiment of the present invention.
With the continued development of the Internet, data volumes are growing explosively, and traditional data warehouses built on relational databases are increasingly unable to meet the computing requirements of big data. For building big-data warehouses, Spark is currently the mainstream technology for offline data processing.
In the related art, SQL jobs are submitted to an SQL computing engine through Spark SQL. Jobs at different scheduling levels then frequently compete for resources, so that low-level jobs cannot be completed within their normal scheduling period. In addition, some jobs have high computational complexity or a large data volume, so that a single blocked job affects the normal scheduling of other jobs.
To solve these problems, the embodiment of the invention provides a job scheduling method that determines the operation type of a task according to the task duration and then allocates different operation strategies to tasks of different operation types. This solves the problem of a large blocked job affecting the normal scheduling of other jobs, and ensures that each job task can be completed on time. As shown in fig. 1, the job scheduling method comprises the following steps:
Step 101, obtaining a first job task to be scheduled.
Before step 101 is described, the following terms are defined:
Spark: a distributed cluster data processing framework based on DAG (Directed Acyclic Graph) in-memory computation, which supports elastic scaling, big data computation, and high performance.
Spark SQL: the Spark computing module for structured data processing, which exposes SQL syntax parsing and computing capabilities.
In the embodiment of the present invention, the electronic device may execute step 101 as follows: scheduling starts after a plurality of SQL job tasks have been created, and the workflow service node determines the SQL job task to be scheduled according to the order in which the SQL job tasks were submitted to the workflow service node. The first job task is an SQL job task.
Step 102, analyzing the first job task to determine the operation type of the first job task according to the operation duration required by the first job task, wherein the operation type comprises isolation operation and SQL operation.
In the embodiment of the present invention, the operation type of the first job task is either isolated operation or SQL operation. Isolated operation means that the SQL job is encapsulated into an independent APP and submitted to the Spark cluster for isolated execution. SQL operation means that the job keeps its original format and is submitted, according to the SQL engine group returned by the analysis, to that SQL engine group for execution.
As a possible implementation manner, when the first job task to be scheduled is obtained, the SQL job analysis service node is called to analyze the first job task, so as to obtain the execution period required by the first job task together with its execution time and related metadata information. If the execution time is long or the data volume is large, the operation type of the first job task is determined to be isolated operation; otherwise it is SQL operation.
The execution period may be, for example, minutes, hours, days, or years.
Step 103, in the case that the operation type is SQL operation, the SQL engine in the computing cluster is scheduled to execute the first job task.
As a possible implementation manner, if the operation type of the first job task is SQL operation, the first job task retains its original format, and the workflow service submits the first job task to the SQL engine group in the computing cluster that was returned for the first job task, where it is executed.
Step 104, under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to a computing cluster to operate in an isolated mode independent of the SQL engine.
As a possible implementation manner, if the operation type of the first job task is isolated operation, the first job task is encapsulated into an independent APP, and the workflow service submits the first job task to a corresponding Spark APP module in the computing cluster for isolated execution.
In the embodiment of the present invention, after the execution of the current first job task is completed, the next job task to be scheduled is acquired and the job scheduling method is executed again.
In an embodiment of the present invention, the computing cluster includes an SQL engine and a Spark APP module. The SQL engine executes job tasks whose operation type is SQL operation, and the Spark APP module executes job tasks whose operation type is isolated operation. When the first job task is executed, it is dispatched to a different module in the computing cluster depending on its operation type, so that the two operation types are separated and a large blocked job no longer affects the normal scheduling of other jobs.
The computing cluster may include a plurality of SQL engines and a plurality of Spark APP modules.
In summary, a first job task to be scheduled is obtained; analyzing the first job task to determine the operation type of the first job task according to the operation time required by the first job task, wherein the operation type comprises isolation operation and SQL operation; under the condition that the operation type is SQL operation, the SQL engine in the computing cluster is scheduled to execute a first job task; and under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to a computing cluster to operate in an isolated mode independent of the SQL engine. Therefore, the task operation type is determined according to the task duration, and different operation strategies are allocated to the tasks with different operation types, so that the problem that normal scheduling of other jobs is affected due to large job blocking is solved, and each job task can be effectively completed on time.
In order to clearly illustrate the above embodiment, another job scheduling method is provided in this embodiment, and fig. 2 is a schematic flow chart of another job scheduling method provided in the embodiment of the present invention.
As shown in fig. 2, the job scheduling method may include the steps of:
Step 201, a first job task to be scheduled is obtained.
Step 202, determining the operation duration required by the first job task according to the SQL operator related to the first job task and/or the data amount information in the related data table.
As a possible implementation manner, the job analysis service node determines the job cycle of the first job task, then analyzes, according to that job cycle, the SQL operators related to the first job task and/or the data volume information in the related data tables, and from these estimates the operation duration required to execute the first job task once.
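The source does not give a formula for this estimate. As a minimal sketch only, the following assumes a simple cost model in which the estimated duration grows with the total row count of the related tables and with a per-operator weight. The weights, the throughput constant, and the function name `estimate_run_seconds` are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List

# Hypothetical per-operator weights; the source does not specify any values.
OPERATOR_WEIGHT: Dict[str, float] = {
    "scan": 1.0,
    "filter": 1.0,
    "group_by": 2.0,
    "join": 4.0,
}

# Assumed cost of processing one million rows with weight 1.0.
SECONDS_PER_MILLION_ROWS = 2.0


def estimate_run_seconds(operators: List[str], table_rows: Dict[str, int]) -> float:
    """Estimate the duration of one execution from operators and table sizes."""
    total_rows = sum(table_rows.values())
    # Unknown operators fall back to a neutral weight of 1.0.
    weight = sum(OPERATOR_WEIGHT.get(op, 1.0) for op in operators)
    return total_rows / 1_000_000 * SECONDS_PER_MILLION_ROWS * weight
```

In practice the data volume information would come from table metadata, and the weights would need to be calibrated against observed run times.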
Step 203, determining the operation type of the first job task according to the operation duration.
As one possible implementation manner, in the case that the operation duration is greater than the threshold value, it is determined that the operation type is the isolated operation.
As a possible implementation manner, in the case that the operation duration is not greater than the threshold value, determining an operation period required by the first job task according to the execution period and the operation duration of the first job task; determining an available period according to the job tasks scheduled by the SQL engine; determining that the operation type of the first job task is SQL operation under the condition that the operation time period required by the first job task is within the available time period; and determining that the operation type of the first job task is isolated operation in the case that the operation period required by the first job task is not within the available period.
Further, determining the available period according to the job tasks scheduled by the SQL engine comprises: determining, from at least two groups and according to the execution period of the first job task, a first target group corresponding to the execution period, wherein the first target group is used for scheduling one of its SQL engines to execute the first job task in the case that the operation type is SQL operation; and determining the available period based on the job tasks already scheduled on the SQL engines in the first target group.
The computing cluster comprises a plurality of SQL engines, which are divided into at least two groups; the at least two groups are used to execute job tasks with different job cycles.
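The decision described in step 203 can be sketched as follows. This is an illustrative reading, not the patented implementation: it assumes times are expressed in minutes, that the operation period is the window from a planned start time to start plus duration, and that the available period is whatever is left free by the intervals already occupied on the SQL engine. The names `classify` and `is_free` and the threshold value are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in minutes


def is_free(window: Interval, busy: List[Interval]) -> bool:
    """True if the window overlaps none of the already scheduled intervals."""
    start, end = window
    return all(end <= b_start or start >= b_end for b_start, b_end in busy)


def classify(start_minute: int, run_minutes: int,
             threshold_minutes: int, busy: List[Interval]) -> str:
    """Return "isolated" or "sql" for a job task."""
    # Long-running tasks are always isolated.
    if run_minutes > threshold_minutes:
        return "isolated"
    # Otherwise the task stays on the SQL engine only if its window is available.
    window = (start_minute, start_minute + run_minutes)
    return "sql" if is_free(window, busy) else "isolated"
```

For example, `classify(60, 30, 90, [(80, 100)])` yields `"isolated"` because the window from minute 60 to minute 90 collides with a task already scheduled from minute 80.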
Step 204, determining, from the at least two SQL engines of the first target group, a target SQL engine for which the operation period of the first job task falls within the available period.
In an embodiment of the invention, the first target group includes at least two SQL engines.
As one possible implementation manner, the corresponding first target group is determined according to the operation duration and the execution period of the first job task; then an SQL engine in the first target group for which the operation period falls within its available period is determined and taken as the target SQL engine.
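A minimal sketch of the engine selection in step 204 follows, assuming each SQL engine in the group is described by the list of intervals it is already booked for and that the first engine with a free window is taken. The source does not state a tie-breaking rule, so first-fit is an assumption, as are the engine names and the function name `pick_target_engine`.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in minutes


def pick_target_engine(window: Interval,
                       group: Dict[str, List[Interval]]) -> Optional[str]:
    """Return the first engine in the group whose schedule leaves the window free."""
    start, end = window
    for engine, busy in group.items():
        if all(end <= b_start or start >= b_end for b_start, b_end in busy):
            return engine
    # No engine in this group can take the task in the requested window.
    return None
```

With `group = {"engine-a": [(0, 120)], "engine-b": [(0, 30)]}`, a task needing the window `(60, 90)` is assigned to `engine-b`.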
Further, in the case that the first target group fails to execute the first job task, the first job task is scheduled to a second target group for execution, wherein the job cycle of the tasks executed by the second target group is greater than the job cycle of the tasks executed by the first target group.
For example, if the complexity of the first job task is so high that its actual running time exceeds the estimated operation duration, the first job task may fail to complete after being scheduled to the first target group. It can then be scheduled to the second target group, so that it is executed successfully.
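The fallback to a group with a longer job cycle might look like the sketch below. It assumes each group is labeled with its job cycle in minutes and that execution reports success or failure through a caller-supplied `execute` callback. The `EngineGroup` type, the callback, and the choice to keep escalating through every larger group are illustrative assumptions; the source only describes moving from the first target group to a second one.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class EngineGroup:
    name: str
    cycle_minutes: int  # job cycle served by this group


def run_with_fallback(task_cycle_minutes: int,
                      groups: List[EngineGroup],
                      execute: Callable[[EngineGroup], bool]) -> Optional[str]:
    """Try the matching group first, then groups with larger job cycles."""
    candidates = sorted(
        (g for g in groups if g.cycle_minutes >= task_cycle_minutes),
        key=lambda g: g.cycle_minutes,
    )
    for group in candidates:
        if execute(group):
            return group.name  # the task completed in this group
    return None  # every eligible group failed
```

Groups with a shorter job cycle than the task's own are never tried, which matches the requirement that the second target group has a greater job cycle than the first.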
Step 205, scheduling the target SQL engine to execute the first job task.
In the embodiment of the invention, the workflow service node submits a first job task to a first target group of the computing cluster, and a target SQL engine in the first target group is scheduled to execute the first job task.
In step 206, in the case that the operation type is isolated operation, the first job task is encapsulated into an independent APP and scheduled to the computing cluster to operate in an isolated manner independent of the SQL engine.
It should be noted that, for the descriptions of steps 201 and 206, reference may be made to other embodiments of the present invention, and they are not described in detail here.
According to the job scheduling method of the embodiment of the invention, the first job task to be scheduled is obtained; the operation duration required by the first job task is determined according to the SQL operator related to the first job task and/or the data volume information in the related data table; the operation type of the first job task is determined according to the operation duration; a target SQL engine for which the operation period of the first job task is within the available period is determined from at least two SQL engines of the first target group; the target SQL engine is scheduled to execute the first job task; and under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to the computing cluster to operate in an isolated mode independent of the SQL engine. Therefore, the task operation type is determined according to the task duration, and different operation strategies are allocated to tasks of different operation types, which solves the problem that normal scheduling of other jobs is affected by a large blocked job. Meanwhile, a corresponding target group is scheduled for each scheduling level, which prevents tasks with different job cycles from affecting one another and ensures that each job task can be completed on time.
Fig. 3 is a flow chart of a job scheduling method according to an embodiment of the present invention. As shown in fig. 3, the Spark computing cluster includes: SQL engine group and Spark APP module.
In the embodiment of the invention, a workflow service node acquires a first job task to be scheduled, then the workflow service node schedules an SQL job analysis service node to analyze the first job task to obtain the operation type of a job task, and if the operation type is SQL operation, a target SQL engine in an SQL engine group in a computing cluster is scheduled to execute the first job task; and if the operation type is isolated operation, packaging the first job task into an independent APP and scheduling the first job task into a computing cluster to operate in an isolated mode independent of the SQL engine.
Fig. 4 is a flow chart of a job scheduling method according to an embodiment of the present invention. The steps illustrated in fig. 4 are a detailed description of the flow illustrated in fig. 3. As shown in fig. 4, the job scheduling method includes the steps of:
Step 401, a user submits an SQL job and initiates job scheduling.
In step 402, the workflow service node starts scheduling execution.
Step 403, the workflow service node calls the job analysis service node to analyze the job, so as to obtain the scheduling period of the first job task, and to analyze the SQL operators and the metadata information of the data tables related to the first job task in order to evaluate the operation type of the first job task.
At step 404, the job analysis service node returns the run type to the workflow service.
In step 405, if the operation type is SQL operation, the workflow service node schedules the first job task to a target SQL engine in the SQL engine group in the computing cluster to execute the first job task.
In step 406, if the operation type is isolated operation, the workflow service node encapsulates the first job task into an independent APP and schedules the first job task to the computing cluster to operate in an isolated manner independent of the SQL engine.
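The interaction between the workflow service node and the job analysis service node in steps 401 to 406 can be sketched as below. This is a toy model under stated assumptions: the analysis node is represented by a plain callable that returns "sql" or "isolated", and dispatching is reduced to appending a label to a log. The class names, the log format, and the callable interface are hypothetical, and no real Spark API is invoked.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SqlJob:
    name: str
    sql: str


class WorkflowService:
    """Toy workflow service node that asks an analysis function for the run type."""

    def __init__(self, analyze: Callable[[SqlJob], str]) -> None:
        self.analyze = analyze      # stands in for the job analysis service node
        self.log: List[str] = []    # records where each job was sent

    def schedule(self, job: SqlJob) -> str:
        # Steps 403 and 404: obtain the run type from the analysis node.
        run_type = self.analyze(job)
        if run_type == "sql":
            # Step 405: send the job to an SQL engine group.
            self.log.append(f"sql-engine-group:{job.name}")
        else:
            # Step 406: wrap the job as a standalone application.
            self.log.append(f"standalone-app:{job.name}")
        return run_type
```

A caller would construct `WorkflowService` with whatever analysis rule is in use and call `schedule` once per submitted job, in submission order.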
In summary, a first job task to be scheduled is obtained; determining the required operation time length of the first job task according to the SQL operator related to the first job task and/or the data volume information in the related data table; determining the operation type of the first job task according to the operation duration; determining a target SQL engine of which the operation period of the first job task is within the available period from at least two SQL engines of the first target group; scheduling the target SQL engine to execute a first job task; and under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to a computing cluster to operate in an isolated mode independent of the SQL engine. Therefore, the task operation type is determined according to the task duration, and different operation strategies are allocated to the tasks of different operation types, so that the problem that normal scheduling of other jobs is affected due to large job blocking is solved, meanwhile, the corresponding target groups are scheduled for different scheduling levels, the mutual influence of the tasks among different job periods is solved, and the fact that each job task can be effectively completed on time is guaranteed.
In order to achieve the above embodiment, the present invention further provides a job scheduling device.
Fig. 5 is a schematic structural diagram of a job scheduling device according to an embodiment of the present invention.
As shown in fig. 5, the job scheduling device includes: a first acquisition module 510, an analysis module 520, a first scheduling module 530, and a second scheduling module 540.
The first acquisition module 510 is configured to obtain a first job task to be scheduled;
the analysis module 520 is configured to analyze the first job task to determine an operation type of the first job task according to an operation duration required by the first job task, where the operation type includes isolated operation and SQL operation;
the first scheduling module 530 is configured to schedule an SQL engine in a computing cluster to execute the first job task if the operation type is SQL operation;
and the second scheduling module 540 is configured to package the first job task into an independent APP and schedule it to the computing cluster to run in isolation, independently of the SQL engine, if the operation type is isolated operation.
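The division of work among the four modules can be illustrated with a minimal Python sketch. The class name `JobSchedulingDevice`, its field names, and the string run types `"sql"` and `"isolated"` are assumptions introduced for illustration only; the embodiment does not prescribe any particular interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class JobSchedulingDevice:
    """Hypothetical wiring of modules 510-540; each module is a plain callable."""
    acquire: Callable[[], str]               # first acquisition module 510
    analyze: Callable[[str], str]            # analysis module 520, returns "sql" or "isolated"
    run_on_sql_engine: Callable[[str], str]  # first scheduling module 530
    run_isolated: Callable[[str], str]       # second scheduling module 540

    def schedule_next(self) -> str:
        task = self.acquire()
        run_type = self.analyze(task)
        if run_type == "sql":
            return self.run_on_sql_engine(task)
        if run_type == "isolated":
            return self.run_isolated(task)
        raise ValueError(f"unknown run type: {run_type}")


# Usage with stand-in callables (all hypothetical).
device = JobSchedulingDevice(
    acquire=lambda: "job-1",
    analyze=lambda task: "isolated",
    run_on_sql_engine=lambda task: "sql:" + task,
    run_isolated=lambda task: "app:" + task,
)
result = device.schedule_next()
```

Passing the modules in as callables keeps the dispatch logic separate from how each module is actually realised.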
In a possible implementation manner of the embodiment of the present invention, the analysis module 520 is specifically configured to determine, according to the SQL operator related to the first job task and/or data volume information in the related data table, a required operation duration of the first job task; and determining the operation type of the first job task according to the operation duration.
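One possible way to derive a duration from the operators and the data volume is a simple weighted cost model, sketched below in Python. The operator names, the per-operator weights in `OPERATOR_COST`, and the function `estimate_duration_seconds` are purely illustrative assumptions; the embodiment does not specify how the estimate is computed.

```python
# Hypothetical cost in seconds per million rows for each SQL operator.
OPERATOR_COST = {"scan": 1.0, "filter": 0.5, "join": 8.0, "group_by": 4.0}


def estimate_duration_seconds(operators, table_rows):
    """Estimate run duration from the operators used and the row counts of the tables involved.

    operators:  list of operator names appearing in the job's SQL
    table_rows: mapping of table name to row count
    """
    total_rows_millions = sum(table_rows.values()) / 1_000_000
    # Unknown operators fall back to a default weight of 1.0 (an assumption).
    weight = sum(OPERATOR_COST.get(op, 1.0) for op in operators)
    return weight * total_rows_millions


estimate = estimate_duration_seconds(["scan", "join"], {"orders": 2_000_000})
```

In practice the weights would have to be calibrated, for example against the recorded run times of earlier jobs.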
In a possible implementation manner of the embodiment of the present invention, the analysis module 520 is specifically configured to determine that the operation type is an isolated operation if the operation time length is greater than a threshold value.
In a possible implementation manner of the embodiment of the present invention, the analysis module 520 is specifically configured to determine, when the operation duration is not greater than a threshold, an operation period required for the first job task according to an execution period of the first job task and the operation duration; determining an available period according to the job tasks scheduled by the SQL engine; determining that the operation type of the first job task is SQL operation under the condition that the operation time period required by the first job task is within the available time period; and determining that the operation type of the first job task is isolated operation when the operation period required by the first job task is not within the available period.
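The two-stage decision described above (threshold first, then the availability check) can be sketched as follows. Times are expressed as minutes of the day, the start time is assumed to have been derived from the execution period of the task, and the names `free_windows`, `classify`, and the default threshold of 120 minutes are hypothetical.

```python
def free_windows(busy, day_start=0, day_end=1440):
    """Return the gaps between busy (start, end) intervals, in minutes of the day."""
    windows, cursor = [], day_start
    for start, end in sorted(busy):
        if start > cursor:
            windows.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < day_end:
        windows.append((cursor, day_end))
    return windows


def classify(start, duration, busy, threshold=120):
    """Return "isolated" or "sql" for a task starting at `start` and lasting `duration` minutes."""
    if duration > threshold:
        return "isolated"
    end = start + duration
    # The required operation period must lie entirely inside one available period.
    fits = any(ws <= start and end <= we for ws, we in free_windows(busy))
    return "sql" if fits else "isolated"


busy = [(60, 120), (300, 360)]  # periods already taken by scheduled job tasks
```

With this `busy` list, a 60-minute task starting at minute 130 fits the gap between 120 and 300 and is classified as SQL operation, while a 30-minute task starting at minute 100 collides with the first busy period and is classified as isolated operation.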
In one possible implementation of an embodiment of the present invention, the computing cluster includes a plurality of the SQL engines, and the plurality of SQL engines are divided into at least two groups; the at least two groups are used to execute tasks with different job cycles; the analysis module 520 is specifically configured to determine, according to an execution period of the first job task, a first target group corresponding to the execution period from the at least two groups, where the first target group is configured to schedule one of its internal SQL engines to execute the first job task when the operation type is SQL operation; and to determine the available period according to the job tasks already scheduled by the SQL engines in the first target group.
In a possible implementation manner of the embodiment of the present invention, the analysis module 520 is further configured to schedule the first job task to a second target group for execution in a case that the first target group fails to execute the task; the job cycle of the tasks executed by the second target group is larger than that of the tasks executed by the first target group.
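The selection of a group by job cycle and the fallback to a group with a larger cycle can be sketched as below. The cycle lengths in hours, the group names in `GROUPS`, and the choice of the next larger cycle as the fallback are assumptions; the text only requires that the second target group serve a larger job cycle than the first.

```python
# Hypothetical engine groups keyed by job cycle in hours (hourly, daily, weekly).
GROUPS = {1: "hourly-group", 24: "daily-group", 168: "weekly-group"}


def first_target_group(cycle_hours):
    """Group whose job cycle matches the execution period of the task."""
    return GROUPS[cycle_hours]


def second_target_group(cycle_hours):
    """Group with the next larger job cycle, or None if there is none (an assumption)."""
    larger = sorted(c for c in GROUPS if c > cycle_hours)
    return GROUPS[larger[0]] if larger else None


def run_with_fallback(cycle_hours, execute):
    """Try the first target group, then the second; `execute(group)` returns True on success."""
    group = first_target_group(cycle_hours)
    if execute(group):
        return group
    fallback = second_target_group(cycle_hours)
    if fallback is not None and execute(fallback):
        return fallback
    return None
```

For example, an hourly task whose execution fails on `hourly-group` would be retried on `daily-group` under these assumptions.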
In one possible implementation of an embodiment of the present invention, the first target group includes at least two SQL engines; the first scheduling module 530 is specifically configured to determine, from the at least two SQL engines of the first target group, a target SQL engine for which the operation period of the first job task is within the available period; and to schedule the target SQL engine to execute the first job task.
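Choosing the target SQL engine within a group amounts to finding an engine whose already scheduled periods do not overlap the period the task needs. The sketch below assumes that each engine is described by a list of busy `(start, end)` intervals in minutes and that the first free engine is taken; the function name `pick_engine` and this first-fit policy are illustrative assumptions.

```python
def pick_engine(engines, start, duration):
    """Return the name of the first engine that is free for [start, start + duration), else None.

    engines: mapping of engine name to a list of busy (start, end) intervals
    """
    end = start + duration
    for name, busy in engines.items():
        # The engine is usable if the task period overlaps none of its busy intervals.
        if all(end <= b_start or start >= b_end for b_start, b_end in busy):
            return name
    return None


engines = {"engine-1": [(0, 100)], "engine-2": [(200, 300)]}
chosen = pick_engine(engines, start=50, duration=60)
```

Here `engine-1` is busy until minute 100, so the task running from minute 50 to 110 goes to `engine-2`.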
It should be noted that the foregoing explanation of the embodiment of the job scheduling method is also applicable to the job scheduling apparatus of this embodiment, and will not be repeated here.
In summary, a first job task to be scheduled is obtained; the first job task is analyzed to determine its operation type according to the operation duration it requires, wherein the operation type includes isolated operation and SQL operation; in the case that the operation type is SQL operation, an SQL engine in the computing cluster is scheduled to execute the first job task; and in the case that the operation type is isolated operation, the first job task is packaged into an independent APP and scheduled to the computing cluster to run in isolation, independently of the SQL engine. The operation type is thus determined from the task duration, and tasks of different operation types are assigned different operation strategies, which solves the problem of a large job blocking the engine and disrupting the normal scheduling of other jobs, and helps ensure that each job task is completed on time.
In order to implement the above embodiment, the present invention further provides an electronic device, and fig. 6 is a schematic structural diagram of an electronic device provided in the embodiment of the present invention. The electronic device includes:
a memory 601, a processor 602, and a computer program stored on the memory 601 and executable on the processor 602.
The processor 602 implements the job scheduling method provided in the above embodiment when executing the program.
Further, the electronic device further includes:
a communication interface 603 for communication between the memory 601 and the processor 602.
A memory 601 for storing a computer program executable on the processor 602.
The memory 601 may comprise a high-speed RAM memory and may further comprise a non-volatile memory, such as at least one disk memory.
And a processor 602, configured to implement the job scheduling method according to the foregoing embodiment when executing the program.
If the memory 601, the processor 602, and the communication interface 603 are implemented independently, the communication interface 603, the memory 601, and the processor 602 may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus, among others. The buses may be classified as address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 6, but this does not mean that there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the memory 601, the processor 602, and the communication interface 603 are integrated on a chip, the memory 601, the processor 602, and the communication interface 603 may perform communication with each other through internal interfaces.
The processor 602 may be a central processing unit (Central Processing Unit, abbreviated as CPU) or an application specific integrated circuit (Application Specific Integrated Circuit, abbreviated as ASIC) or one or more integrated circuits configured to implement embodiments of the present invention.
In order to achieve the above-described embodiments, the embodiments of the present invention also propose a non-transitory computer-readable storage medium on which a computer program is stored, which when executed by a processor implements a job scheduling method as provided in the above-described embodiments.
In order to achieve the above embodiments, the embodiments of the present invention also provide a computer program product; when instructions in the computer program product are executed by a processor, the job scheduling method provided in the above embodiments is implemented.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, the different embodiments or examples described in this specification and the features of the different embodiments or examples may be combined by those skilled in the art without contradiction.
Furthermore, the terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying a relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and additional implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order from that shown or discussed, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the embodiments of the present invention.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, the steps or methods may be implemented using any one or a combination of the following techniques, which are well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, Programmable Gate Arrays (PGAs), Field Programmable Gate Arrays (FPGAs), and the like.
Those of ordinary skill in the art will appreciate that all or a portion of the steps carried out in the method of the above-described embodiments may be implemented by a program to instruct related hardware, where the program may be stored in a computer readable storage medium, and where the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing module, or each unit may exist alone physically, or two or more units may be integrated in one module. The integrated modules may be implemented in hardware or in software functional modules. The integrated modules may also be stored in a computer readable storage medium if implemented in the form of software functional modules and sold or used as a stand-alone product.
The above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, or the like. While embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the invention, and that changes, modifications, substitutions and variations may be made to the above embodiments by one of ordinary skill in the art within the scope of the invention.

Claims (16)

1. A job scheduling method, comprising the steps of:
acquiring a first job task to be scheduled;
analyzing the first job task to determine the operation type of the first job task according to the operation time length required by the first job task, wherein the operation type comprises isolation operation and SQL operation;
under the condition that the operation type is SQL operation, the SQL engine in the computing cluster is scheduled to execute the first job task;
and under the condition that the operation type is isolated operation, the first job task is packaged into an independent APP and is scheduled to the computing cluster to operate in an isolated mode independent of the SQL engine.
2. The method of claim 1, wherein analyzing the first job task to determine the type of operation of the first job task based on the length of operation required for the first job task comprises:
determining the required operation time length of the first job task according to the SQL operator related to the first job task and/or the data volume information in the related data table;
and determining the operation type of the first job task according to the operation duration.
3. The method of claim 2, wherein determining the type of operation of the first job task based on the length of operation comprises:
and under the condition that the operation time length is larger than a threshold value, determining that the operation type is isolated operation.
4. The method of claim 2, wherein determining the type of operation of the first job task based on the length of operation comprises:
determining an operation period required by the first job task according to the execution period of the first job task and the operation duration under the condition that the operation duration is not greater than a threshold value;
determining an available period according to the job tasks scheduled by the SQL engine;
determining that the operation type of the first job task is SQL operation under the condition that the operation time period required by the first job task is within the available time period;
and determining that the operation type of the first job task is isolated operation when the operation period required by the first job task is not within the available period.
5. The method of claim 4, wherein the computing cluster comprises a plurality of the SQL engines divided into at least two groups; the at least two groups are used to execute tasks with different job cycles;
the determining the available period according to the job tasks scheduled by the SQL engine comprises the following steps:
determining a first target group corresponding to the execution period from the at least two groups according to the execution period of the first job task, wherein the first target group is used for scheduling an internal SQL engine to execute the first job task under the condition that the operation type is SQL operation;
and determining the available period according to the job tasks scheduled by the SQL engine in the first target group.
6. The method as recited in claim 5, further comprising:
scheduling the first job task to a second target group for execution under the condition that the first target group fails to execute the task;
the job cycle of the task executed by the second target group is larger than that of the task executed by the first target group.
7. The method of claim 5, wherein the first target group comprises at least two SQL engines; and the scheduling the SQL engine in the computing cluster to execute the first job task under the condition that the operation type is SQL operation comprises:
determining a target SQL engine of which the operation period of the first job task is within the available period from at least two SQL engines of the first target group;
and scheduling the target SQL engine to execute the first job task.
8. A job scheduling device, comprising:
the first acquisition module is used for acquiring a first job task to be scheduled;
the analysis module is used for analyzing the first job task to determine the operation type of the first job task according to the operation time length required by the first job task, wherein the operation type comprises isolation operation and SQL operation;
the first scheduling module is used for scheduling SQL engines in the computing cluster to execute the first job task under the condition that the operation type is SQL operation;
and the second scheduling module is used for packaging the first job task into an independent APP and scheduling the first job task to the computing cluster to run in an isolated mode independent of the SQL engine under the condition that the running type is isolated running.
9. The device according to claim 8, wherein the analysis module is specifically configured for,
determining the required operation time length of the first job task according to the SQL operator related to the first job task and/or the data volume information in the related data table;
and determining the operation type of the first job task according to the operation duration.
10. The device according to claim 9, wherein the analysis module is in particular adapted to,
and under the condition that the operation time length is larger than a threshold value, determining that the operation type is isolated operation.
11. The device according to claim 9, wherein the analysis module is in particular adapted to,
determining an operation period required by the first job task according to the execution period of the first job task and the operation duration under the condition that the operation duration is not greater than a threshold value;
determining an available period according to the job tasks scheduled by the SQL engine;
determining that the operation type of the first job task is SQL operation under the condition that the operation time period required by the first job task is within the available time period;
and determining that the operation type of the first job task is isolated operation when the operation period required by the first job task is not within the available period.
12. The device of claim 11, wherein the computing cluster comprises a plurality of the SQL engines divided into at least two groups; the at least two groups are used to execute tasks with different job cycles;
the analysis module is specifically configured for,
determining a first target group corresponding to the execution period from the at least two groups according to the execution period of the first job task, wherein the first target group is used for scheduling an internal SQL engine to execute the first job task under the condition that the operation type is SQL operation;
and determining the available period according to the job tasks scheduled by the SQL engine in the first target group.
13. The device of claim 12, wherein the analysis module is further configured for,
scheduling the first job task to a second target group for execution under the condition that the first target group fails to execute the task;
the job cycle of the task executed by the second target group is larger than that of the task executed by the first target group.
14. The device of claim 12, wherein the first target group comprises at least two SQL engines; the first scheduling module is specifically configured for,
determining a target SQL engine of which the operation period of the first job task is within the available period from at least two SQL engines of the first target group;
and scheduling the target SQL engine to execute the first job task.
15. An electronic device, comprising:
memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the job scheduling method according to any one of claims 1-7 when executing the program.
16. A non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the job scheduling method according to any one of claims 1 to 7.
CN202211710569.8A 2022-12-29 2022-12-29 Job scheduling method, job scheduling device, electronic equipment and storage medium Pending CN116126937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211710569.8A CN116126937A (en) 2022-12-29 2022-12-29 Job scheduling method, job scheduling device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211710569.8A CN116126937A (en) 2022-12-29 2022-12-29 Job scheduling method, job scheduling device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116126937A true CN116126937A (en) 2023-05-16

Family

ID=86294973

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211710569.8A Pending CN116126937A (en) 2022-12-29 2022-12-29 Job scheduling method, job scheduling device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116126937A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116302452A (en) * 2023-05-18 2023-06-23 苏州浪潮智能科技有限公司 Job scheduling method, system, device, communication equipment and storage medium
CN116302452B (en) * 2023-05-18 2023-08-22 苏州浪潮智能科技有限公司 Job scheduling method, system, device, communication equipment and storage medium

Similar Documents

Publication Publication Date Title
EP2894564A1 (en) Job scheduling based on historical job data
CN111625331B (en) Task scheduling method, device, platform, server and storage medium
CN109992366B (en) Task scheduling method and task scheduling device
CN108241539B (en) Interactive big data query method and device based on distributed system, storage medium and terminal equipment
CN116126937A (en) Job scheduling method, job scheduling device, electronic equipment and storage medium
CN113010289A (en) Task scheduling method, device and system
CN111680085A (en) Data processing task analysis method and device, electronic equipment and readable storage medium
CN108415765B (en) Task scheduling method and device and intelligent terminal
US20080271041A1 (en) Program processing method and information processing apparatus
CN112148481B (en) Method, system, equipment and medium for executing simulation test task
CN116932224A (en) Big data function resource consumption evaluation method and device
CN113127179A (en) Resource scheduling method and device, electronic equipment and computer readable medium
CN111143063A (en) Task resource reservation method and device
CN113051005B (en) Loading method and device
CN115373829A (en) Method, device and system for scheduling CPU (Central processing Unit) resources
CN111880803B (en) Software construction method and device applied to multiple platforms
CN111124834B (en) Access method and device for monitoring data in cloud computing environment and computer equipment
CN111309475B (en) Detection task execution method and equipment
US20210263826A1 (en) Data processing system performance monitoring
CN113792079A (en) Data query method and device, computer equipment and storage medium
CN108920722B (en) Parameter configuration method and device and computer storage medium
CN110888741A (en) Resource scheduling method and device for application container, server and storage medium
JP2021039666A (en) Core allocation device and core allocation method
CN110365342A (en) Waveform decoder method and device
CN115081233B (en) Flow simulation method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination