CN111552547A - Job processing method and device and computer equipment - Google Patents

Job processing method and device and computer equipment

Info

Publication number
CN111552547A
CN111552547A
Authority
CN
China
Prior art keywords
job, processing, processed, steps, computing resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010319530.8A
Other languages
Chinese (zh)
Inventor
张健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd filed Critical Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202010319530.8A priority Critical patent/CN111552547A/en
Publication of CN111552547A publication Critical patent/CN111552547A/en
Pending legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F 9/5038: Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration

Abstract

The embodiments of the disclosure provide a job processing method and apparatus, and a computer device, relate to the technical field of cloud computing data processing, and solve the technical problem of low job processing efficiency caused by poor utilization of computing resources during job processing. The method comprises the following steps: acquiring a job to be processed; splitting the job to be processed into a plurality of job steps based on the execution phases of the job to be processed; and processing the job steps with a plurality of different computing resources, selected according to the computing resources required to process each job step, to obtain a final processing result of the job to be processed.

Description

Job processing method and device and computer equipment
Technical Field
The present application relates to the technical field of cloud computing data processing, and in particular, to a job processing method and apparatus, and a computer device.
Background
The purpose of data processing includes deriving the required data from a large amount of data. Job processing is a form of data processing; in some operating systems, a job refers to a unit of work to be processed that a computer operator hands to the operating system.
Currently, in the process of job processing, computing resources such as a Central Processing Unit (CPU), a memory, a disk, a network, etc. are required to process a job to obtain a job processing result.
However, in conventional job processing methods, computing resources are under-utilized during job processing, so the processing efficiency is low.
Disclosure of Invention
The invention aims to provide a job processing method, a job processing device and computer equipment, which are used for solving the technical problem of low job processing efficiency caused by low utilization rate of computing resources in a job processing process.
In a first aspect, an embodiment of the present application provides a job processing method, where the method includes:
acquiring a job to be processed;
splitting the job to be processed into a plurality of job steps based on the execution phase of the job to be processed;
and processing the job steps with a plurality of different computing resources, selected according to the computing resources required to process each job step, to obtain a final processing result of the job to be processed.
In a possible implementation, the step of processing the job steps with a plurality of different computing resources, according to the computing resources required to process each job step, to obtain a final processing result of the job to be processed includes:
cyclically performing the following steps until a final processing result of the job to be processed is obtained:
processing the current job step with the current computing resource to obtain a data set for exchanging data between the current job step and the next job step;
determining the next computing resource according to the computing resources required to process the next job step;
and scheduling the exchange data set of the current job step and the next job step to the next computing resource, so that the next computing resource processes the next job step based on the exchange data set.
In one possible implementation, the step of scheduling the next said job step to the next said computing resource comprises:
for a plurality of next job steps, determining the next target job step to be scheduled according to the priority order among them;
and scheduling the next target job step to the next computing resource.
In one possible implementation, the storage mode of each data set exchanged between job steps comprises any one of the following:
a cache mode, a message queue mode, a distributed storage mode, and a database mode.
In one possible implementation, the step of determining a next said computing resource based on said computing resources required to process a next said job step comprises:
adjusting the computing resources required to process the next job step based on the historical processing results of the job step, and determining the next computing resource according to the adjustment result.
In one possible implementation, the historical processing result includes historical processing time consumption and historical processing resource utilization; the step of adjusting the computing resources required for processing the next job step based on the historical processing result of the job step includes:
when the historical processing time or the historical resource utilization of the job step exceeds a preset range, changing the computing resources required to process the next job step, the scheduling mode of those computing resources, and the data exchange mode between job steps, so that the processing time and resource utilization of the next job step fall within the preset range.
In one possible implementation, there are a plurality of jobs to be processed;
each of the computing resources is configured to process job steps in at least one of the jobs to be processed.
In one possible implementation, each of the computing resources corresponds to an execution state, which represents how that computing resource is progressing in processing each job step.
In one possible implementation, the processing of the job to be processed is controlled at the level of individual job steps; job-step-level control comprises any one or more of the following:
instructions to pause, cancel, restart, stop, and rerun a job step.
In a second aspect, there is provided a job processing apparatus including:
the acquisition module is used for acquiring the job to be processed;
the splitting module is used for splitting the job to be processed into a plurality of job steps based on the execution stage of the job to be processed;
and the processing module is used for processing the job steps with a plurality of different computing resources, selected according to the computing resources required to process each job step, to obtain the final processing result of the job to be processed.
In one possible implementation, the processing module includes:
the processing submodule is used for processing the current job step with the current computing resource to obtain a data set for exchanging data between the current job step and the next job step;
the determining submodule is configured to determine the next computing resource according to the computing resources required to process the next job step;
and the scheduling submodule is used for scheduling the exchange data set of the current job step and the next job step to the next computing resource, so that the next computing resource processes the next job step based on the exchange data set.
In one possible implementation, the scheduling sub-module is specifically configured to:
for a plurality of next job steps, determining the next target job step to be scheduled according to the priority order among them;
and scheduling the next target job step to the next computing resource.
In one possible implementation, the storage mode of each data set exchanged between job steps comprises any one of the following:
a cache mode, a message queue mode, a distributed storage mode, and a database mode.
In one possible implementation, the determining submodule is specifically configured to:
adjusting the computing resources required to process the next job step based on the historical processing results of the job step, and determining the next computing resource according to the adjustment result.
In one possible implementation, the historical processing result includes historical processing time consumption and historical processing resource utilization; the determination submodule is further configured to:
when the historical processing time or the historical resource utilization of the job step exceeds a preset range, changing the computing resources required to process the next job step, the scheduling mode of those computing resources, and the data exchange mode between job steps, so that the processing time and resource utilization of the next job step fall within the preset range.
In one possible implementation, there are a plurality of jobs to be processed;
each of the computing resources is configured to process job steps in at least one of the jobs to be processed.
In one possible implementation, each of the computing resources corresponds to an execution state, which represents how that computing resource is progressing in processing each job step.
In one possible implementation, the processing of the job to be processed is controlled at the level of individual job steps; job-step-level control comprises any one or more of the following:
instructions to pause, cancel, restart, stop, and rerun a job step.
In a third aspect, an embodiment of the present application further provides a computer device, including a memory and a processor, where the memory stores a computer program executable on the processor, and the processor implements the method of the first aspect when executing the computer program.
In a fourth aspect, this embodiment of the present application further provides a computer-readable storage medium storing machine executable instructions, which, when invoked and executed by a processor, cause the processor to perform the method of the first aspect.
The embodiment of the application brings the following beneficial effects:
the job processing method, the job processing device and the computer equipment provided by the embodiment of the application can divide the job to be processed into a plurality of job steps based on the execution stage of the job to be processed after the job to be processed can be obtained, then the job steps are respectively processed by utilizing a plurality of different computing resources according to the computing resources required for processing each job step, so as to obtain the final processing result of the job to be processed, in the scheme, the job to be processed is divided into a plurality of job steps according to the processing stage, the job steps are processed by utilizing different computing resources according to the computing resources required for processing each job step, so that each job step obtains more proper computing resources to process the job steps, the more proper computing resources can be matched according to the required resource characteristics of each job step, and the utilization rate of each computing resource is improved, and the processing efficiency of the whole operation to be processed is improved, and the technical problem of low operation processing efficiency caused by low utilization rate of computing resources in the operation processing process is solved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to illustrate the embodiments of the present application or the prior-art solutions more clearly, the drawings needed in the detailed description are briefly introduced below. The drawings described below show some embodiments of the present application; other drawings can be derived from them by those skilled in the art without creative effort.
Fig. 1 is a schematic flowchart of a job processing method according to an embodiment of the present application;
fig. 2 is a schematic diagram of another flowchart of a job processing method according to an embodiment of the present application;
fig. 3 is an example of a data exchange process in a job processing method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a basic architecture of a distributed batch processing system according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of a job processing apparatus according to an embodiment of the present application;
fig. 6 is a schematic structural diagram illustrating a computer device provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "comprising" and "having," and any variations thereof, as referred to in the embodiments of the present application, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may alternatively include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
At present, in the traditional financial industry, the batch processing system plays a key, core role in a bank's core system. Its roles may include: data exchange with peripheral systems, such as credit management and financial management; clearing within the system, such as deposit, loan, and agency withholding clearing; internal business processing, such as interest calculation and accrual, value dating, and automatic roll-over; reconciliation management, such as generating third-party reconciliation files; and sending data to other platforms, such as posting to ledgers and generating financial statements.
The batch processing system computes on the benchmark data of the previous trading day and provides the data and environment required to open the next trading day; its performance and success are therefore directly tied to the opening time of the next trading day, that is, when banking services start.
Meanwhile, as financial information systems evolve, batch processing is also shifting from centralized batch processing to open or distributed batch processing, a shift that involves job scheduling, cache utilization, parallel computing, and so on. Existing batch processing includes the following approaches:
the traditional centralized batch processing scheme utilizes high-performance and high-reliability machines to perform centralized batch processing calculation, such as large machines, small machines matched with high-performance hardware and the like, and is characterized in that the batch processing operation is intensively scheduled and managed by utilizing the high performance and high reliability of resources;
the distributed batch processing scheme based on the database realizes the concurrent scheduling of batch processing jobs in a distributed environment, and the data exchange and calculation of job processing are carried out based on the database, and is characterized in that the job scheduling is concurrent, the data management is centralized, and the requirement on the performance of the database is higher;
the distributed batch processing scheme with diversified data realizes the concurrent scheduling of batch processing jobs in a distributed environment, the data exchange of job processing can be based on a big data computing component, a cache component and a database, and the distributed batch processing scheme is characterized in that the job scheduling takes a job as a unit, and the computation in the job can use the computing capacity and the cache technology of the big data as required, thereby improving the job execution efficiency.
These existing batch processing methods have many disadvantages. The traditional centralized scheme relies solely on high-performance, high-reliability hardware, so the cost is high and the resource idle rate is high. In the database-based distributed scheme, all computation and data exchange depend on the database; the performance requirements on the database are very high, so the scheme suits only scenarios with small data volumes and small batch-job scales. The distributed scheme with diversified data is flexible in computing capacity and data exchange mode, but it schedules with the job as the basic unit, while different jobs, and different execution phases of the same job, place different demands on resources (such as the central processing unit, memory, disk, and network); scheduling at job granularity therefore under-utilizes resources, cannot fully exploit the capacity of the distributed environment, and leaves job processing efficiency low.
Based on this, the embodiments of the present application provide a job processing method, an apparatus, and a computer device, by which the technical problem of low job processing efficiency due to low utilization rate of computing resources in a job processing process can be solved.
Embodiments of the present invention are further described below with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating a job processing method according to an embodiment of the present application. As shown in fig. 1, the method includes:
step S110, a job to be processed is acquired.
The job to be processed may be a job awaiting processing by the batch processing system. Batch processing may take various forms, such as distributed batch processing and centralized batch processing.
Step S120, splitting the job to be processed into a plurality of job steps based on the execution phase of the job to be processed.
The execution phases are the processing stages into which the whole processing procedure of the job to be processed can be divided; each can also be understood as a processing step. A job to be processed may thus be divided into several parts, each part being a job step.
For example, as shown in fig. 2, job1 is split into three job steps, step1, step2, and step3, according to its execution phases; job2 is split into two job steps, step1 and step2; and job3 is split into three job steps, step1, step2, and step3.
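The split in this example can be represented minimally in Python. This is an illustrative sketch only; the class names and the resource fields (cores, memory) are assumptions for illustration, not the patent's data model:

```python
from dataclasses import dataclass, field

@dataclass
class JobStep:
    name: str
    cpu_cores: int   # computing resources this step is expected to need
    memory_mb: int

@dataclass
class Job:
    name: str
    steps: list = field(default_factory=list)

# job1 split into three job steps by execution phase, as in fig. 2;
# the per-step resource figures are invented for illustration
job1 = Job("job1", [
    JobStep("step1", cpu_cores=2, memory_mb=512),
    JobStep("step2", cpu_cores=8, memory_mb=4096),  # compute-heavy phase
    JobStep("step3", cpu_cores=1, memory_mb=256),
])
```

Each step carries its own resource profile, which is what later lets the scheduler match it to a suitable computing resource.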
Step S130, according to the computing resource required for processing each job step, respectively processing the job step by using a plurality of different computing resources, and obtaining a final processing result of the job to be processed.
In practical applications, computation within a job step may use big-data computing components, and different job steps can be handled by different computing resources. As shown in fig. 2, the plurality of different computing resources may be the computing resources used by a plurality of different execution units, such as execution unit 1, execution unit 2, ..., execution unit N, during execution. Each execution unit processes work in a multi-threaded thread-pool mode.
The job to be processed is divided into a plurality of job steps by processing phase, and the job steps are processed with different computing resources according to what each step requires, so each job step is processed on more targeted, more suitable resources. Because more suitable computing resources can be matched to the resource profile each job step requires, the processing efficiency of the whole job improves, as does the utilization of each computing resource.
In the embodiments of the application, the job to be processed can be divided into multi-phase job steps; since each job step places different demands on computing resources, the method provided herein can dispatch each job step to specific computing resources according to its resource demand.
The above steps are described in detail below.
In some embodiments, different job steps may be allocated to different computing resources for processing by scheduling, so that job steps are dispatched to appropriate resources more efficiently. As an example, step S130 may include the following steps:
cyclically performing the following steps until a final processing result of the job to be processed is obtained:
step a), processing the current job step with the current computing resource to obtain a data set for exchanging data between the current job step and the next job step;
step b), determining the next computing resource according to the computing resources required to process the next job step;
and step c), scheduling the exchange data set of the current job step and the next job step to the next computing resource, so that the next computing resource processes the next job step based on the exchange data set.
Illustratively, as shown in fig. 2, when a plurality of job steps are scheduled to different execution units, the basic unit of scheduling is the job step, and when an execution unit is selected, a suitable resource is obtained through resource scheduling. For example, after receiving a job, the scheduling layer schedules it to different execution units in the global scope, that is, schedules the different execution phases of the job to different execution units and invokes different computing resources for computation, thereby implementing asynchronous concurrent scheduling. In practical applications, job execution efficiency and resource utilization can be further improved in cooperation with a resource scheduling model.
It should be noted that the job steps are interconnected during the processing of the job. For example, the processing result of the previous job step may serve as the input of the current job step, and the result of the current job step as the input of the next; these results are the data sets exchanged between job steps, i.e., the data set exchanged between the current job step and the next.
In the embodiments of the application, job-step scheduling is separated from job-step execution, so highly concurrent job-step scheduling can be achieved with low-cost hardware in a distributed environment. By linking job-step scheduling with resource scheduling, job steps can be scheduled to appropriate resources more effectively.
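The cyclic steps a)-c) can be sketched as a short loop. This is an illustrative Python sketch, not the patent's implementation; `pick_resource` and `execute` are hypothetical stand-ins for the resource scheduler and the execution units:

```python
def run_job(steps, pick_resource, execute):
    """Cycle through steps a)-c): process the current job step on the
    current computing resource, pick the next resource according to the
    next step's needs, and hand the exchange data set over to it."""
    dataset = None  # data set exchanged between consecutive job steps
    for step in steps:
        resource = pick_resource(step)              # step b)
        dataset = execute(resource, step, dataset)  # steps a) and c)
    return dataset  # final processing result of the job

# Toy stand-ins: each "resource" is just a label, and each step adds
# its own value to the exchange data it received.
result = run_job(
    steps=[1, 2, 3],
    pick_resource=lambda step: f"unit-{step}",
    execute=lambda res, step, data: (data or 0) + step,
)
```

The point of the shape is that resource selection happens freshly for every step, so consecutive steps of one job can land on different execution units.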
Based on steps a), b) and c), the embodiments of the application can also support priority-based scheduling of job steps. As an example, step c) may include the following steps:
for a plurality of next job steps, determining the next target job step to be scheduled according to the priority order among them;
and scheduling the next target job step to the next computing resource.
In practical applications, the job step scheduling may be performed according to the priority of the job step, for example, the priority scheduling may be supported in cooperation with a priority queue.
Because the next target job step is determined by the priority order among job steps, job steps can be scheduled according to their priority, making the scheduling process more reasonable.
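Picking the next target job step from several candidates by priority maps naturally onto a priority queue; a minimal sketch (the job-step names and numeric priorities below are invented for illustration):

```python
import heapq

# Candidate next job steps; a smaller number means a higher priority.
pending = [(2, "job2.step2"), (1, "job1.step2"), (3, "job3.step2")]
heapq.heapify(pending)

# The next target job step to schedule is the highest-priority candidate.
priority, target = heapq.heappop(pending)
```

Newly ready job steps are pushed with `heapq.heappush`, so the dispatcher always pops the most urgent step first.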
Based on steps a), b) and c), different storage modes can be selected for the data exchanged between job steps according to the data processing characteristics of the job steps. As an example, the storage mode of each data set exchanged between job steps comprises any one of the following:
a cache mode, a message queue mode, a distributed storage mode, and a database mode.
For example, if the storage mode of the data set exchanged between job steps is the cache mode, the data set may be cached in key-value (KV) storage, improving the concurrency of job steps.
In practical applications, as shown in fig. 3, data exchange between job steps may use caching technology, message queues, relational databases, KV databases, file systems, distributed storage, and so on. For example, small files read and written concurrently may be exchanged through Network Attached Storage (NAS) or a distributed file system, and files smaller than 10K may also be stored in KV storage. Concurrent computation on large files may use spark/flink + hdfs. Key result information of a batch job may be stored in a database supporting strong consistency. Streaming and asynchronous computation can also be supported during data exchange, for example with flink + kafka.
With several selectable storage modes, the data sets exchanged between job steps can be stored more pertinently and effectively, improving the concurrency of job steps.
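Choosing among the storage modes can be sketched as a simple decision function. The thresholds echo the examples in the text (KV for items under 10K, a strongly consistent database for key results, flink + kafka-style streaming), but the function itself is an assumption for illustration, not the patent's selection policy:

```python
def choose_storage(size_bytes, strong_consistency=False, streaming=False):
    """Pick a storage mode for the data set exchanged between job steps."""
    if strong_consistency:
        return "database"            # key batch-job result information
    if streaming:
        return "message_queue"       # e.g. flink + kafka style exchange
    if size_bytes < 10 * 1024:
        return "cache"               # items under 10K fit KV storage
    return "distributed_storage"     # large files, e.g. HDFS or NAS
```

A usage example: `choose_storage(4096)` selects the cache mode, while `choose_storage(4096, strong_consistency=True)` falls back to the database mode.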
Based on the above steps a), b) and c), the resource scheduling policy and the job-step data exchange policy can be dynamically adjusted using the running results of historical jobs. As an example, step b) may include the following step:
step d), adjusting the computing resources required to process the next job step based on the historical processing results of the job step, and determining the next computing resource according to the adjustment result.
In the embodiments of the application, the resource allocation on the demand side of job steps can be adjusted using the execution records of historical job steps, such as their time consumption and resource utilization. The data exchange mode between job steps can likewise be adjusted directly from the running statistics of historical batch jobs, so that it too is tuned dynamically.
Based on step d), the historical processing results comprise the historical processing time and the historical resource utilization; step d) may include the following step:
when the historical processing time or the historical resource utilization of the job step exceeds a preset range, changing the computing resources required to process the next job step, the scheduling mode of those computing resources, and the data exchange mode between job steps, so that the processing time and resource utilization of the next job step fall within the preset range.
The calculation resources required for processing the next operation step, the scheduling mode of the calculation resources and the data exchange mode between the operation steps are changed by referring to the historical processing time consumption and the historical processing resource utilization rate of the operation steps, the processing process of the current operation step can be perfected by referring to the historical processing result of each operation step, and the processing process of each operation step is perfected continuously and is more reasonable.
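A minimal sketch of such history-driven adjustment might look as follows. The target time, the utilization range, and the grow/shrink step sizes are illustrative assumptions, not values from the application.

```python
def adjust_resources(alloc: int, elapsed_s: float, utilization: float,
                     target_s: float = 60.0,
                     util_range: tuple = (0.4, 0.8)) -> int:
    """Adjust the compute allocation for the next job step from history.

    Hypothetical policy: scale out when the previous run of the step was
    too slow or its resources too busy; scale in when utilization fell
    below the preset range; otherwise keep the allocation unchanged.
    """
    low, high = util_range
    if elapsed_s > target_s or utilization > high:
        return alloc + max(1, alloc // 2)   # grow by ~50%, at least one unit
    if utilization < low and alloc > 1:
        return alloc - 1                    # shrink one unit at a time
    return alloc                            # within the preset range
```

The scheduler would call this once per step, feeding in the step's recorded time consumption and resource utilization from the job status table.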
Based on steps a), b) and c), there may be a plurality of jobs to be processed; each computing resource is configured to process a job step in at least one of the jobs to be processed.
For example, as shown in fig. 2, there are three jobs to be processed: job1, job2 and job3. Execution unit 1, execution unit 2 and execution unit 3 may each execute job steps belonging to multiple jobs to be processed. Of course, the independent resource scheduling system can scale the computing resources corresponding to the execution units up and down as needed to match the number of job steps to be processed.
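The scale-up/scale-down decision could be sketched as below. The per-unit capacity figure and the pool bounds are hypothetical parameters; the text only states that the pool is sized to the number of pending job steps.

```python
def scale_units(pending_steps: int, steps_per_unit: int = 4,
                min_units: int = 1, max_units: int = 16) -> int:
    """Size the execution-unit pool to the job-step backlog.

    Ceiling division converts the backlog into a unit count, which is
    then clamped to the pool's configured bounds.
    """
    needed = -(-pending_steps // steps_per_unit)   # ceil(pending / per_unit)
    return max(min_units, min(max_units, needed))
```

The resource scheduling system would re-evaluate this whenever job steps are enqueued or completed, adding or releasing execution units accordingly.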
Having each computing resource process job steps from multiple jobs to be processed improves the processing efficiency of the jobs as well as the processing efficiency and utilization rate of each computing resource.
In some embodiments, the system may monitor, in real time, the execution status of each computing resource as it processes the job steps. As one example, each computing resource corresponds to an execution state, which represents the state of that computing resource in processing each job step.
For example, as shown in FIG. 2, a job status table records the status of each job execution phase. When an execution unit suffers a local failure, another execution unit can take over the job step and continue its processing. Through the execution state of each computing resource, the real-time processing status of every job step can be obtained, and unexpected situations during job processing can be handled in time.
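A toy version of such a job status table, assuming a simple per-step state model and round-robin takeover (both are illustrative choices, not details from the application), might look like this:

```python
class JobStatusTable:
    """Minimal job status table: tracks which execution unit runs each
    job step and reassigns running steps off a failed unit so that
    healthy units can take over and continue processing."""

    def __init__(self):
        self.assignments = {}   # step_id -> unit_id
        self.state = {}         # step_id -> "running" | "done"

    def start(self, step_id, unit_id):
        self.assignments[step_id] = unit_id
        self.state[step_id] = "running"

    def finish(self, step_id):
        self.state[step_id] = "done"

    def fail_over(self, failed_unit, healthy_units):
        """Move running steps off a failed unit, round-robin over the
        healthy units; completed steps are left untouched."""
        moved = []
        for i, (step, unit) in enumerate(
                [(s, u) for s, u in self.assignments.items()
                 if u == failed_unit and self.state[s] == "running"]):
            takeover = healthy_units[i % len(healthy_units)]
            self.assignments[step] = takeover
            moved.append((step, takeover))
        return moved
```

In the architecture of fig. 2, the takeover would additionally re-read the step's exchange data set from shared storage before resuming.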
In some embodiments, the job to be processed is managed at the job-step level rather than at the level of the entire job. As an example, the processing of the job to be processed is completed through control at the job-step level; the control content at the job-step level includes any one or more of the following: instructions for pausing, canceling, restarting, stopping, and rerunning a job step.
In the embodiment of the application, scheduling and control at the job-step level, such as pausing, running, stopping, and rerunning individual job steps, are supported, thereby improving job execution efficiency and resource utilization.
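One way to realize step-level control is a small state machine over the listed commands. The allowed-transition table below is an assumption for illustration; the application names the commands but not the exact transitions.

```python
# Which commands a job step accepts in each state (hypothetical table).
ALLOWED = {
    "running":  {"pause", "cancel", "stop"},
    "paused":   {"restart", "cancel", "stop"},
    "stopped":  {"rerun"},
    "done":     {"rerun"},
    "canceled": set(),
}

# State a command moves the job step into.
NEXT_STATE = {"pause": "paused", "cancel": "canceled", "restart": "running",
              "stop": "stopped", "rerun": "running"}

def control(step_state: str, command: str) -> str:
    """Apply a step-level command; reject transitions the table forbids."""
    if command not in ALLOWED[step_state]:
        raise ValueError(f"{command!r} not allowed in state {step_state!r}")
    return NEXT_STATE[command]
```

The access layer's job control component would route each incoming command through such a check before touching the scheduler.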
The basic architecture of a distributed batch processing system corresponding to the job processing method provided by the embodiment of the present application may be as shown in fig. 4. The data exchange forms between the current job step (Job Current Step) and the next job step (Job Next Step) may include distributed storage (node storage), relational database storage (storage unit DB), KV database storage, and message queue storage; the data exchange schema may include streaming as well as batch processing.
The access layer for batch processing includes job management, job control, security management, and the like. Job management covers job buffering, job throttling, job distribution, and the like; job control covers operation commands such as pausing, restarting, canceling, stopping, and rerunning a job, where the specific control object is either the job level or the job-step level; security management covers authority authentication, user authentication, black and white lists, and the like.
The scheduling layer for batch processing includes metadata management, job step scheduling, and resource scheduling. Metadata management mainly covers the job execution states within the job life cycle, the running states of job steps, resource metadata, and the like. Job step scheduling divides a job into a plurality of stages and dispatches those stages to different execution units for running; when selecting an execution unit, appropriate resources are obtained through resource scheduling, and scheduling may additionally follow the priority of the job steps. Resource scheduling includes resource acquisition and a resource scheduling model; it is responsible for managing global resources and provides resource support for running the job steps.
The execution layer of batch processing is the carrier in which job steps actually run; while running a job step, it also uses resource scheduling to obtain public resources. The batch resource layer provides the overall resources of the distributed batch system, including resource partitioning, classification, dynamic resource scaling, and the like. Configuration management includes resource configuration, job run parameter configuration, scheduling policy configuration, static parameter configuration, security configuration, and the like. Monitoring management includes resource monitoring, job step monitoring, and instance monitoring of the access layer, scheduling layer, and execution layer.
As another implementation of the embodiment of the present application, the basic architecture of the distributed batch processing system corresponding to the job processing method may be built on the spring-batch lightweight batch processing framework, integrating a scheduling framework to provide unified scheduling, management, control, and monitoring capabilities, while the security control capability for batch jobs is realized by combining rights management with access control. In addition, in batch processing scenarios with large data volume and high concurrency, a unified access layer and scheduling layer need to be realized, with spring-batch serving as the job management component within an independent partition so as to reduce the pressure on spring-batch; spring-batch also needs to cooperate with a priority queue to support priority scheduling.
Fig. 5 provides a schematic diagram of a job processing apparatus. As shown in fig. 5, the job processing apparatus 500 includes:
an obtaining module 501, configured to obtain a job to be processed;
a splitting module 502, configured to split a job to be processed into multiple job steps based on an execution phase of the job to be processed;
the processing module 503 is configured to process the job steps respectively by using a plurality of different computing resources according to the computing resources required for processing each job step, so as to obtain a final processing result of the job to be processed.
In some embodiments, the processing module comprises:
the processing submodule is configured to process the current job step using the current computing resource, to obtain a data set for data exchange between the current job step and the next job step;
the determining submodule is configured to determine the next computing resource according to the computing resources required for processing the next job step;
and the scheduling submodule is configured to schedule the current job-step exchange data set and the next job step to the next computing resource, so as to process the next job step with the next computing resource based on the current job-step exchange data set.
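The loop the three submodules implement can be sketched as follows. The priority queue realizes the "highest-priority next step first" rule; the sketch assumes each step has a single predecessor (a tree of steps) and leaves `pick_resource` and `process` as caller-supplied hooks, since their internals are not specified here.

```python
import heapq

def run_job(first_steps, successors, priority, process, pick_resource):
    """Drive the step-scheduling loop: process the current job step on its
    computing resource, hand the resulting exchange data set to each next
    step, and when several next steps are ready, dequeue the one with the
    highest priority (smallest key) first."""
    heap = [(priority[s], s, None) for s in first_steps]
    heapq.heapify(heap)
    results = {}
    while heap:
        _, step, exchange = heapq.heappop(heap)
        resource = pick_resource(step)              # next computing resource
        results[step] = process(resource, step, exchange)
        for nxt in successors.get(step, []):        # schedule next job steps
            heapq.heappush(heap, (priority[nxt], nxt, results[step]))
    return results
```

For example, with one root step fanning out into two next steps of different priorities, the higher-priority successor is processed first even though both receive the same exchange data set.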
In some embodiments, the scheduling submodule is specifically configured to:
for a plurality of next job steps, determining the next target job step to be scheduled according to the priority order among the plurality of next job steps;
and scheduling the next target job step to the next computing resource.
In some embodiments, the storage mode of each inter-job-step exchange data set includes any one of:
a cache mode, a message queue mode, a distributed storage mode, and a database mode.
In some embodiments, the determining submodule is specifically configured to:
adjust the computing resources required for processing the next job step based on the historical processing results of the job steps, and determine the next computing resource according to the adjustment result.
In some embodiments, the historical processing results include historical processing time consumption and historical processing resource utilization; the determining submodule is further configured to:
when the historical processing time consumption and historical processing resource utilization of a job step exceed a preset range, change the computing resources required for processing the next job step, the scheduling mode of the computing resources, and the data exchange mode between job steps, so that the processing time consumption and processing resource utilization of the next job step fall within the preset range.
In some embodiments, there are a plurality of jobs to be processed;
each of the computing resources is configured to process the job step in at least one of the jobs to be processed.
In some embodiments, each of the computing resources corresponds to an execution state, which represents the state of that computing resource in processing each of the job steps.
In some embodiments, the processing of the job to be processed is completed through control at the job-step level; the control content at the job-step level comprises any one or more of the following:
instructions for pausing, canceling, restarting, stopping, and rerunning the job step.
The job processing apparatus provided by the embodiment of the present application has the same technical features as the job processing method provided by the above embodiment, so that the same technical problems can be solved, and the same technical effects can be achieved.
As shown in fig. 6, an embodiment of the present application provides a computer device 600, including: a processor 601, a memory 602 and a bus, wherein the memory 602 stores machine-readable instructions executable by the processor 601, when the computer device runs, the processor 601 and the memory 602 communicate with each other through the bus, and the processor 601 executes the machine-readable instructions to execute the steps of the job processing method.
Specifically, the memory 602 and the processor 601 can be general-purpose memories and processors, which are not specifically limited herein, and the job processing method can be executed when the processor 601 executes a computer program stored in the memory 602.
The processor 601 may be an integrated circuit chip having signal processing capability. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 601. The Processor 601 may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; the device can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, a discrete Gate or transistor logic device, or a discrete hardware component. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in the memory 602, and the processor 601 reads the information in the memory 602 and completes the steps of the method in combination with the hardware thereof.
Corresponding to the above job processing method, the present application further provides a computer readable storage medium storing machine executable instructions, which, when invoked and executed by a processor, cause the processor to execute the steps of the above job processing method.
The job processing apparatus provided in the embodiment of the present application may be specific hardware on a device, or software or firmware installed on a device, or the like. The apparatus provided by the embodiment of the present application has the same implementation principle and technical effects as the foregoing method embodiments; for the sake of brevity, where the apparatus embodiments are silent, reference may be made to the corresponding contents of the foregoing method embodiments. It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, apparatuses and units described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
For another example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the job processing method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus once an item is defined in one figure, it need not be further defined and explained in subsequent figures, and moreover, the terms "first", "second", "third", etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that the above-mentioned embodiments are merely specific embodiments of the present application, used to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions for some technical features within the technical scope disclosed in the present application; such modifications, changes or substitutions do not depart from the scope of the embodiments of the present application and are intended to be covered by its protection scope. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (20)

1. A method of job processing, the method comprising:
acquiring a job to be processed;
splitting the job to be processed into a plurality of job steps based on the execution phase of the job to be processed;
and processing the job steps respectively by using a plurality of different computing resources according to the computing resources required for processing each job step, to obtain a final processing result of the job to be processed.
2. The method according to claim 1, wherein the step of processing the job steps respectively with a plurality of different computing resources according to the computing resources required for processing each job step, to obtain the final processing result of the job to be processed, comprises:
performing the following steps cyclically until the final processing result of the job to be processed is obtained:
processing the current operation step by using the current computing resource to obtain a data set for data exchange between the current operation step and the next operation step;
determining a next computing resource according to the computing resource required for processing the next job step;
scheduling the current job step exchange dataset and the next job step to the next computing resource to process the next job step with the next computing resource based on the current job step exchange dataset.
3. The method of claim 2, wherein the step of scheduling the next job step to the next computing resource comprises:
for a plurality of next job steps, determining the next target job step to be scheduled according to the priority order among the plurality of next job steps;
and scheduling the next target job step to the next computing resource.
4. The method according to claim 2, wherein the storage mode of each inter-job-step exchange data set comprises any one of the following:
a cache mode, a message queue mode, a distributed storage mode, and a database mode.
5. The method of claim 2, wherein the step of determining a next computing resource based on the computing resources required to process the next job step comprises:
adjusting the computing resources required for processing the next job step based on the historical processing results of the job steps, and determining the next computing resource according to the adjustment result.
6. The method of claim 5, wherein the historical processing results comprise historical processing time consumption and historical processing resource utilization; the step of adjusting the computing resources required for processing the next job step based on the historical processing result of the job step includes:
when the historical processing time consumption and historical processing resource utilization of a job step exceed a preset range, changing the computing resources required for processing the next job step, the scheduling mode of the computing resources, and the data exchange mode between job steps, so that the processing time consumption and processing resource utilization of the next job step fall within the preset range.
7. The method according to claim 2, wherein there are a plurality of jobs to be processed;
each of the computing resources is configured to process the job step in at least one of the jobs to be processed.
8. The method according to any one of claims 1 to 7, wherein each of the computing resources corresponds to an execution state, and the execution state is used for representing an execution state of the computing resource for processing each of the job steps.
9. The method according to claim 1, wherein the processing of the job to be processed is done through control at the job-step level; the control content at the job-step level comprises any one or more of the following:
instructions for pause, cancel, restart, stop, and rerun of the job step.
10. A job processing apparatus, comprising:
the acquisition module is used for acquiring the operation to be processed;
the splitting module is used for splitting the job to be processed into a plurality of job steps based on the execution stage of the job to be processed;
and the processing module is used for respectively processing the operation steps by utilizing a plurality of different computing resources according to the computing resources required for processing each operation step to obtain the final processing result of the operation to be processed.
11. The apparatus of claim 10, wherein the processing module comprises:
the processing submodule is used for processing the current operation step by using the current computing resource to obtain a data set for performing data exchange between the current operation step and the next operation step;
a determining submodule, configured to determine a next computing resource according to the computing resource required for processing a next job step;
and the scheduling submodule is used for scheduling the current job step-by-step exchange data set and the next job step to the next computing resource so as to process the next job step by using the next computing resource based on the current job step-by-step exchange data set.
12. The apparatus of claim 11, wherein the scheduling submodule is specifically configured to:
for a plurality of next job steps, determining the next target job step to be scheduled according to the priority order among the plurality of next job steps;
and scheduling the next target job step to the next computing resource.
13. The apparatus according to claim 11, wherein the storage mode of each inter-job-step exchange data set comprises any one of the following:
a cache mode, a message queue mode, a distributed storage mode, and a database mode.
14. The apparatus of claim 11, wherein the determining submodule is specifically configured to:
adjust the computing resources required for processing the next job step based on the historical processing results of the job steps, and determine the next computing resource according to the adjustment result.
15. The apparatus of claim 14, wherein the historical processing results comprise historical processing time consumption and historical processing resource utilization; the determining submodule is further configured to:
when the historical processing time consumption and historical processing resource utilization of a job step exceed a preset range, change the computing resources required for processing the next job step, the scheduling mode of the computing resources, and the data exchange mode between job steps, so that the processing time consumption and processing resource utilization of the next job step fall within the preset range.
16. The apparatus according to claim 11, wherein there are a plurality of jobs to be processed;
each of the computing resources is configured to process the job step in at least one of the jobs to be processed.
17. The apparatus according to any one of claims 10 to 16, wherein each of the computing resources corresponds to an execution state, and the execution state is used to indicate an execution state of the computing resource for processing each of the job steps.
18. The apparatus according to claim 10, wherein the processing of the job to be processed is performed through control at the job-step level; the control content at the job-step level comprises any one or more of the following:
instructions for pause, cancel, restart, stop, and rerun of the job step.
19. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 9 when executing the computer program.
20. A computer readable storage medium having stored thereon machine executable instructions which, when invoked and executed by a processor, cause the processor to execute the method of any of claims 1 to 9.
CN202010319530.8A 2020-04-21 2020-04-21 Job processing method and device and computer equipment Pending CN111552547A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010319530.8A CN111552547A (en) 2020-04-21 2020-04-21 Job processing method and device and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010319530.8A CN111552547A (en) 2020-04-21 2020-04-21 Job processing method and device and computer equipment

Publications (1)

Publication Number Publication Date
CN111552547A true CN111552547A (en) 2020-08-18

Family

ID=71998327

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010319530.8A Pending CN111552547A (en) 2020-04-21 2020-04-21 Job processing method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN111552547A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112037017A (en) * 2020-09-01 2020-12-04 中国银行股份有限公司 Method, device and equipment for determining batch processing job evaluation result

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104750549A (en) * 2015-04-13 2015-07-01 飞狐信息技术(天津)有限公司 Computational task processing device, method and system
CN105511957A (en) * 2014-09-25 2016-04-20 国际商业机器公司 Method and system for generating work alarm
CN108874520A (en) * 2018-06-06 2018-11-23 成都四方伟业软件股份有限公司 Calculation method and device
CN109298948A (en) * 2018-10-31 2019-02-01 北京国信宏数科技有限责任公司 Distributed computing method and system
CN109298940A (en) * 2018-09-28 2019-02-01 考拉征信服务有限公司 Calculation task allocating method, device, electronic equipment and computer storage medium
CN109933422A (en) * 2017-12-19 2019-06-25 北京京东尚科信息技术有限公司 Method, apparatus, medium and the electronic equipment of processing task
CN110187960A (en) * 2019-04-23 2019-08-30 广东省智能制造研究所 A kind of distributed resource scheduling method and device
CN110609749A (en) * 2019-09-06 2019-12-24 阿里巴巴集团控股有限公司 Distributed task operation method, system and equipment



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination