CN117093372A - Data batch running system, method, equipment and storage medium - Google Patents


Info

Publication number
CN117093372A
CN117093372A
Authority
CN
China
Prior art keywords
batch
task
data
running
target
Prior art date
Legal status
Pending
Application number
CN202311076653.3A
Other languages
Chinese (zh)
Inventor
杨晨
李杨
周锋
曹闯
杨得力
孙喜锋
李响
廖艺
Current Assignee
Henan Zhongyuan Consumption Finance Co ltd
Original Assignee
Henan Zhongyuan Consumption Finance Co ltd
Priority date
Filing date
Publication date
Application filed by Henan Zhongyuan Consumption Finance Co ltd
Priority to CN202311076653.3A
Publication of CN117093372A
Pending legal status


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/548Queue

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a data batch running system, method, device and storage medium, relating to the technical field of data batch running. The system comprises: a task judging interface for judging whether a current target running batch task has a downstream dependent task; a first data batch running module for executing, when the target running batch task has no downstream dependent task, the running batch process of the corresponding metadata and marking the task as preliminary running batch completion; a judging module for judging whether a data query request is acquired; a second data batch running module for executing, when the data query request is acquired, the corresponding target running batch task marked as preliminary running batch completion through a preset buffer queue, so as to screen target data out of the corresponding running batch data; and a third data batch running module for running, when no data query request is acquired, the corresponding target running batch task based on a preset data running batch time period. In this way, resource contention during the running batch peak period is relieved and running batch efficiency is improved.

Description

Data batch running system, method, equipment and storage medium
Technical Field
The present application relates to the technical field of data batch running, and in particular to a data batch running system, method, device and storage medium.
Background
With the advent of the digital age, enterprise data volumes multiply year by year. A data warehouse has more and more tasks to run at night every day, and these tasks consume more and more resources. The growth of tasks exceeds the resource bottleneck, so newly added important tasks can hardly finish within the specified time; meanwhile, many existing tasks consume a large amount of resources every day even though the demander does not need fresh data every day, which wastes resources. The methods commonly used at present are as follows. On one hand, for fast data growth under limited resources, servers are added or the batch running logic is optimized: the former increases running queue resources, while the latter reduces the resource usage of a single task. However, the increase of server resources is limited, and optimizing the logic consumes a great deal of manpower, so each item means a huge cost investment; even if servers are added so that the overall resources exceed the batch running demand, the demand during the peak period still cannot be met, and part of the jobs cannot be completed within the set time. On the other hand, for the requirement that a running batch be completed within a prescribed time, the common way is precise time control and dependency control, which guarantees that the running batch is started immediately once the time condition and the dependencies are satisfied; but at the moment when resource contention becomes more intense, this guarantee fails and the job still cannot complete within the specified time.
Therefore, how to reasonably utilize running queue resources and improve data batch running efficiency is a problem to be solved in the field.
Disclosure of Invention
Accordingly, the present application is directed to a data batch running system, method, device and storage medium, which can reduce resource contention during the data batch running peak period and improve data batch running efficiency. The specific scheme is as follows:
in a first aspect, the present application provides a data batch running system, comprising:
the task judging interface is used for judging whether the current target running batch task has a downstream dependent task or not;
the first data batch module is used for executing a batch process of corresponding metadata based on the target batch task when the target batch task does not have a downstream dependent task, and marking the target batch task as a preliminary batch completion;
the judging module is used for judging whether a data query request aiming at the target running batch task is acquired or not;
the second data batch module is used for, when a data query request for the target running batch task is acquired, executing the target running batch task marked as preliminary running batch completion through a preset buffer queue, so as to screen out target data corresponding to the data query request from the corresponding running batch data;
and the third data batch module is used for, when the data query request for the target running batch task is not acquired, running the target running batch task marked as preliminary running batch completion based on a preset data running batch time period, so as to complete the running batch process of the related data.
Optionally, the task judging interface includes:
the dependent task acquisition unit is used for acquiring a data batch running task with a downstream dependent task through a preset human-computer interaction interface;
the list generation unit is used for generating a task list according to the data batch running task to obtain a target dependent task list;
and the task judging unit is used for judging whether the target running batch task exists in the target dependent task list, so as to judge whether the target running batch task has a downstream dependent task.
Optionally, the first data batch module includes:
the task identification unit is used for identifying whether the target running batch task is marked as a non-timely view type or not when the target running batch task does not have a downstream dependent task; the non-timely view type is a task type marked through a preset interface;
and the first data running batch unit is used for executing the running batch process of the corresponding metadata based on the target running batch task when the target running batch task is marked as a non-timely view type, and marking the target running batch task as a preliminary running batch completion.
Optionally, the second data batch module includes:
the query request acquisition unit is used for acquiring a data query request aiming at the target running batch task;
the request judging unit is used for judging whether the target running batch task corresponding to the data query request is marked as preliminary running batch completion or not;
the second data batch running unit is used for executing the target batch running task through a preset buffer queue when the target batch running task is marked as the preliminary batch running completion, so as to obtain corresponding first batch running data;
and the first data query unit is used for screening target data corresponding to the data query request from the first running data.
Optionally, the system further comprises:
the third data batch unit is used for executing the target batch task to obtain second batch data when the target batch task has a downstream dependent task so as to complete a batch process of related data;
and the fourth data batch unit is used for executing the target batch task to obtain third batch data so as to complete the batch process of the related data when the target batch task does not have a downstream dependent task and the target batch task is not marked as a non-timely view type.
Optionally, the system further comprises:
and the second data query unit is used for screening out target data corresponding to the data query request from the second running batch data or the third running batch data.
Optionally, the third data batch module includes:
the task state identification unit is used for identifying whether the target running batch task is in a state marked as preliminary running batch completion currently or not based on a preset data running batch time period under the condition that a data query request for the target running batch task is not acquired;
and the fifth data batch unit is used for executing the target batch task when the target batch task is in a state marked as the completion of the preliminary batch so as to complete the batch process of the related data.
In a second aspect, the present application provides a data batch method, including:
judging whether a downstream dependent task exists in the current target running batch task or not;
when the target running batch task does not have a downstream dependent task, executing a running batch process of corresponding metadata based on the target running batch task, and marking the target running batch task as a preliminary running batch completion;
judging whether a data query request aiming at the target running batch task is acquired or not;
when a data query request for the target running batch task is acquired, executing the target running batch task marked as preliminary running batch completion through a preset buffer queue, so as to screen out target data corresponding to the data query request from the corresponding running batch data;
and when the data query request for the target running batch task is not acquired, running the target running batch task marked as preliminary running batch completion based on a preset data running batch time period, so as to complete the running batch process of the related data.
In a third aspect, the present application provides an electronic device, comprising:
a memory for storing a computer program;
and a processor for executing the computer program to implement the data batch running method as described above.
In a fourth aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the data batch running method as described above.
Therefore, in the present application, the task judging interface judges whether the current target running batch task has a downstream dependent task; when the target running batch task has no downstream dependent task, the first data batch module executes the running batch process of the corresponding metadata and marks the target running batch task as preliminary running batch completion; the judging module judges whether a data query request for the target running batch task is acquired; when such a data query request is acquired, the second data batch module executes the target running batch task marked as preliminary running batch completion through a preset buffer queue, so as to screen out target data corresponding to the data query request from the corresponding running batch data; and when no such data query request is acquired, the third data batch module runs the target running batch task marked as preliminary running batch completion based on a preset data running batch time period, so as to complete the running batch process of the related data. In this way, only the metadata of the target running batch task is run according to its actual demand, and the full task is executed through the preset buffer queue only when a related data query request is subsequently received, which reduces resource preemption during the data running batch peak period and improves the execution efficiency of other running batch tasks; in addition, the target running batch task can be run in a relatively idle time based on the preset data running batch time period, so that running batch queue resources are reasonably allocated and data running batch efficiency is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic structural diagram of a data batch running system disclosed in the present application;
FIG. 2 is a flow chart of a data batch running method disclosed in the present application;
FIG. 3 is a flow chart of a specific data batch running method disclosed in the present application;
FIG. 4 is a flow chart of another specific data batch running method disclosed in the present application;
FIG. 5 is a flow chart of a specific running batch data query method disclosed in the present application;
FIG. 6 is a flow chart of a specific data supplementary running batch method disclosed in the present application;
FIG. 7 is a schematic diagram of data running batch resource allocation disclosed in the present application;
FIG. 8 is a structural block diagram of an electronic device disclosed in the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Referring to fig. 1, an embodiment of the present application discloses a data batch system, including:
the task judging interface 11 is configured to judge whether a downstream dependent task exists in the current target running task.
In the present application, when data batch running is performed, the task judging interface first judges whether the current target running batch task has a downstream dependent task. It can be understood that the data corresponding to a running batch task with a downstream dependent task has a higher utilization rate, and correspondingly, the data corresponding to a running batch task without a downstream dependent task may not be used for a period of time; therefore, the present application distinguishes the current target running batch task according to whether a downstream dependent task exists. In a specific embodiment, the task judging interface may include: a dependent task acquisition unit, configured to acquire data running batch tasks that have downstream dependent tasks through a preset human-computer interaction interface; a list generation unit, configured to generate a task list from these data running batch tasks to obtain a target dependent task list; and a task judging unit, configured to judge whether the target running batch task exists in the target dependent task list, so as to judge whether the target running batch task has a downstream dependent task. Specifically, the dependent task acquisition unit acquires, through the preset human-computer interaction interface, the data running batch tasks that have downstream dependent tasks, and the list generation unit integrates them into a task list serving as the target dependent task list; then, when a data running batch task needs to be executed, the task judging unit judges whether the target running batch task to be executed currently exists in the target dependent task list. If so, the target running batch task has a downstream dependent task, which indicates that the related running batch data is needed promptly and the running batch must be completed in time; correspondingly, if the target running batch task has no downstream dependent task, the related running batch data does not need to be used immediately.
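The application gives no code for this list-based check; purely as an illustration of the idea, the following is a minimal Python sketch of maintaining and consulting a target dependent task list. The class and function names (DependencyRegistry, has_downstream_dependent, and the example task IDs) are assumptions made for the example and do not appear in the application.

    # Minimal sketch (illustrative only): a target dependent task list collected
    # from a human-computer interaction interface, and a membership check that
    # decides whether a target batch task has a downstream dependent task.
    class DependencyRegistry:
        def __init__(self):
            # Set of task IDs declared as having downstream dependent tasks.
            self._dependent_task_list = set()

        def register_dependent_task(self, task_id: str) -> None:
            # Called for every batch task declared (e.g. via a UI) as having
            # at least one downstream dependent task.
            self._dependent_task_list.add(task_id)

        def has_downstream_dependent(self, task_id: str) -> bool:
            # A task found in the list must be run in full and on time;
            # a task not in the list is a candidate for a metadata-only run.
            return task_id in self._dependent_task_list

    # Example usage with hypothetical task IDs:
    registry = DependencyRegistry()
    registry.register_dependent_task("daily_loan_balance")
    print(registry.has_downstream_dependent("daily_loan_balance"))     # True
    print(registry.has_downstream_dependent("monthly_report_detail"))  # False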
The first data batch module 12 is configured to execute a batch process of corresponding metadata based on the target batch task when the target batch task does not have a downstream dependent task, and mark the target batch task as a preliminary batch completion.
Correspondingly, when the target running batch task has no downstream dependent task, the first data batch module executes the running batch process of the metadata corresponding to the target running batch task and marks the target running batch task as preliminary running batch completion. It can be understood that when the target running batch task has no downstream dependent task, the running batch data corresponding to it has a low use demand or low importance; at this time, only the metadata of the target running batch task is run, so that the saved running batch queue resources are left to other running batch tasks. In a specific embodiment, the first data batch module may include: a task identification unit, configured to identify, when the target running batch task has no downstream dependent task, whether the target running batch task is marked as a non-timely view type, the non-timely view type being a task type marked through a preset interface; and a first data running batch unit, configured to execute the running batch process of the corresponding metadata and mark the target running batch task as preliminary running batch completion when the target running batch task is marked as the non-timely view type. Specifically, even a data running batch task without a downstream dependent task may still have a high use demand, for example a strict requirement on the feedback time of the query result of a data query request; in that case the corresponding running batch task can be marked as a timely view type, meaning that it must be run in time to meet the feedback-time requirement. Correspondingly, a running batch task with a low use demand can be marked as the non-timely view type; a staff member can mark such data running batch tasks through the preset interface. The task identification unit then identifies the type of the current target running batch task, and if the target running batch task is marked as the non-timely view type, the first data running batch unit executes the running batch process of the corresponding metadata and marks the target running batch task as preliminary running batch completion.
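As a rough illustration of the decision logic just described (downstream dependency first, then the timely/non-timely view mark), the sketch below combines both checks in Python. The names run_metadata_only, run_full, TaskMeta and the state strings are assumptions for the example, not identifiers from the application.

    from dataclasses import dataclass

    # States a batch task can be marked with in this sketch.
    LAZY_DONE = "preliminary_run_complete"   # only metadata has been run
    FULL_DONE = "run_complete"               # real data has been run

    @dataclass
    class TaskMeta:
        task_id: str
        has_downstream_dependent: bool
        timely_view: bool          # True = results must be queryable immediately
        state: str = "pending"

    def run_metadata_only(task: TaskMeta) -> None:
        # Placeholder for the metadata-only run: table/partition definitions
        # are created while the heavy data processing is skipped.
        task.state = LAZY_DONE

    def run_full(task: TaskMeta) -> None:
        # Placeholder for the ordinary, real-data batch run.
        task.state = FULL_DONE

    def dispatch(task: TaskMeta) -> None:
        # A task with a downstream dependent, or one marked as timely view,
        # is run in full right away; everything else is run lazily.
        if task.has_downstream_dependent or task.timely_view:
            run_full(task)
        else:
            run_metadata_only(task)

    # Example usage:
    t = TaskMeta("monthly_report_detail", has_downstream_dependent=False, timely_view=False)
    dispatch(t)
    print(t.state)   # preliminary_run_complete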
And the judging module 13 is used for judging whether a data query request aiming at the target running batch task is acquired.
It should be noted that the data obtained after a running batch is not necessarily queried and used. Because the present application includes data running batch tasks for which only the metadata has been run, the judging module needs to determine whether a data query request for a running batch task marked as preliminary running batch completion (i.e. whose real running batch data has not yet been produced) is received.
And the second data running batch module 14 is configured to, when a data query request for the target running batch task is acquired, execute the target running batch task marked as preliminary running batch completion through a preset buffer queue, so as to screen out the target data corresponding to the data query request from the corresponding running batch data.
In the present application, when the data query request for the target running batch task is acquired, the target running batch task itself has not yet been executed; only the relevant metadata has been run (the task is marked as preliminary running batch completion). Therefore, the real data of the target running batch task needs to be run through the preset buffer queue to obtain the real running batch data, from which the target data corresponding to the data query request can then be screened out. In a specific embodiment, the second data batch module may include: a query request acquisition unit, configured to acquire the data query request for the target running batch task; a request judging unit, configured to judge whether the target running batch task corresponding to the data query request is marked as preliminary running batch completion; a second data running batch unit, configured to execute the target running batch task through the preset buffer queue when the target running batch task is marked as preliminary running batch completion, so as to obtain corresponding first running batch data; and a first data query unit, configured to screen out the target data corresponding to the data query request from the first running batch data. Specifically, the query request acquisition unit acquires the data query request for the target running batch task, and the request judging unit judges whether the corresponding target running batch task is marked as preliminary running batch completion. It can be understood that within one working day there may be multiple query requests for the same part of the data; when a data query is performed for the first time on a target running batch task marked as preliminary running batch completion, the target running batch task is executed through the preset buffer queue, the first running batch data corresponding to the task is obtained, and the data running batch process is completed. It should be noted that the preset buffer queue can be a separate queue that executes only target running batch tasks marked as preliminary running batch completion. In this way, when the data query request is received, the corresponding target running batch task can be executed in time and the target data corresponding to the query request obtained.
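To make the query-time behaviour concrete, here is a self-contained sketch in which a queue.Queue stands in for the preset buffer queue and in-memory dicts stand in for the task states and result sets; all names and the worker structure are assumptions for illustration, not the application's implementation.

    import queue
    import threading

    buffer_queue = queue.Queue()        # plays the role of the preset buffer queue
    result_sets = {}                    # task_id -> rows produced by the real run
    task_states = {"monthly_report_detail": "preliminary_run_complete"}

    def execute_real_run(task_id):
        # Placeholder for the heavy batch processing of the task.
        return [{"task": task_id, "value": 42}]

    def buffer_worker():
        # Dedicated worker draining the buffer queue; it only ever executes tasks
        # previously marked as preliminary running batch completion.
        while True:
            task_id = buffer_queue.get()
            result_sets[task_id] = execute_real_run(task_id)
            task_states[task_id] = "run_complete"
            buffer_queue.task_done()

    def query(task_id, predicate):
        # The first query against a lazily run task triggers its real run, then the
        # target data is screened out of the freshly produced running batch data.
        if task_states.get(task_id) == "preliminary_run_complete":
            buffer_queue.put(task_id)
            buffer_queue.join()         # wait for the buffer queue to finish the run
        return [row for row in result_sets.get(task_id, []) if predicate(row)]

    threading.Thread(target=buffer_worker, daemon=True).start()
    print(query("monthly_report_detail", lambda row: row["value"] > 0))

A separate queue is used here precisely so that the on-demand run does not compete with the ordinary running batch queues, which is the design point the paragraph above makes.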
In a specific embodiment, the system may further include: a third data running batch unit, configured to execute the target running batch task to obtain second running batch data when the target running batch task has a downstream dependent task, so as to complete the running batch process of the related data; and a fourth data running batch unit, configured to execute the target running batch task to obtain third running batch data, so as to complete the running batch process of the related data, when the target running batch task has no downstream dependent task and is not marked as the non-timely view type. Specifically, when data batch running is performed, if the current target running batch task has a downstream dependent task, the third data running batch unit directly executes the target running batch task to obtain the second running batch data, thereby completing its running batch process. Correspondingly, if the current target running batch task has no downstream dependent task but is not marked as the non-timely view type (i.e. it is marked as the timely view type), the fourth data running batch unit executes the target running batch task to obtain the third running batch data and completes its running batch process. It can be understood that in yet another specific embodiment the system may further include: a second data query unit, configured to screen out the target data corresponding to a data query request from the second running batch data or the third running batch data. Specifically, when a subsequent data query request targets the second running batch data or the third running batch data, the corresponding target data can be screened out directly.
And the third data batch module 15 is configured to, when the data query request for the target running batch task is not acquired, run the target running batch task marked as preliminary running batch completion based on a preset data running batch time period, so as to complete the running batch process of the related data.
In the present application, it can be understood that a data running batch task that was marked as preliminary running batch completion in the above steps but was not used (queried) within the preset data running batch time period needs to be run uniformly by the third data batch module when the period ends (or during idle time), so as to guarantee the continuity of all running batch data. In a specific embodiment, the third data batch module may include: a task state identification unit, configured to identify, based on the preset data running batch time period and when no data query request for the target running batch task is acquired, whether the target running batch task is currently in the state marked as preliminary running batch completion; and a fifth data running batch unit, configured to execute the target running batch task when it is in the state marked as preliminary running batch completion, so as to complete the running batch process of the related data. Specifically, when no data query request for the target running batch task is acquired, the task state identification unit identifies, based on the preset data running batch time period, all data running batch tasks that are currently at preliminary running batch completion, and the fifth data running batch unit then executes these running batch tasks, thereby completing the running batch process of the related data.
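A sketch of this end-of-window compensation: once the preset data running batch time period is over (18:00 in the later embodiments), every task still marked as preliminary running batch completion is run in full. The 18:00 cutoff is taken from the embodiments below; the helper names and the simple time comparison used as the scheduling mechanism are assumptions for the example.

    from datetime import datetime, time

    RUN_WINDOW_END = time(18, 0)   # assumed end of the preset data running batch time period

    def execute_real_run(task_id):
        # Placeholder for the real-data batch run of the task.
        pass

    def compensate_lazy_tasks(task_states, now=None):
        # Run every task that is still only at preliminary running batch completion,
        # once the preset time period is over; return the IDs that were made up.
        now = now or datetime.now()
        if now.time() < RUN_WINDOW_END:
            return []   # still inside the window: a query may yet trigger the run
        compensated = []
        for task_id, state in list(task_states.items()):
            if state == "preliminary_run_complete":
                execute_real_run(task_id)
                task_states[task_id] = "run_complete"
                compensated.append(task_id)
        return compensated

    # Example: one lazy task left unqueried all day is made up after 18:00.
    states = {"monthly_report_detail": "preliminary_run_complete",
              "daily_loan_balance": "run_complete"}
    print(compensate_lazy_tasks(states, now=datetime(2023, 8, 25, 18, 5)))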
Therefore, the present application can perform only the running batch process of the corresponding metadata for a current target running batch task that has no downstream dependent task, saving queue resources during the running batch peak period, and can run the task in time through the preset buffer queue once the running batch data corresponding to the target running batch task is about to be used, so that normal use of the running batch data is not affected; afterwards, according to the preset data running batch time period, the target running batch task marked as preliminary running batch completion can be executed in idle time, guaranteeing the continuity of the data.
As shown in fig. 2, an embodiment of the present application discloses a data batch running method, which includes:
and S11, judging whether the current target running batch task has a downstream dependent task or not.
In the present application, when data batch running is performed, the actual use demands of different running batch tasks need to be distinguished, and whether the current target running batch task needs to be run in time can be judged according to whether a downstream dependent task exists. In other words, the result data of many data running batch tasks are not necessarily used on the same day; for example, for a reporting requirement, if the report date falls on a rest day, it is likely that no business staff will check the data, and even when the data does need to be checked, the requirement on the timeliness of the query feedback is not high. Thus, the actual use demands of different data running batch tasks can be distinguished.
And step S12, executing a running batch process of corresponding metadata based on the target running batch task when the target running batch task does not have a downstream dependent task, and marking the target running batch task as a preliminary running batch completion.
In the present application, when the current target running batch task has no downstream dependent task, its importance is lower than that of a data running batch task with a downstream dependent task; therefore, during the running batch peak period, only the metadata corresponding to the target running batch task is run, and the target running batch task is marked as preliminary running batch completion.
And step S13, judging whether a data query request aiming at the target running batch task is acquired.
Further, in terms of business use of the running batch data, in most cases the data corresponding to a target running batch task that was marked as preliminary running batch completion according to the foregoing steps will not be checked; nevertheless, a data query request for the data corresponding to that target running batch task may still occur. In this case, it can be determined at any time whether an acquired data query request is aimed at a target running batch task marked as preliminary running batch completion.
Step S14, when a data query request for the target running batch task is obtained, executing the preliminary running batch to complete the corresponding target running batch task through a preset buffer queue so as to screen target data corresponding to the data query request from corresponding running batch data.
In the present application, if a data query request is acquired for a target running batch task marked as preliminary running batch completion, that task has only completed the running batch process of its metadata and has not run the real data; therefore, the target running batch task can be executed through the preset buffer queue to complete the running batch process of the related data and obtain the corresponding running batch data, from which the target data corresponding to the data query request can then be screened out.
And step S15, when the data query request for the target running batch task is not acquired, running the target running batch task marked as preliminary running batch completion based on a preset data running batch time period, so as to complete the running batch process of the related data.
Further, if the business use of the running batch data does not involve the data running batch tasks marked as preliminary running batch completion, that is, no data query request for the target running batch task is acquired, the real data of those tasks can be run uniformly according to the preset data running batch time period, so as to complete the running batch process of the related tasks.
In a specific embodiment, as shown in fig. 3, when the upstream dependent tasks are completed and the running batch condition of a target task is triggered, it is first judged whether the target task has a downstream dependent task; if so, the real data is run directly for the target task and the running batch task is completed directly. Correspondingly, if the target task has no downstream dependent task, the target task can be run in a "lazy" manner (only its metadata is run) and recorded as "lazy" (an identification of preliminary running batch completion is added). Then, before 18:00 on the same day (i.e. within the preset data running batch time period), if a data query request for the target task is acquired, that is, the running batch data corresponding to the target task is about to be used, a running batch state query interface is triggered at the time of use; if the target task is in the "lazy" state, the running batch process of the real data is performed for the target task (through the preset buffer queue), and the query data corresponding to the data query request is returned from the running batch data. Correspondingly, if the running batch data corresponding to the target task is not used before 18:00 on the same day, an idle-time supplementary run is performed at 18:00, that is, the real data of the target task marked as preliminary running batch completion is run, so as to make up the related running batch data. It should be noted that, by presetting the buffer queue (a resource queue that is kept idle at all times), the efficiency of running the target task should be able to meet the business requirement.
In another specific embodiment, as shown in fig. 4, if the running batch time and the dependency condition (the upstream dependent tasks) of the target task are currently met, it is judged whether the target task has a downstream dependent task; if so, the real data is run directly. Correspondingly, if the target task has no downstream dependent task, it is further judged whether the target task is marked as the non-timely view type; if not, the real data is run directly, and if so, the target task is run lazily, that is, only the related metadata is run and the real data run is skipped.
In yet another specific embodiment, as shown in FIG. 5, the present application uses a UDF (User-Defined Function) to encapsulate the call to the running batch query interface. When data is queried with SQL (Structured Query Language), the UDF triggers a running batch state query: it is first judged whether the target task corresponding to the data to be queried is in the "lazy" running batch state; if not, the data result corresponding to the current query request is returned from the current result set (which stores all the running batch data). Correspondingly, if the target task corresponding to the data to be queried is in the "lazy" running batch state, the task is submitted to the preset buffer queue for data running batch, and the user is prompted to wait for the query result; the data produced by the running batch is then written into the result set, and the data result corresponding to the current query request is screened out from the result set.
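The application does not give the UDF's code. Purely as an illustration of the flow in FIG. 5, the self-contained sketch below mimics the same "check lazy state, complete the run, then query the result set" sequence on the Python side, with SQLite standing in for the warehouse; in the real system the state check is triggered from inside the SQL by the UDF and the task is submitted to the preset buffer queue. The table, task ID and function names are hypothetical.

    import sqlite3

    task_states = {"report_detail": "preliminary_run_complete"}

    def execute_real_run(task_id, connection):
        # Stand-in for the real batch run: writes the task's result set into a table.
        connection.execute("CREATE TABLE IF NOT EXISTS report_detail (dt TEXT, amount REAL)")
        connection.execute("INSERT INTO report_detail VALUES ('2023-08-25', 100.0)")
        task_states[task_id] = "run_complete"

    def query_with_state_check(connection, task_id, sql):
        # If the backing task is still "lazy", complete its real run first,
        # then execute the caller's SQL against the now-populated result set.
        if task_states.get(task_id) == "preliminary_run_complete":
            print(f"{task_id} is in the lazy state; running it now, please wait for the query result...")
            execute_real_run(task_id, connection)
        return connection.execute(sql).fetchall()

    conn = sqlite3.connect(":memory:")
    print(query_with_state_check(conn, "report_detail",
                                 "SELECT * FROM report_detail WHERE dt = '2023-08-25'"))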
Further, as shown in fig. 6, the present application can perform a fallback supplementary run for data running batch tasks according to the preset data running batch time period. Specifically, all running batch tasks can be traversed at 18:00 every day to check whether there are "lazy" running batch jobs; if so, it is further judged whether the real running batch process of the task has already been completed. If not, it means that at the current point in time the task has only completed its metadata running batch and has not run the real data; the task then needs to be run in idle time to complete all data of the data running batch task.
As shown in FIG. 7, in terms of running batch queue resources and running batch time arrangement, the present application executes data running batch tasks at night, and it can be understood that some of these tasks are "lazy" runs; during the daytime, a fast resource queue is reserved so that "lazy" tasks can be run in time when a query demand arises; and during the idle time in the evening, supplementary data runs are performed for the "lazy" tasks that had no query demand during the day. In this way, data running batch queue resources can be used reasonably and data running batch efficiency can be improved.
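FIG. 7 is only described in prose; as one possible reading of that description, the daily arrangement could be expressed as a small configuration like the one below. The queue names and time windows are assumptions for illustration, not values taken from the application.

    # Hypothetical queue/time-window arrangement reflecting the description of FIG. 7.
    RESOURCE_PLAN = [
        # (window,         queue,               what runs there)
        ("00:00-07:00", "night_batch_queue", "regular nightly batch runs; lazy tasks run metadata only"),
        ("07:00-18:00", "buffer_queue",      "kept idle; lazy tasks run on demand when a query arrives"),
        ("18:00-24:00", "idle_makeup_queue", "supplementary runs for lazy tasks never queried that day"),
    ]

    for window, queue_name, purpose in RESOURCE_PLAN:
        print(f"{window:<12} {queue_name:<18} {purpose}")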
Therefore, the present application can choose between a lazy run and a real data run according to the different demands of data running batch tasks. This relieves the preemption pressure on the resource queues during the data running batch peak period; the real data run of a lazily run task is performed through the preset buffer queue, so normal use of the running batch data is not affected; and the queue resources for data running batch are used reasonably, improving data running batch efficiency without affecting the use of the running batch data.
Further, an embodiment of the present application also discloses an electronic device. FIG. 8 is a structural block diagram of an electronic device 20 according to an exemplary embodiment, and the content of the figure should not be construed as limiting the scope of application of the present application in any way.
Fig. 8 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein the memory 22 is configured to store a computer program that is loaded and executed by the processor 21 to implement the relevant steps of the data batch method disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be specifically an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, and the like, and the storage may be temporary storage or permanent storage.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, Netware, Unix, Linux, or the like. In addition to the computer program that performs the data batch running method executed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs for performing other specific tasks.
Further, the present application also discloses a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data batch running method disclosed above. For the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant points refer to the description of the method section.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should also be noted that relational terms such as first and second are used herein solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the statement "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The data batch running system, method, device and storage medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present application, and the above description of the embodiments is only intended to help understand the method of the present application and its core idea; meanwhile, those skilled in the art may make changes to the specific embodiments and the application scope according to the idea of the present application. In view of the above, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A data batch running system, comprising:
the task judging interface is used for judging whether the current target running batch task has a downstream dependent task or not;
the first data batch module is used for executing a batch process of corresponding metadata based on the target batch task when the target batch task does not have a downstream dependent task, and marking the target batch task as a preliminary batch completion;
the judging module is used for judging whether a data query request aiming at the target running batch task is acquired or not;
the second data batch module is used for, when a data query request for the target running batch task is acquired, executing the target running batch task marked as preliminary running batch completion through a preset buffer queue, so as to screen out target data corresponding to the data query request from the corresponding running batch data;
and the third data batch module is used for, when the data query request for the target running batch task is not acquired, running the target running batch task marked as preliminary running batch completion based on a preset data running batch time period, so as to complete the running batch process of the related data.
2. The data batch running system of claim 1, wherein the task judging interface comprises:
the dependent task acquisition unit is used for acquiring a data batch running task with a downstream dependent task through a preset human-computer interaction interface;
the list generation unit is used for generating a task list according to the data batch running task to obtain a target dependent task list;
and the task judging unit is used for judging whether the target running batch task exists in the target dependent task list, so as to judge whether the target running batch task has a downstream dependent task.
3. The data batch running system of claim 1, wherein the first data batch module comprises:
the task identification unit is used for identifying whether the target running batch task is marked as a non-timely view type or not when the target running batch task does not have a downstream dependent task; the non-timely view type is a task type marked through a preset interface;
and the first data running batch unit is used for executing the running batch process of the corresponding metadata based on the target running batch task when the target running batch task is marked as a non-timely view type, and marking the target running batch task as a preliminary running batch completion.
4. The data batch running system of claim 1, wherein the second data batch module comprises:
the query request acquisition unit is used for acquiring a data query request aiming at the target running batch task;
the request judging unit is used for judging whether the target running batch task corresponding to the data query request is marked as preliminary running batch completion or not;
the second data batch running unit is used for executing the target batch running task through a preset buffer queue when the target batch running task is marked as the preliminary batch running completion, so as to obtain corresponding first batch running data;
and the first data query unit is used for screening target data corresponding to the data query request from the first running data.
5. The data batch running system of claim 3, further comprising:
the third data batch unit is used for executing the target batch task to obtain second batch data when the target batch task has a downstream dependent task so as to complete a batch process of related data;
and the fourth data batch unit is used for executing the target batch task to obtain third batch data so as to complete the batch process of the related data when the target batch task does not have a downstream dependent task and the target batch task is not marked as a non-timely view type.
6. The data batch running system of claim 5, further comprising:
and the second data query unit is used for screening out target data corresponding to the data query request from the second running batch data or the third running batch data.
7. The data batch running system of any one of claims 1 to 6, wherein the third data batch module comprises:
the task state identification unit is used for identifying whether the target running batch task is in a state marked as preliminary running batch completion currently or not based on a preset data running batch time period under the condition that a data query request for the target running batch task is not acquired;
and the fifth data batch unit is used for executing the target batch task when the target batch task is in a state marked as the completion of the preliminary batch so as to complete the batch process of the related data.
8. A data batch running method, comprising:
judging whether a downstream dependent task exists in the current target running batch task or not;
when the target running batch task does not have a downstream dependent task, executing a running batch process of corresponding metadata based on the target running batch task, and marking the target running batch task as a preliminary running batch completion;
judging whether a data query request aiming at the target running batch task is acquired or not;
when a data query request for the target running batch task is acquired, executing the target running batch task marked as preliminary running batch completion through a preset buffer queue, so as to screen out target data corresponding to the data query request from the corresponding running batch data;
and when the data query request for the target running batch task is not acquired, running the target running batch task marked as preliminary running batch completion based on a preset data running batch time period, so as to complete the running batch process of the related data.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the data batch running method of claim 8.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the data batch running method according to claim 8.
CN202311076653.3A 2023-08-25 2023-08-25 Data batch running system, method, equipment and storage medium Pending CN117093372A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311076653.3A CN117093372A (en) 2023-08-25 2023-08-25 Data batch running system, method, equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311076653.3A CN117093372A (en) 2023-08-25 2023-08-25 Data batch running system, method, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117093372A (en) 2023-11-21

Family

ID=88776638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311076653.3A Pending CN117093372A (en) 2023-08-25 2023-08-25 Data batch running system, method, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117093372A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination